The Research Database Complex (RDC)
On this page:
- System overview
- System information
- Working with ePHI research data
- System access
- Computing environment
- Transferring your files to the RDC
- Application development
- Acknowledging grant support
The Indiana University Research Database Complex (rdc.uits.iu.edu) supports research-related databases and data-intensive applications that require databases. The RDC supports Oracle and MySQL databases, and provides an environment (rdcweb.uits.iu.edu) for database-driven web applications focusing on research.
The system runs Red Hat Enterprise Linux 5 (RHEL 5). User home directories reside on a network-attached storage (NAS) device. You have a 10 GB (default) disk quota, which is shared with Big Red II, Quarry, and Mason, if you have accounts on those systems.
The RDC currently supports several important research projects:
- Collaborative Initiative on Fetal Alcohol Spectrum Disorders
- Indiana Spatial Data Service
- National Gene Vector Laboratories
Note: The RDC is strictly devoted to supporting research; it is not an instructional or classroom environment. If you need to use Oracle or Microsoft SQL Server in an instructional environment, see Instructional database and web server accounts.
The RDC offers Oracle 11g Release 2 (version 18.104.22.168) and MySQL Enterprise Server (version 5.5.8 Advanced) database accounts, with a full suite of Oracle components that support:
- Advanced Security Option (ASO): Provides data encryption and strong authentication services to the Oracle
- Application Express: Offers development and deployment of secure applications through a rapid, web-application development tool for use with Oracle databases
- Large objects (LOBs): Lets you store and manipulate large blocks of unstructured data, such as text, graphic images, video clips, and sound waveforms, in binary or
- Oracle Multimedia (formerly Oracle interMedia): Provides a platform for a wide range of multimedia-intensive
- Oracle Text: Indexes any document or textual content to add fast, accurate retrieval of information to Internet content management applications, e-business catalogs, news services, and job postings; indexes content stored in file systems, databases, or on the web
- Oracle XML Database: Treats XML as a native datatype in the database
- Oracle XDK: Contains the basic building blocks for reading, manipulating, transforming, and viewing XML documents, whether on a file system or stored in a database
- Partitioning: Lets you split large tables and indexes into smaller, manageable components, without requiring changes to underlying applications
- Oracle Data Mining (ODM): Provides a way to access information buried in the data by creating models to find hidden patterns in large, complex collections of data; embeds data mining within the Oracle database; algorithms operate natively on relational tables or views, eliminating the need to extract and transfer data into other tools, applications, or servers
- Online Analytical Processing (OLAP): Offers in-database, advanced multidimensional analytic capabilities
- JServer Java Virtual Machine: A Java virtual machine (VM) that runs within the Oracle database server's address
- Oracle Database Java Packages: Classes for relational database management system (RDBMS) features
Note: The RDC maintenance window is the first Tuesday of each month, 8am-5pm. Notice of any emergency downtime will be posted at IT Notices.
|System configuration||Aggregate information||Per-node information|
|Machine type||Research database system|
|Operating system||Red Hat Enterprise Linux 5 (RHEL 5)|
|Memory model||Distributed and shared|
|CPUs||Intel Xeon E5620 2.40 GHz (HP); Intel Xeon Quad Core 1.6 GHz (Dell)|
|Nodes||3 Hewlett Packard DL 180 G6 Oracle servers; 1 Hewlett Packard DL 180 G6 MySQL server; 1 Dell 2950 Database Driven Web Services|
|RAM||288 GB||72 GB (HP); 8 GB (Dell)|
|RPeak||307.2 gigaFLOPS||76.8 gigaFLOPS|
|Database storage and file systems||Oracle and MySQL storage is provided via RAID 6 volumes (block-level striping with double distributed parity) on a Fibre Channel array, currently 48 TB in size. Home directories with 10 GB default quotas are provided via NAS over NFS. Shared scratch space is hosted on the Data Capacitor II (DC2) file system. The DC2 scratch directory is a temporary workspace. Scratch space is not allocated, and its total capacity fluctuates based on project space requirements. The DC2 file system is mounted on IU research systems. Note: The Data Capacitor II (DC2) high-speed, high-capacity storage facility for very large data sets replaces the former Data Capacitor file system, which was decommissioned January 7, 2014.|
|Backup and purge policies||Incremental backups of the RDC Oracle databases occur at various times between 1am and 6am, Sunday through Friday. Full backups occur 1am-5am every Saturday. Backups are retained for 30 days. Backups for MySQL database servers on the RDC are the responsibility of the user.|
Working with ePHI research data
The Health Insurance Portability and Accountability Act of 1996 (HIPAA) established rules protecting the privacy and security of personal health data. The HIPAA Security Rule set national standards specifically for the security of protected health information (PHI) that is created, stored, transmitted, or received electronically (i.e., electronic protected health information, or ePHI). To ensure the confidentiality, integrity, and availability of ePHI data, the HIPAA Security Rule requires organizations and individuals to implement a series of administrative, physical, and technical safeguards when working with ePHI data.
Although you can use this system for processing or storing electronic protected health information (ePHI) related to official IU research:
- You and/or the project's principal investigator (PI) are
responsible for ensuring the privacy and security of that data, and
complying with applicable federal and state laws/regulations and
institutional policies. IU's policies regarding HIPAA compliance
require the appropriate Institutional Review Board (IRB) approvals and
a data management plan.
- You and/or the project's PI are responsible for applying HIPAA-required administrative, physical, and technical safeguards to any person, process, application, or service used to collect, process, manage, analyze, or store ePHI data.
The UITS Advanced Biomedical IT Core (ABITC) provides consulting and online help for Indiana University researchers who need help securely processing, storing, and sharing ePHI research data. If you need help or have questions about managing HIPAA-regulated data at IU, contact the ABITC. For additional details about HIPAA compliance at IU, see HIPAA & ABITC and the Office of Vice President and General Counsel (OVPGC) HIPAA Privacy & Security page.
Important: Although UITS HIPAA-aligned resources are managed using standards surpassing official standards for managing institutional data at IU and are appropriate for storing HIPAA-regulated ePHI research data, they are not recognized by the IU Committee of Data Stewards as appropriate for storing institutional data elements classified as Critical that are not ePHI data. For help determining which institutional data elements classified as Critical are considered ePHI, see Which data elements in the classifications of institutional data are considered protected health information (PHI)?
The IU Committee of Data Stewards and the University Information Policy Office (UIPO) set official classification levels and data management standards for institutional data in accordance with the university's Management of Institutional Data policy. If you have questions about the classifications of institutional data, contact the appropriate Data Steward. To determine the most sensitive classification of institutional data you can store on any given UITS service, see the "Choosing an appropriate storage solution" section of At IU, which dedicated file storage services and IT services with storage components are appropriate for sensitive institutional data, including ePHI research data?
Requesting an account
To request an RDC account:
- IU graduate students, faculty, and staff can use the Account
Management Service (AMS); see Instructions for getting additional computing accounts at IU
- IU undergraduate students and IU affiliates must have faculty sponsors. Your sponsor should submit a request on your behalf via email to the High Performance Systems group. The request should include your IU username and a brief justification of your request.
You will receive a confirmation email message once your RDC account is created.
Your database login
When you receive the email message confirming your RDC account is created, you must complete the process by requesting a database login. The confirmation email message will direct you to the online RDC Database and Web Services Account Application.
Database group accounts, where a username is shared by more than one researcher, are available on the RDC. You can request a database group account on the RDC and Web Services Account Application. To request a database group account, your group must have an existing IU Network ID, and you must provide the Network ID usernames of everyone who will be using the database group account. Whoever requests the group database account will be considered the responsible party for the account, and is responsible for communicating with the group database account's users regarding system downtime and other information. For more about IU group accounts, see Requesting a departmental or group account.
Note: If you have a Linux account on the RDC, but cannot locate the "RDC account is created" notification message that contains the link to the RDC and Web Services Account Application, access the form here. If you need help, email the UITS High Performance Systems (HPS) team.
When your database schema, instance, or server has been created, you will receive another confirmation email message ("Oracle/MySQL Welcome Letter") that includes your login credentials and information about connecting to your database.
Note: If you already have an Oracle or MySQL database login, but don't remember the database password, refer to your "Oracle/MySQL Welcome Letter" notification in email. If you need help, email HPS.
Connecting to your Oracle or MySQL database
Oracle: For instructions on connecting to your Oracle database on the RDC, see:
- On the Research Database Complex at IU, how do I access my Oracle or MySQL database?
- How do I access my database on the IU Research Database Complex using the Oracle client for Windows XP?
- How do I access my Oracle database on the IU Research Database complex using Aqua Data Studio?
- In Windows, how do I set up an ODBC source to access my Oracle database on the IU Research Database Complex?
MySQL: For instructions on connecting to your MySQL database on the RDC, see:
- On the Research Database Complex at IU, how do I access my Oracle or MySQL database?
- At IU, how do I use the phpMyAdmin web interface to administer my MySQL database on the Research Database Complex?
- On the RDC at IU, how do I stop or start my MySQL database?
The shell is the primary method of interacting with the RDC. The command line interface provided by the shell lets users run built-in commands, utilities installed on the system, and even short ad hoc programs.
The RDC supports the Bourne-again (bash), TC (tcsh), C (csh), Korn (ksh), and Bourne (sh) shells. New user accounts are assigned the bash shell by default. For more on bash, see the Bash Reference Manual and the Bash (Unix shell) Wikipedia page.
To change your shell on the RDC, use the changeshell command. Using chsh (instead of changeshell) changes your shell only on the node on which you run it, and leaves the other nodes of the cluster unchanged; changeshell prompts you with the shells available on the system, and changes your login shell system-wide within 15
Environment variables: The shell uses environment variables primarily to modify shell behavior and the operation of certain commands. A good example is the PATH variable.
When the shell parses a command you have entered (i.e., after you press Return), it interprets certain words you've typed as program files that should be executed. The shell then searches various directories on the system to locate these files. The PATH variable determines which directories are searched, and the order in which they are searched. In the bash shell, the PATH variable is a string of directories separated by colons (e.g., /bin:/usr/bin:/usr/local/bin). With this value, the shell searches for an executable file first in the /bin directory, then in the /usr/bin directory, and finally in the /usr/local/bin directory. If files of the same name (e.g., foo) exist in all three directories, /bin/foo will be run, because the shell will find it first.
In the bash shell, use echo to display the value of an environment variable:
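As a minimal sketch (assuming a bash session; the use of ls as the sample command is only for illustration), the following displays PATH, lists its directories in search order, and shows how the shell resolves a command name:

```shell
# Print the current search path as one colon-separated string.
echo "$PATH"

# Print one directory per line, in the order the shell searches them.
echo "$PATH" | tr ':' '\n'

# Show how the shell resolves a command name (ls is just a sample command).
command -v ls
```

The second command is often easier to read than the raw string when PATH contains many directories.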
To change the value of an environment variable:
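A hedged sketch of one common way to do this in bash follows. The directory $HOME/bin and the variable name MY_PROJECT_DIR are hypothetical examples, not names defined by the RDC:

```shell
# Append a directory to the search path for the current session.
# "$HOME/bin" is a hypothetical example directory.
export PATH="$PATH:$HOME/bin"

# Define a variable and make it visible to child processes.
# MY_PROJECT_DIR is a hypothetical name used only for illustration.
export MY_PROJECT_DIR="$HOME/project"
echo "$MY_PROJECT_DIR"
```

Changes made this way last only for the current shell session; to keep them across logins, they would normally go in a startup file.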
Startup scripts: Shells offer much flexibility in terms of startup configuration. On login, bash by default reads and executes commands from its startup files in a fixed order. In these file names, ~ (tilde) represents your home directory (e.g., ~/.bash_profile is the .bash_profile file in your home directory). On logout, the shell reads and executes ~/.bash_logout. For more on bash startup files, see the "Bash Startup Files" section of the Bash Reference Manual.
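According to the Bash Reference Manual, a bash login shell reads /etc/profile and then the first readable file among ~/.bash_profile, ~/.bash_login, and ~/.profile. The following is a minimal sketch of what a ~/.bash_profile might contain; the $HOME/bin directory and the choice of vi as a default editor are assumptions made for illustration, not RDC defaults:

```shell
# ~/.bash_profile (sketch): commands here run once at login.

# Add a personal bin directory to PATH only if it is not already present,
# so that repeated logins or nested shells do not duplicate the entry.
# "$HOME/bin" is a hypothetical directory.
case ":$PATH:" in
    *":$HOME/bin:"*) ;;
    *) PATH="$PATH:$HOME/bin" ;;
esac
export PATH

# Keep an existing EDITOR setting, or fall back to vi (an assumed default).
export EDITOR="${EDITOR:-vi}"
```

The case guard is a design choice: it keeps PATH from growing each time the file is read.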
Transferring your files to the RDC
The scp command uses the following syntax:
scp [[user@]host1:]file1 [[user@]host2:]file2
For example, to copy foo.txt from the current directory on your computer to your home directory on the RDC, use (replacing username with your Network ID username):
scp foo.txt username@rdc.uits.iu.edu:foo.txt
You may specify absolute paths or paths relative to your home directory:
scp foo.txt username@rdc.uits.iu.edu:some/path/for/data/foo.txt
You also may leave the destination filename unspecified, in which case it will become the same as the source filename. For more, see In Unix, how do I use SCP to securely transfer files between two computers?
The SSH File Transfer Protocol (SFTP) provides file access, transfer, and management, and offers client functionality much like FTP. For example, from a computer with a command line SFTP client (e.g., a Linux or Mac OS X workstation), you could transfer files as follows:
$ sftp username@rdc.uits.iu.edu
username@rdc.uits.iu.edu's password:
Connected to rdc.uits.iu.edu.
sftp> ls -l
-rw------- 1 username group 113 May 19 2011 loadit.pbs.e897
-rw------- 1 username group 695 May 19 2011 loadit.pbs.o897
-rw-r--r-- 1 username group 693 May 19 2011 local_limits
sftp> put foo.txt
Uploading foo.txt to /N/hd02/username/RDC/foo.txt
foo.txt 100% 43MB 76.9KB/s 09:39
sftp> exit
$
Graphical SFTP clients are also available for many systems. For more, see Transferring files with SFTP.
Web Services: In addition to providing a home for research databases, the RDC provides an environment for database-driven web applications with a research focus. This environment is composed of a Dell 2950 with a 1.6 GHz Quad-core Intel Xeon processor and 8 GB of memory. This system (rdcweb.uits.iu.edu) runs RHEL 5. User home directories reside on the IBM N5500 NAS storage device. You have a 10 GB (default) disk quota, which is shared with Big Red II, Quarry, and Mason, if you have accounts on those systems. For details, see Web Services on the IU Research Database Complex.
Oracle: For further documentation, see the Oracle Database Documentation Library, 11g Release 2 (11.2), and the following online guides:
- Oracle Database New Features Guide
- Oracle Database Concepts
- Oracle Database Online Documentation Library Master Glossary
- Advanced Application Developer's Guide
- Oracle Database Reference
- SQL Language Reference
- PL/SQL Language Reference
- Application Express (ApEx) Documentation
- ApEx Developer's Guide
- ApEx Application Builder User's Guide
MySQL: For further documentation, see the MySQL Reference Manual.
Access to the RDC is provided to all IU graduate students, faculty, and staff. Access is also provided to undergraduate students and non-IU collaborators, if they have IU faculty sponsors. For information about user responsibilities and security issues, see Research Database Complex (RDC) usage policies.
The RDC is strictly devoted to supporting research. The RDC is not an instructional, classroom environment. If you are not doing research and wish to use a database, such as Oracle or Microsoft SQL Server, see Database and web server access for instruction.
Accounts remain valid only while the account holder is a registered IU student, or an IU faculty or staff member. On Big Red II, Quarry, and the RDC, accounts are disabled during the semester following the account holder's departure from IU, and then are purged within six months. To request that your research systems account be exempt from disabling, email IU Account Administration. If the request is approved, the account will remain activated for one calendar year beyond the user's departure from IU, and then, at the end of the year, the account will be purged. Extensions beyond one year for research accounts are granted only for accounts involved in funded research and having an IU faculty sponsor, or with approval of the Dean or Director of Research and Academic Computing.
By submitting the RDC and Web Services Account Application, you affirm that:
- You understand use of the database is reserved for research purposes only.
- You will acknowledge use of IU's high-performance systems in publications resulting from your research.
- You will provide periodic listings of citations of those publications upon request.
Database group accounts
To request a database group account, your group must have an existing IU Network ID, and you must provide the Network ID usernames of everyone who will be using the database group account. Whoever requests the group database account will be considered the responsible party for the account, and is responsible for communicating with the group database account's users regarding system downtime and other information. For more about IU group accounts, see Requesting a departmental or group account
As owner of a database account, you are also responsible for:
- Creating and managing your schema objects (e.g., tables, views, procedures, triggers, and schema privileges)
- Changing datatypes
- Any data processes, such as data imports, deletes, modifications, transformations, and retrievals
- Creating and maintaining copies of scripts
- Emailing the HPS group about changes in space or database administration, or if you no longer need access to the research database
- Adapting your schema and data as required during system and database upgrades
- Adapting client applications and tools to system and database versions
- Monitoring HIPAA-required audit logs, if auditing is enabled
If you need help with the above, submit a request for RDC database consulting services by emailing the HPS group.
RDC database administrators are responsible for:
- Backing up the database
- Managing space allocation
- Managing database and tablespace creation
- Monitoring and reporting database performance
- Monitoring and reporting invalid schema objects
- Installing database and system upgrades and patches
Many of the technology services provided by the UITS Advanced Biomedical IT Core, Research Technologies, and Enterprise Infrastructure divisions are formally aligned with the federal Health Insurance Portability and Accountability Act (HIPAA). See What services does IU provide for researchers working with ePHI data?
Once your RDC Database and Web Services Account Application is processed, you will receive a confirmation email message that describes how to access your Oracle database, and how to reset your initial database password. Your initial database password will be sent in a second email message.
It is important to employ methods that do not transmit passwords across the Internet in plain-text format. If you use SQL*Plus to access an Oracle database, invoke it without the password. The following example shows how to connect to Oracle from the RDC command line:
doe@RDC:~> sqlplus firstname.lastname@example.org
Provide the password when prompted:
SQL*Plus: Release 10.2.0.1.0 - Production on Fri Mar 14 13:49:03 2008
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Enter password:
To connect to MySQL from the RDC command line:
doe@RDC: $ mysql --defaults-file=/N/u/<username>/RDC/.my.cnf -u root -p
Replace username with your username. When prompted, enter your password.
Database backup and recovery
UITS performs incremental backups of the RDC Oracle databases at various times between 1am and 6am, Sunday through Friday, depending on the instance. A full backup occurs between 1am and 5am every Saturday. Backups are retained for 30 days. In the event of system failure, research databases can be restored to the point of the last good backup, which is usually from that morning. Data recovery for individual accounts is not guaranteed if data loss is the result of user error.
Recovery of a table is typically not part of the database recovery process. Dropped tables can often be recovered using the recyclebin feature; see In Oracle 10g and later, how do I recover a dropped database table?
Backups for MySQL database servers on the RDC are the responsibility of the user. Use the following command:
mysql_instance backup
For more about using the
mysql_instance command, see
On the RDC at IU, how do I stop or start my MySQL database?
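Because MySQL backups are left to the user, it can help to keep a record of when each backup ran. The following is a minimal sketch of a generic wrapper that runs any command and appends a timestamped result to a log file. The function name run_and_log and the log path are hypothetical, and the sketch assumes that the wrapped command (for example, mysql_instance backup) signals failure through a non-zero exit status, which the source does not confirm:

```shell
#!/bin/bash
# run_and_log LOGFILE COMMAND [ARGS...]
# Runs COMMAND, appends a timestamped "ok" or "failed" line to LOGFILE,
# and returns COMMAND's exit status. (Hypothetical helper, not an RDC tool.)
run_and_log() {
    local log_file=$1
    shift
    local status=0
    "$@" || status=$?
    local result="ok"
    if [ "$status" -ne 0 ]; then
        result="failed (exit $status)"
    fi
    printf '%s %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" "$result" >> "$log_file"
    return "$status"
}

# Possible usage on the RDC (assumes mysql_instance is on your PATH):
#   run_and_log "$HOME/mysql_backup.log" mysql_instance backup
```

Keeping the wrapper generic means it can be checked with harmless commands such as true and false before it is pointed at a real backup.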
Disk space for data loading and other applications
If you need space on the RDC for staging data, or for data-related packages and applications, email the HPS group, which will evaluate requests on a case-by-case basis.
Production mail service is not provided on the RDC.
Support staff are available from approximately 8am-5pm Monday-Friday. Email the HPS group to report problems with the RDC. Be sure to include:
- The name of the database server to which you were connecting
- A description of what you were trying to do
- A description of the problem
- The name and version of the tool you were using to connect to the Oracle database server
- Your computer's operating system
- The error message number and text, if applicable
- The time the problem occurred
Acknowledging grant support
The Indiana University cyberinfrastructure managed by the Research Technologies division of UITS is supported by funding from several grants, each of which requires you to acknowledge its support in all presentations and published works stemming from research it has helped to fund. Conscientious acknowledgment of support from past grants also enhances the chances of IU's research community securing funding from grants in the future. For the acknowledgment statement(s) required for scholarly printed works, web pages, talks, online publications, and other presentations that make use of this and/or other grant-funded systems at IU, see Grants to cite in published papers and presentations
- If you have system-specific questions about Big Red II, Quarry,
Mason, or the Research Database Complex (RDC), email the High
Performance Systems team.
- If you have questions about compilers, programming, scientific and
numerical libraries, or debuggers on a research computing system, email the Scientific
Applications and Performance Tuning team.
- If you have questions about statistical and mathematical software
on any of the research computing systems, email the
Research Analytics group.
- If you have questions about scratch or project space on the Data
Capacitor II or Data Capacitor wide-area network (DCWAN) file system,
email the High
Performance File Systems team.
- If you have questions about the Research File System (RFS) or Scholarly Data Archive (SDA), email the Research Storage team.
To ask any other question about Research Technologies systems and services, use the Request help or information form.