
Mason at Indiana University

System overview


Mason (mason.indiana.edu) at Indiana University is a large-memory computer cluster configured to support data-intensive, high-performance computing tasks for researchers using genome assembly software (particularly software suitable for assembly of data from next-generation sequencers), large-scale phylogenetic software, or other genome analysis applications that require large amounts of computer memory. At IU, Mason accounts are available to IU faculty, postdoctoral fellows, research staff, and students involved in genome research. IU educators providing instruction on genome analysis software, and developers of such software, are also welcome to use Mason. IU has also made Mason available to genome researchers from the National Science Foundation's Extreme Science and Engineering Discovery Environment (XSEDE) project.

Mason consists of 18 Hewlett-Packard (HP) DL580 servers, each containing four Intel Xeon L7555 8-core processors and 512 GB of RAM, and two HP DL360 login nodes, each containing two Intel Xeon E5-2600 processors and 24 GB of RAM. The total RAM in the system is 9 TB. Each server chassis has a 10-gigabit Ethernet connection to the other research systems at IU and the XSEDE network (XSEDENet).

Mason nodes run Red Hat Enterprise Linux (RHEL 6.x). The system uses TORQUE integrated with Moab Workload Manager to coordinate resource management and job scheduling. The Data Capacitor II and Data Capacitor wide-area network (DCWAN) parallel file systems are mounted for temporary storage of research data. The Modules environment management package on Mason allows users to dynamically customize their shell environments.

Back to top

System information

Note: The scheduled monthly maintenance window for Mason is the first Tuesday of each month, 7am-7pm.

System summary

Machine type: High-performance, data-intensive computing
Operating system: Red Hat Enterprise Linux 6.x
Memory model: Distributed
Nodes: 18 HP DL580 servers; 2 HP DL360 servers
Network: 10-gigabit Ethernet per node

Computational system details:

  CPUs: 72 Intel Xeon L7555 8-core processors total; 4 per node
  Processor cores: 576 total; 32 per node
  RAM: 9 TB total; 512 GB per node
  Local storage: 7 TB total; 400 GB per node
  Processing capability: Rpeak = 4,285 gigaFLOPS total; Rpeak = 238 gigaFLOPS per node
  Benchmark data: 3,129.75 HPL gigaFLOPS total; 222.22 HPL gigaFLOPS per node
  Power usage: 0.000153 teraFLOPS per watt (system); 0.000173 teraFLOPS per watt (per node)

Back to top

File systems (storage for IU users)

You can store files in your home directory or in scratch space:

  • Home directory: Your Mason home directory disk space is allocated on a Network-Attached Storage (NAS) device. You have a 10 GB disk quota, which is shared (if applicable) with your accounts on Big Red II, Quarry, and the Research Database Complex (RDC).

    The path to your home directory is (replace username with your Network ID username):

    /N/u/username/Mason
  • Local scratch: 400 GB of scratch disk space is available locally. Local scratch space is not intended for permanent storage of data, and is not backed up. Files in local scratch space are automatically deleted once they are 14 days old.

    Local scratch space is accessible via either of the following paths:

    /scratch /tmp
  • Shared scratch: Once you have an account on one of the UITS research computing systems, you also have access to 427 TB of shared scratch space.

    Shared scratch space is hosted on the Data Capacitor II (DC2) file system. The DC2 scratch directory is a temporary workspace. Scratch space is not allocated, and its total capacity fluctuates based on project space requirements. The DC2 file system is mounted on IU research systems as /N/dc2/scratch and behaves like any other disk device. If you have an account on an IU research system, you can access /N/dc2/scratch/username (replace username with your IU Network ID username). Access to /N/dc2/projects requires an allocation. For details, see The Data Capacitor II and DCWAN high-speed file systems at Indiana University. Files in shared scratch space more than 60 days old are periodically purged, following user notification.

    Note: The Data Capacitor II (DC2) high-speed, high-capacity, storage facility for very large data sets replaces the former Data Capacitor file system, which was decommissioned January 7, 2014. The DC2 scratch file system (/N/dc2/scratch) is mounted on Big Red II, Quarry, and Mason. Project directories on the former Data Capacitor were migrated to DC2 by UITS before the system was decommissioned. All data on the Data Capacitor scratch file system (/N/dc/scratch) were deleted when the system was decommissioned. If you have questions about the Data Capacitor's retirement, email the UITS High Performance File Systems group.

For more, see At IU, how much disk space is available to me on the research computing systems?

Note: IU graduate students, faculty, and staff who need more than 10 GB of permanent storage can apply for accounts on the Research File System (RFS) and the Scholarly Data Archive (SDA). See At IU, how can I apply for an account on the SDA or RFS?

Back to top

Working with ePHI research data

The Health Insurance Portability and Accountability Act of 1996 (HIPAA) established rules protecting the privacy and security of personal health data. The HIPAA Security Rule set national standards specifically for the security of protected health information (PHI) that is created, stored, transmitted, or received electronically (i.e., electronic protected health information, or ePHI). To ensure the confidentiality, integrity, and availability of ePHI data, the HIPAA Security Rule requires organizations and individuals to implement a series of administrative, physical, and technical safeguards when working with ePHI data.

Although you can use this system for processing or storing electronic protected health information (ePHI) related to official IU research:

  • You and/or the project's principal investigator (PI) are responsible for ensuring the privacy and security of that data, and complying with applicable federal and state laws/regulations and institutional policies. IU's policies regarding HIPAA compliance require the appropriate Institutional Review Board (IRB) approvals and a data management plan.

  • You and/or the project's PI are responsible for implementing HIPAA-required administrative, physical, and technical safeguards to any person, process, application, or service used to collect, process, manage, analyze, or store ePHI data.

The UITS Advanced Biomedical IT Core provides consulting and online help for Indiana University researchers who need help securely processing, storing, and sharing ePHI research data. If you need help or have questions about managing HIPAA-regulated data at IU, contact the ABITC. For additional details about HIPAA compliance at IU, see HIPAA & ABITC and the Office of Vice President and General Counsel (OVPGC) HIPAA Privacy & Security page.

Important: Although UITS HIPAA-aligned resources are managed using standards surpassing official standards for managing institutional data at IU and are appropriate for storing HIPAA-regulated ePHI research data, they are not recognized by the IU Committee of Data Stewards as appropriate for storing institutional data elements classified as Critical that are not ePHI data. For help determining which institutional data elements classified as Critical are considered ePHI, see Which data elements in the classifications of institutional data are considered protected health information (PHI)?

The IU Committee of Data Stewards and the University Information Policy Office (UIPO) set official classification levels and data management standards for institutional data in accordance with the university's Management of Institutional Data policy. If you have questions about the classifications of institutional data, contact the appropriate Data Steward. To determine the most sensitive classification of institutional data you can store on any given UITS service, see the "Choosing an appropriate storage solution" section of At IU, which dedicated file storage services and IT services with storage components are appropriate for sensitive institutional data, including ePHI research data?

Note: In accordance with standards for access control mandated by the HIPAA Security Rule, you are not permitted to access ePHI data using a group (or departmental) account. To ensure accountability and enable only authorized users to access ePHI data, IU researchers must use their personal Network ID credentials for all work involving ePHI data.

Back to top

System access

Access policy

Although Mason is an IU resource dedicated to genome analysis research, access to the cluster is not restricted to IU researchers; researchers affiliated with the National Center for Genome Analysis Support (NCGAS) and the National Science Foundation's XSEDE project also may be granted access.

For information about your responsibilities as a user of this resource, see:

Accessing Mason

  • IU and NCGAS users: Use an SSH2 client to connect to: mason.indiana.edu

    This resolves to one of the following login nodes:

    h1.mason.indiana.edu h2.mason.indiana.edu

    IU users authenticate with their Network ID usernames and passphrases.

    NCGAS researchers authenticate with the credentials associated with their NCGAS allocations.

    IU and NCGAS users also may set up public key authentication; see How do I set up SSH public-key authentication to connect to a remote system? (A minimal example appears after this list.)

    Note: Mason login nodes use the IU Active Directory Service for user authentication. As a result, local passwords/passphrases are not supported. For information about changing your ADS passphrase, see Changing your passphrase. For helpful information regarding secure passphrases, see About your IU passphrase.

  • XSEDE users: Use GSI-SSH with your XSEDE-wide login from the Single Sign-On Login Hub or one of the GSI-SSH desktop clients.

    Alternatively, if you're connected to another XSEDE system via SSH (i.e., not using one of the GSI-SSH methods), you can connect to Mason directly using GSI-SSH and a MyProxy certificate. For example, from Trestles (SDSC):

    1. Make sure the globus module is loaded:

       [dartmaul@trestles-login2 ~]$ module load globus

    2. Get a certificate from the MyProxy server; when prompted, provide your XSEDE-wide password:

       [dartmaul@trestles-login2 ~]$ myproxy-logon -s myproxy.teragrid.org
       Enter MyProxy pass phrase: ****************
       A credential has been received for user dartmaul in /tmp/x509up_p13346.fileGzdqtd.1.

    3. Use GSI-SSH to connect to your XSEDE account on Mason (replace username with your XSEDE username):

       [dartmaul@trestles-login2 ~]$ gsissh username@mason.iu.xsede.org

    Additionally, XSEDE users can set up public key authentication. For more, see Access Resources on the XSEDE User Portal.
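
For IU and NCGAS users, a minimal public-key setup from a Linux or Mac OS X workstation might look like the following sketch (both commands are standard OpenSSH tools; replace username with your username):

    ssh-keygen -t rsa                       # generate a key pair; accept the default location and set a passphrase
    ssh-copy-id username@mason.indiana.edu  # append your public key to ~/.ssh/authorized_keys on Mason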

Back to top

Available software

Mason uses the Modules package to provide a convenient method for dynamically adding software packages to your user environment.

For a list of software available on Mason, see Software supported by NCGAS.

Note: Mason users are free to install software in their home directories and may request the installation of software for use by all users on the system. Only faculty or staff can request software. If students require software packages on Mason, their advisors must request them. For details, see At IU, what is the policy about installing software on Mason?

Some common Modules commands include:

  • module avail
    List all software packages available on the system.

  • module avail package
    List all versions of package available on the system; for example: module avail openmpi

  • module list
    List all packages currently loaded in your environment.

  • module load package/version
    Add the specified version of the package to your environment; for example: module load intel/11.1
    To load the default version of the package, use: module load package

  • module unload package
    Remove the specified package from your environment.

  • module swap package_A package_B
    Swap the loaded package (package_A) with another package (package_B). This is synonymous with: module switch package_A package_B

  • module show package
    Show what changes will be made to your environment (e.g., paths to libraries and executables) by loading the specified package. This is synonymous with: module display package

To make permanent changes to your environment, edit your ~/.modules file. For more, see In Modules, how do I save my environment with a .modules file?
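
For example, a ~/.modules file that loads a compiler and an MPI stack at every login might contain lines such as the following (the module names and versions are illustrative):

    # ~/.modules (illustrative) -- executed at login to customize your environment
    module load intel/11.1
    module load openmpi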

For information about using Modules on IU research systems, see On Big Red II, Mason, Quarry, and Rockhopper at IU, how do I use Modules to manage my software environment? For information about using Modules on XSEDE digital services, see On XSEDE, how do I manage my software environment using Modules?

For general information about the Modules package, see the module manual page (man module) and the modulefile manual page (man modulefile).

Back to top

Computing environment

The shell

The Linux shell is both a command interpreter and a programming language. The shell command line provides a user interface for invoking commands to execute various shell functions, built-in utilities, and executable files. You can combine shell commands in a text file to create a shell script and then invoke the script on the command line; the shell then reads and executes the commands it contains.

Mason supports the Bourne-again shell (bash) and the TENEX C shell (tcsh). New users are assigned the bash shell by default. To change your shell on Mason, use the changeshell command.

Note: Running chsh (instead of changeshell) changes your shell only on the node on which you run it, and leaves the other nodes of the cluster unchanged; changeshell prompts you with the shells available on the system, and changes your login shell system-wide within 15 minutes.

For more about bash, see the Bash Reference Manual. For more about tcsh, see the TCSH manual page.

Environment variables

Environment variables are named values that can affect shell behavior and the operation of certain commands.

The PATH variable contains a string of directories separated by colons:

/bin:/usr/bin:/usr/local/bin

This tells the shell where (in which directories) and how (in what order) to search for functions, built-in utilities, or executable files that correspond to user-entered commands. According to the above example, the shell would search the following directories in this order:

/bin /usr/bin /usr/local/bin

If executables with identical filenames exist in more than one of the directories in the PATH variable (e.g., /bin/hyperdrive, /usr/bin/hyperdrive, and /usr/local/bin/hyperdrive), the shell will execute the first file it finds (e.g., /bin/hyperdrive).

To display the values of environment variables set for your user environment on Mason, on the command line, enter:

env

To display the value of a particular environment variable (e.g., VARNAME), enter:

echo $VARNAME

To change the value of an environment variable (e.g., VARNAME), enter:

  • In bash: export VARNAME=NEW_VALUE
  • In tcsh: setenv VARNAME NEW_VALUE
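
For example, to prepend a personal bin directory (the directory name is illustrative) to your search path so the shell finds your own executables first:

  • In bash: export PATH=$HOME/bin:$PATH
  • In tcsh: setenv PATH $HOME/bin:$PATH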

Startup files

When you log into Mason, your shell executes commands from certain startup files.

Depending on your shell, when you log into Mason:

  • The bash shell reads and executes commands from the following startup files (and in this order): /etc/profile ~/.bash_profile ~/.bashrc

    Note: The ~ (tilde) represents your home directory (e.g., ~/.bash_profile is the .bash_profile file in your home directory).

  • The tcsh shell reads and executes commands from the following startup files (and in this order): /etc/csh.cshrc /etc/csh.login
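
For example, a minimal ~/.bash_profile for bash users might simply source ~/.bashrc so that login and non-login shells share the same settings (the contents are illustrative, not a required configuration):

    # ~/.bash_profile (illustrative)
    # Source ~/.bashrc, if it exists, so interactive settings apply to login shells too
    if [ -f ~/.bashrc ]; then
        . ~/.bashrc
    fi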

For more on bash shell startup files, see the Bash Startup Files in the Bash Reference Manual. For more on tcsh shell startup files, see Startup and shutdown in the TCSH manual page.

Back to top

Transferring your files to Mason

Mason supports SCP and SFTP for transferring files:

  • SCP: The SCP command line utility is included with OpenSSH. Basic use is: scp username@host1:file1 username@host2:file2

    For example, to copy foo.txt from the current directory on your computer to your home directory on Mason, use (replacing username with your username): scp foo.txt username@mason.indiana.edu:foo.txt

    You may specify absolute paths or paths relative to your home directory:

    scp foo.txt username@mason.indiana.edu:some/path/for/data/foo.txt

    You also may leave the destination filename unspecified, in which case it will become the same as the source filename. For more, see In Unix, how do I use SCP to securely transfer files between two computers?

  • SFTP: SFTP clients provide file access, transfer, and management, and offer functionality similar to FTP clients. For example, using a command-line SFTP client (e.g., from a Linux or Mac OS X workstation), you could transfer files as follows:

        $ sftp username@mason.indiana.edu
        username@mason.indiana.edu's password:
        Connected to mason.indiana.edu.
        sftp> ls -l
        -rw------- 1 username group 113 May 19 2011 loadit.pbs.e897
        -rw------- 1 username group 695 May 19 2011 loadit.pbs.o897
        -rw-r--r-- 1 username group 693 May 19 2011 local_limits
        sftp> put foo.txt
        Uploading foo.txt to /N/hd00/username/Mason/foo.txt
        foo.txt                100%   43MB  76.9KB/s   09:39
        sftp> exit
        $

    For more, see What is SFTP, and how do I use an SFTP client to transfer files?

Additionally, XSEDE researchers can use GridFTP (via globus-url-copy or Globus Online) to securely move data to and from Mason's GridFTP endpoint:

gsiftp://gridftp.mason.iu.xsede.org:2811/
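
For example, a sketch of a globus-url-copy transfer to that endpoint might look like the following (this assumes you already hold a valid proxy certificate; the local file path is illustrative):

    globus-url-copy -vb file:///home/username/foo.txt gsiftp://gridftp.mason.iu.xsede.org:2811/~/foo.txt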

For more on XSEDE data transfers, see What data transfer methods are supported on XSEDE, and where can I find more information about data transfers?

Back to top

Application development

Programming models

Mason is designed to support codes that have extremely large memory requirements. As these codes typically do not implement a distributed memory model, Mason is geared toward a serial or shared-memory parallel programming paradigm. However, Mason can support distributed memory parallelism.

Compilers

The GNU Compiler Collection (GCC) is added by default to your user environment on Mason. The Intel and Portland Group (PGI) compiler collections, and the Open MPI and MPICH wrapper compilers, are also available.

Recommended optimization options for the Intel compilers are -O3 and -xHost (the -xHost option generates code optimized for the processor of the host on which you compile).

For the GCC compilers, the -mtune=native and -march=native options are recommended to generate instructions tuned to the machine and CPU type.

Following are example commands for compiling serial and parallel programs on Mason:

  • Serial programs:

    • To compile the C program simple.c:

      • With the GCC compiler: gcc -O2 -mtune=native -march=native -o simple simple.c
      • With the Intel compiler: icc -o simple simple.c
    • To compile the Fortran program simple.f:

      • With the GNU Fortran compiler: gfortran -o simple simple.f
      • With the Intel compiler: ifort -O2 -o simple -lm simple.f
  • Parallel programs:

    • To compile the C program simple.c with the MPI wrapper script: mpicc -o simple simple.c
    • To compile the Fortran program simple.f with the MPI wrapper script: mpif90 -o simple -O2 simple.f
    • To use the GCC C compiler to compile simple.c to run in parallel using OpenMP: gcc -O2 -fopenmp -o simple simple.c
    • To use the Intel Fortran compiler to compile simple.f to run in parallel using OpenMP: ifort -openmp -o simple -lm simple.f
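
For example, the number of threads an OpenMP program uses is typically controlled with the OMP_NUM_THREADS environment variable; a minimal compile-and-run sequence (the thread count is illustrative) might be:

    gcc -O2 -fopenmp -o simple simple.c   # compile simple.c with OpenMP support
    export OMP_NUM_THREADS=8              # request eight OpenMP threads
    ./simple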

Libraries

Both the Intel Math Kernel Library (MKL) and the AMD Core Math Library (ACML) are available on Mason.

Debugging

Both the Intel Debugger (IDB) and the GNU Project Debugger (GDB) are available on Mason.

For information about using the IDB, see the Intel IDB page.

For information about using the GDB, see the GNU GDB page. For an example, see Step-by-step example for using GDB within Emacs to debug a C or C++ program.

Back to top

Queue information

The BATCH queue is the default, general-purpose queue on Mason. The default walltime is one hour; the maximum limit is two weeks. If your job requires more than two weeks of walltime, email the High Performance Systems group for assistance.

Note: To best meet the needs of all research projects affiliated with Indiana University, the High Performance Systems (HPS) team administers the batch job queues on UITS Research Technologies supercomputers using resource management and job scheduling policies that optimize the overall efficiency and performance of workloads on those systems. If the structure or configuration of the batch queues on any of IU's supercomputing systems does not meet the needs of your research project, fill out and submit the Research Technologies Ask RT for Help form (for "Help Needed", select High Performance Systems job or queue help).

Back to top

Requesting single user time

Although UITS Research Technologies cannot provide dedicated access to an entire compute system during the course of normal operations, "single user time" is made available by request one day a month during each system's regularly scheduled maintenance window to accommodate IU researchers with tasks requiring dedicated access to an entire compute system. To request single user time on one of IU's research computing systems, fill out and submit the Research Technologies Ask RT for Help form (for "Help Needed", select Request to run jobs in single user time on HPS systems). If you have questions about single user time on IU research computing systems, email the HPS team.

Back to top

Running jobs on Mason

Mason uses the TORQUE resource manager (based on OpenPBS) and the Moab Workload Manager to manage and schedule jobs. For information about using TORQUE on Mason, see What is TORQUE, and how do I use it to submit and manage jobs on high-performance computing systems? Moab uses fairshare scheduling to track usage and prioritize jobs. For information on fairshare scheduling and using Moab to check the status of batch jobs, see:

CPU/Memory limits and batch jobs

User processes on the login nodes are limited to 20 minutes of CPU time. Processes exceeding this limit are automatically terminated without warning. If you require more than 20 minutes of CPU time, use the TORQUE qsub command to submit a batch job (see Submitting jobs below).

Implications of these limits on the login nodes are as follows:

  • The Java Virtual Machine must be invoked with an explicit maximum heap size. Because of the way Java allocates memory, calling Java without the -Xmx##m flag under the login nodes' ulimit settings will produce an error (see the example following this list).

  • Memory-intensive jobs started on the login nodes will be killed almost immediately. Debugging and testing on Mason should be done by submitting a request for an interactive job via the batch system, for example: qsub -I -q shared -l nodes=1:ppn=4,vmem=10gb,walltime=4:00:00

    The interactive session will start as soon as the requested resources are available.
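
For example, an invocation that caps the Java heap at 2 GB might look like the following (the JAR file name is illustrative):

    java -Xmx2048m -jar my_program.jar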

Submitting jobs

To submit a job to run on Mason, use the TORQUE qsub command. If the command exits successfully, it will return a job ID, for example:

[jdoe@Mason]$ qsub job.script
123456.m1.mason
[jdoe@Mason]$

If you need attribute values different from the defaults, but less than the maximum allowed, specify these either in the job script using TORQUE directives, or on the command line with the -l switch. For example, to submit a job that needs more than the default 60 minutes of walltime, use:

qsub -l walltime=10:00:00 job.script

Jobs on Mason default to a per-job virtual memory resource of 8 MB. So, for example, to submit a job that needs 100 GB of virtual memory, use:

qsub -l nodes=1:ppn=4,vmem=100gb job.script

Note: Command-line arguments override directives in the job script, and you may specify many attributes on the command line, either as comma-separated options following the -l switch, or each with its own -l switch. The following two commands are equivalent:

qsub -l nodes=1:ppn=16,vmem=1024mb job.script
qsub -l nodes=1:ppn=16 -l vmem=1024mb job.script
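
For example, a minimal job script that sets several attributes with TORQUE directives might look like the following sketch (the resource requests, job name, and program name are illustrative):

    #!/bin/bash
    #PBS -l nodes=1:ppn=4,vmem=100gb,walltime=10:00:00
    #PBS -m e
    #PBS -N my_assembly_job

    # Run from the directory the job was submitted from
    cd $PBS_O_WORKDIR
    ./my_program

Submit it with qsub job.script; any -l options given on the command line override the corresponding directives in the script.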

Useful qsub options include:

Option Action
-a <date_time> Execute the job only after the specified date and time.
-I Run the job interactively. (Interactive jobs are forced to not be re-runnable.)
-m e Mail a job summary report when the job terminates.
-q <queue name> Specify the destination queue for the job. (Not applicable on Mason.)
-r [y|n] Declare whether the job is re-runnable. Use the argument n if the job is not re-runnable. The default value is y (re-runnable).
-V Export all environment variables in your current environment to the job.

For more, see the qsub manual page.

Monitoring jobs

To monitor the status of a queued or running job, use the TORQUE qstat command. Useful qstat options include:

Option Action
-a Display all jobs.
-f Write a full status display to standard output.
-n List the nodes allocated to a job.
-r Display jobs that are running.
-u user1@host,user2@host Display jobs owned by specified users.
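
For example, to display your own jobs along with the nodes allocated to them (replace username with your username):

    qstat -n -u username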

For more, see the qstat manual page.

Deleting jobs

To delete a queued or running job, use the qdel command.

Occasionally, a node will become unresponsive and unable to respond to the TORQUE server's requests to kill a job. In such cases, try using qdel -W <delay> to override the delay between SIGTERM and SIGKILL signals (for <delay>, specify a value in seconds).
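
For example, to delete a job and, if the first attempt hangs, retry with a 30-second delay between the SIGTERM and SIGKILL signals (the job ID is illustrative):

    qdel 123456.m1.mason
    qdel -W 30 123456.m1.mason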

For more, see the qdel manual page.

Back to top


Support

For IU and NCGAS users

Support for IU and NCGAS users is provided by the UITS High Performance Systems (HPS) and Scientific Applications and Performance Tuning (SciAPT) groups, and by the National Center for Genome Analysis Support (NCGAS):

  • If you have system-specific questions about Mason, email the HPS group.

  • If you have questions about compilers, programming, scientific/numerical libraries, or debuggers on Mason, email the SciAPT group.

  • If you need help installing software packages in your home directory on Mason, email NCGAS.

For XSEDE users

XSEDE users with questions about hardware or software on Mason should contact the XSEDE Help Desk or consult the Indiana University Mason User Guide on the XSEDE User Portal.

For more about XSEDE compute, advanced visualization, storage, and special purpose systems, see the Resources Overview, Systems Monitor, and User Guides. For scheduled maintenance windows, outages, and other announcements related to XSEDE digital services, see User News.

Back to top

This document was developed with support from National Science Foundation (NSF) grant OCI-1053575. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.