Running ClustalW on Big Red
On Big Red, you can use ClustalW to align multiple sequences. A parallel implementation of ClustalW 1.82 (ClustalW-MPI) is installed at:
/N/soft/`whatami`/clustalw-mpi-1.14

The README files are in:
The clustalwjob script submits a job that runs
ClustalW. The clustalwjob script should be in your path
by default, and its manual page should be in your default
path for manual pages. Syntax for clustalwjob is:
Replace options_to_clustalw with the ClustalW command-line options,
n with the number of processors to use, and
h with the maximum number of hours the job should be
allowed to run. If you omit the CPUS option, four
processors will be used. To request more than four processors, specify
an integer value that is a multiple of 4. If you specify a value that
is not a multiple of 4, the value will be increased to the next
multiple of 4. The maximum number of processors is 128 (unless you
also specify a larger queue; see the clustalwjob man
page). For example, to use 16 processors to align amino acid sequences
in file aaseqs, run:
If you omit the -wallhours option, your job will be
allowed to run for two hours. Use the -wallhours option
to request more time, up to 336 hours (14 days). (Queues other than the
default have lower time limits; see the clustalwjob man
page.)
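
The processor and wall-time rules above can be summarized in a short Python sketch. It is only an illustration of the stated defaults and limits for the default queue, not the actual logic of clustalwjob. The function names are hypothetical, and raising an error for out-of-range requests is an assumption; the real script may behave differently.

```python
# Illustrative model of the limits described above (default queue only).
# Function names and error handling are assumptions, not clustalwjob's code.

DEFAULT_CPUS = 4        # used when the CPUS option is omitted
MAX_CPUS = 128          # default-queue maximum
DEFAULT_WALLHOURS = 2   # used when -wallhours is omitted
MAX_WALLHOURS = 336     # 14 days


def effective_cpus(requested=None):
    """Return the processor count after rounding up to a multiple of 4."""
    if requested is None:
        return DEFAULT_CPUS
    if requested < 1:
        raise ValueError("processor count must be positive")
    rounded = -(-requested // 4) * 4  # ceiling to the next multiple of 4
    if rounded > MAX_CPUS:
        # Assumption: treat requests above the default-queue limit as errors.
        raise ValueError("more than 128 processors requires a larger queue")
    return rounded


def effective_wallhours(requested=None):
    """Return the wall-time limit in hours, applying the two-hour default."""
    if requested is None:
        return DEFAULT_WALLHOURS
    if not 0 < requested <= MAX_WALLHOURS:
        raise ValueError("wall time must be between 0 and 336 hours")
    return requested
```

For instance, effective_cpus(13) returns 16, the next multiple of 4, and effective_wallhours() returns the two-hour default.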
Options to ClustalW are usually available by entering the command with no argument, but this feature is not available in the parallel version. Options are listed in:
/N/soft/`whatami`/clustalw-mpi-1.14/README.OPTIONS

Options are listed here for your convenience:
CLUSTAL W (1.82) Multiple Sequence Alignments

clustalw option list:-

    -help
    -check
    -options
    -align
    -newtree=filename
    -usetree=filename
    -newtree1=filename
    -usetree1=filename
    -newtree2=filename
    -usetree2=filename
    -bootstrap
    -tree
    -quicktree
    -convert
    -interactive
    -batch
    -infile=filename
    -profile1=filename
    -profile2=filename
    -type=protein OR dna
    -profile
    -sequences
    -matrix=filename
    -dnamatrix=filename
    -negative
    -noweights
    -gapopen=f
    -gapext=f
    -endgaps
    -nopgap
    -nohgap
    -novgap
    -hgapresidues=string
    -maxdiv=n
    -gapdist=n
    -pwmatrix=filename
    -pwdnamatrix=filename
    -pwgapopen=f
    -pwgapext=f
    -ktuple=n
    -window=n
    -pairgap=n
    -topdiags=n
    -score=percent OR absolute
    -transweight=f
    -seed=n
    -kimura
    -tossgaps
    -bootlabels=node OR branch
    -debug=n
    -output=gcg OR gde OR pir OR phylip OR nexus
    -outputtree=nj OR phylip OR dist OR nexus
    -outfile=filename
    -outorder=input OR aligned
    -case=lower OR upper
    -seqnos=off OR on
    -nosecstr1
    -nosecstr2
    -secstrout=structure OR mask OR both OR none
    -helixgap=n
    -strandgap=n
    -loopgap=n
    -terminalgap=n
    -helixendin=n
    -helixendout=n
    -strandendin=n
    -strandendout=n

When you run clustalwjob, you'll receive a message
when your job is submitted to the queue, and another when the job
finishes. To check the status of your job, use the llq
command.
In addition to output files that clustalwjob produces,
such as .aln files, clustalwjob will produce
files with filenames similar to clustalwjob.9999.err and
clustalwjob.9999.out, where 9999 is the
number of your job. Such files contain information that
clustalwjob would print to the screen if you were running
it interactively from the command line.
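
If you run many jobs, a small Python sketch like the following may help pair up these log files by job number. It assumes the names follow exactly the clustalwjob.NUMBER.err and clustalwjob.NUMBER.out pattern shown above; the text only says the names are "similar to" that, so adjust the pattern if yours differ. The helper names are hypothetical.

```python
import re
from pathlib import Path

# Assumed naming pattern, based on the example names in the text above.
LOG_PATTERN = re.compile(r"^clustalwjob\.(\d+)\.(err|out)$")


def parse_log_name(name):
    """Return (job_number, kind) for a matching log filename, else None."""
    match = LOG_PATTERN.match(name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def find_job_logs(directory="."):
    """Map each job number to its 'err' and 'out' log paths in directory."""
    logs = {}
    for path in Path(directory).iterdir():
        parsed = parse_log_name(path.name)
        if parsed is not None:
            job_number, kind = parsed
            logs.setdefault(job_number, {})[kind] = path
    return logs
```

Calling find_job_logs() in the directory you submitted from returns a dictionary keyed by job number, which makes it easy to open the .err file for a job that did not produce the expected .aln output.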

