TSCC User Guide

Last Updated May 17, 2019

Technical Summary

The Triton Shared Computing Cluster (TSCC) is UC San Diego’s primary research HPC system. It is foremost a "condo cluster" (researcher-purchased computing hardware) that provides access, colocation, and management of a significant shared computing resource, as well as providing a "hotel" service for those with temporary or bursty HPC needs. TSCC provides three kinds of compute nodes in the cluster: General Computing Nodes, GPU Nodes, and Large Memory Nodes.

TSCC is continuously open to new purchases with preferred vendor pricing. See the Purchase Info page for Free Trial information, how to join the Triton Shared Computing Cluster, allocation information, and current specs for available hardware.

System Information

Hardware Specifications

There are three kinds of compute nodes in the cluster: General Computing Nodes, GPU Nodes, and Large Memory Nodes. The TSCC group will periodically update the hardware choices for general computing and GPU condo node purchases, to keep abreast of technological and cost advances. The current specifications for each type of node can be found on the Purchase Info page.

Network

TSCC nodes can have either a single-port 10GbE network interface or an optional EDR InfiniBand (IB) connection to a 32-port IB switch, allowing up to 896 cores to communicate at full bisection bandwidth for low-latency parallel computing.

Storage

TSCC users will receive 100GB of backed-up home file storage, and shared access to the 800+ TB Data Oasis Lustre-based high performance parallel file system. There is a 90-day purge policy on Data Oasis. Data Oasis storage is not backed up, and files stored on this system may be lost or destroyed with no way to restore them. Long-term file storage should be maintained in your $HOME directory or project storage. Project storage can be purchased by sending e-mail to tscc-info@ucsd.edu.

Additional persistent storage can be mounted from lab file servers over the campus network or purchased from SDSC.  For more information, contact tscc-info@ucsd.edu.


System Access

Logging In

TSCC supports command line authentication using your UCSD AD password, or via ssh keys (for non-UCSD users).

To log in to TSCC, use the following hostname:

tscc-login.sdsc.edu

The following are examples of Secure Shell (ssh) commands that may be used to log in to TSCC:

ssh <your_username>@tscc-login.sdsc.edu
ssh -l <your_username> tscc-login.sdsc.edu

Generating SSH Keys

All users can access TSCC with RSA keys. Append your public RSA key to your ~/.ssh/authorized_keys file to enable access from authorized hosts without having to enter your password. Make sure the private key on your local machine is protected by a passphrase. You can use ssh-agent or keychain to avoid repeatedly typing the private key passphrase. To set up ssh key pairs, follow the instructions for generating ssh keys for Linux/Mac or for Windows using PuTTY. Non-UCSD users should send their public key to tscc-support@ucsd.edu.
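As a minimal sketch of the append step described above, the helper below adds a public key to an authorized_keys file only if that exact key line is not already present. The function name append_public_key and the file names in the usage note are illustrative assumptions, not part of any TSCC tooling.

```shell
# Hypothetical helper: append a public key to an authorized_keys file once.
append_public_key() {
    pub_key_file=$1   # e.g. ~/.ssh/id_rsa.pub (assumed file name)
    auth_file=$2      # e.g. ~/.ssh/authorized_keys
    [ -r "$pub_key_file" ] || { echo "cannot read $pub_key_file" >&2; return 1; }
    mkdir -p "$(dirname "$auth_file")" || return 1
    touch "$auth_file" || return 1
    chmod 600 "$auth_file" || return 1
    # -F fixed string, -x whole line, -q quiet: skip keys that are already listed
    if ! grep -Fxq -f "$pub_key_file" "$auth_file"; then
        cat "$pub_key_file" >> "$auth_file"
    fi
}
```

A typical call, assuming the public key has already been copied to TSCC, would be append_public_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys. Calling it twice leaves a single copy of the key in the file.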


Environment Modules

Managing Your Shell Environment

TSCC uses the Environment Modules package to control user environment settings. Below is a brief discussion of its common usage. You can learn more at the Modules home page.

Overview

The Environment Modules package provides for dynamic modification of a shell environment. Module commands set, change, or delete environment variables, typically in support of a particular application. They also let the user choose between different versions of the same software or different combinations of related codes.

For example, if the pgi module and openmpi_ib module are loaded and the user compiles with mpif90, the generated code is compiled with the Portland Group Fortran 90 compiler and MPI libraries utilizing openmpi are linked. By unloading the openmpi_ib module, loading the mvapich2_ib module, and compiling with mpif90, the Portland compiler is used but linked with mvapich2 support.


Default Modules

Several modules that determine the default TSCC environment are loaded at login time. These include the intel and openmpi_ib modules to set the default compiler environment.

Useful Modules Commands

Here are some common module commands and their descriptions:

Command                                 Description
module list                             List the modules that are currently loaded
module avail                            List the modules that are available
module display <module_name>            Show the environment variables used by <module_name> and how they are affected
module unload <module_name>             Remove <module_name> from the environment
module load <module_name>               Load <module_name> into the environment
module swap <module_one> <module_two>   Replace <module_one> with <module_two> in the environment

Note that the order in which modules are loaded is significant. For example, if the pgi module is loaded and the intel module is loaded afterwards, the intel compiler will be used. Also, some modules depend on others and so may be loaded or unloaded as a consequence of another module command. For example, if the intel and mvapich2_ib modules are both loaded, running the command module unload intel will automatically unload mvapich2_ib. Subsequently issuing the module load intel command will not automatically reload mvapich2_ib.

If you find yourself regularly using a set of module commands, you may want to add these to your configuration files (.bashrc for bash users, .cshrc for C shell users).  Complete documentation is available in the module(1) and modulefile(4) manpages.

Module: command not found

The error message module: command not found is sometimes encountered when switching from one shell to another or attempting to run the module command from within a shell script or batch job. The reason that the module command may not be inherited as expected is that it is defined as a function for your login shell. If you encounter this error, execute the following from the command line (interactive shells) or add it to your shell script.

source /etc/profile.d/modules.sh 
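In scripts, it can be convenient to source the init file only when the module command is actually missing. The following is a small sketch of that guard for Bourne-style shells; the function name ensure_module is an assumption made for illustration, and the path is the one given above.

```shell
# Sketch: make the `module` command available in a non-login shell or batch job.
ensure_module() {
    # Nothing to do if `module` is already defined (function, alias, or program).
    if type module >/dev/null 2>&1; then
        return 0
    fi
    # Path taken from the guide above; it may differ on other systems.
    if [ -r /etc/profile.d/modules.sh ]; then
        . /etc/profile.d/modules.sh
        type module >/dev/null 2>&1
        return $?
    fi
    echo "modules init script not found" >&2
    return 1
}
```

A job script could then start with ensure_module || exit 1 before its first module load line.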


Compiling

TSCC provides Intel, Portland Group (PGI), and GNU compilers along with MPI implementations, such as MVAPICH2 and OpenMPI. Intel and OpenMPI are loaded as default modules.

Using the Intel Compilers (Default)

The Intel compilers and the openmpi MPI implementation will be loaded by default. If you have modified your environment, you can reload them by executing the following commands at the Linux prompt or placing them in your startup file (~/.cshrc or ~/.bashrc):

module purge
module load intel openmpi_ib

For AVX2 support, compile with the -xHOST option. Note that -xHOST alone does not enable aggressive optimization, so compilation with -O3 is also suggested. The -fast flag invokes -xHOST, but should be avoided since it also turns on interprocedural optimization (-ipo), which may cause problems in some instances.

Intel MKL libraries are available as part of the "intel" modules on TSCC. Once this module is loaded, the environment variable MKL_ROOT points to the location of the MKL libraries. To determine the link line, refer to the Intel MKL Link Line Advisor web page. This tool accepts inputs for several variables based on your environment and automatically generates a link line for you. When using the output generated by this site, substitute the TSCC path of the Intel MKL for the value $MKLPATH in the generated script. That value is ${MKL_ROOT}/lib/em64t.

For example, to compile a C program statically linking 64-bit ScaLAPACK libraries on TSCC:

mpicc -o pdpttr.exe pdpttr.c \
    -I$MKL_ROOT/include ${MKL_ROOT}/lib/intel64/libmkl_scalapack_lp64.a \
    -Wl,--start-group ${MKL_ROOT}/lib/intel64/libmkl_intel_lp64.a \
    ${MKL_ROOT}/lib/intel64/libmkl_core.a ${MKL_ROOT}/lib/intel64/libmkl_sequential.a \
    -Wl,--end-group ${MKL_ROOT}/lib/intel64/libmkl_blacs_intelmpi_lp64.a -lpthread -lm

For more information on the Intel compilers: [ifort | icc | mpif77 | mpif90 | mpicc ] -help

Language     Serial   MPI      OpenMP          MPI+OpenMP
Fortran 77   ifort    mpif77   ifort -openmp   mpif77 -openmp
Fortran      ifort    mpif90   ifort -openmp   mpif90 -openmp
C/C++        icc      mpicc    icc -openmp     mpicc -openmp


Using the PGI Compilers

The PGI compilers can be loaded by executing the following commands at the Linux prompt or placing in your startup file (~/.cshrc or ~/.bashrc)

module purge
module load pgi mvapich2_ib

For AVX support, compile with -fast.

For more information on the PGI compilers: man [pgf77 | pgf90 | pgcc | mpif90 | mpicc]

Language   Serial   MPI      OpenMP      MPI+OpenMP
F77        pgf77    mpif90   pgf77 -mp   mpif90 -mp
Fortran    pgf90    mpif90   pgf90 -mp   mpif90 -mp
C          pgcc     mpicc    pgcc -mp    mpicc -mp

The Portland Group compilers come with the Optimized ACML library (LAPACK/BLAS/FFT).

To link:

pgf90/pgf77 myprog.f -llapack -lblas


Using the GNU Compilers

The GNU compilers can be loaded by executing the following commands at the Linux prompt or placing in your startup files (~/.cshrc or ~/.bashrc)

module purge
module load gnu openmpi_ib

For AVX support, compile with -mavx. Note that AVX support is only available in version 4.7 or later, so it is necessary to explicitly load the gnu/4.9.2 module until such time that it becomes the default.

For more information on the GNU compilers: man [gfortran | gcc | g++ | mpif90 | mpicc | mpicxx]

Language   Serial     MPI      OpenMP              MPI+OpenMP
Fortran    gfortran   mpif90   gfortran -fopenmp   mpif90 -fopenmp
C          gcc        mpicc    gcc -fopenmp        mpicc -fopenmp
C++        g++        mpicxx   g++ -fopenmp        mpicxx -fopenmp

Notes and Hints

  • The mpif90 and mpicc commands are actually wrappers that call the appropriate serial compilers and load the correct MPI libraries. While the same names are used for the Intel, PGI and GNU compilers, keep in mind that these are completely independent scripts.
  • If you use the PGI or GNU compilers or switch between compilers for different applications, make sure that you load the appropriate modules before running your executables.
  • When building OpenMP applications and moving between different compilers, one of the most common errors is to use the wrong flag to enable handling of OpenMP directives. Note that Intel, PGI, and GNU compilers use the -openmp, -mp, and -fopenmp flags, respectively.
  • Explicitly set the optimization level in your makefiles or compilation scripts. Most well written codes can safely use the highest optimization level (-O3), but many compilers set lower default levels (e.g. GNU compilers use the default -O0, which turns off all optimizations).
  • Turn off debugging, profiling, and bounds checking when building executables intended for production runs as these can seriously impact performance. These options are all disabled by default. The flag used for bounds checking is compiler dependent, but the debugging (-g) and profiling (-pg) flags tend to be the same for all major compilers.
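Because the OpenMP flag differs between compiler families, a build script can look it up from the compiler name instead of hard-coding it. The sketch below encodes only the flags listed in the notes above; the function name openmp_flag is a hypothetical helper, not something provided on TSCC.

```shell
# Sketch: map a serial compiler name to the OpenMP flag listed in this guide.
openmp_flag() {
    case "$1" in
        ifort|icc)        echo "-openmp" ;;    # Intel
        pgf77|pgf90|pgcc) echo "-mp" ;;        # PGI
        gfortran|gcc|g++) echo "-fopenmp" ;;   # GNU
        *) echo "unknown compiler: $1" >&2; return 1 ;;
    esac
}
```

For example, a build line could be written as gcc -O3 $(openmp_flag gcc) -o myprog myprog.c, where myprog.c stands in for your own source file.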


Running Jobs on TSCC

Important Guidelines for Running Jobs

  • Please do not write job output to your home directory (/home/$USER). NFS filesystems have a single server that handles all the metadata and storage requirements. This means that if a job writes from multiple compute nodes and cores, the load is focused on this one server.
  • The Lustre parallel filesystem (/oasis/tscc/scratch) is optimized for efficient handling of large files; however, it does not work nearly as well when writing many small files. We recommend using this filesystem only if your metadata load is modest, i.e., you have O(10)-O(200) files open simultaneously. The Lustre parallel filesystem has a 90-day purge policy, so please move your files to a more persistent storage location.
  • Use local scratch ($TMPDIR) if your job writes a lot of files from each task. The local scratch filesystem is purged at the end of each job, so you will need to copy out files that you want to retain after the job completes.
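The local-scratch guideline can be sketched as a stage-and-copy pattern: write the many small files under $TMPDIR, then bundle them into one archive at a persistent location before the job ends. The function name stage_and_run, the archive name results.tar, and the loop that stands in for a real application are all assumptions made for illustration.

```shell
# Sketch: work in node-local scratch, then copy results out as a single file.
stage_and_run() {
    dest_dir=$1   # persistent location for the results
    # Falls back to /tmp when TMPDIR is not set (e.g. outside a batch job).
    work_dir=$(mktemp -d "${TMPDIR:-/tmp}/job.XXXXXX") || return 1
    # Stand-in for the real application: writes several small files locally.
    for i in 1 2 3; do
        echo "result $i" > "$work_dir/part-$i.txt"
    done
    # One large archive is friendlier to Lustre than many small files.
    mkdir -p "$dest_dir" &&
    tar -C "$work_dir" -cf "$dest_dir/results.tar" . &&
    rm -rf "$work_dir"
}
```

In a job script this might be called as stage_and_run /oasis/tscc/scratch/<user name>/run1, with the loop replaced by the actual program.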


Running Jobs with TORQUE

TSCC uses the TORQUE Resource Manager (also known by its historical name Portable Batch System, or PBS) with the Maui Cluster Scheduler to define and manage job queues. TORQUE allows the user to submit one or more jobs for execution, using parameters specified in a job script.

Job Queue Characteristics

The default walltime for all queues is now one hour. Max cores has been updated on some queues as well. Max walltimes are still in force per the below list.

The limits are provided for each partition in the table below.

Partition Name   Max Walltime   Max Processors/User   Max Running + Queued Jobs   Accessible by   Comments
hotel            168 hrs        128                   1500                        all             Regular compute nodes for all users
gpu-hotel        336 hrs        --                    --                          all             GPU nodes for all users
pdafm            168 hrs        96                    50                          all             pdafm nodes for all users
home             unlimited      unlimited             1500                        condo           Home node(s) for condo participants
condo            8 hrs          512                   1500                        condo           Compute nodes for condo participants
gpu-condo        8 hrs          84                    --                          condo           GPU nodes for condo participants
glean            1 hr           1024                  500                         condo           Pre-emptible nodes for condo participants, free of charge

 

The intended uses for the submit queues are as follows:

  • hotel: The hotel queue supports all non-contributor users of TSCC. Jobs submitted to this queue will use only the hotel nodes.
  • home: The home queue is a routing queue intended for all submissions to group-specific condo clusters. Users intending to run only within the nodes their group has contributed should submit to this queue. Users may belong to more than one home group; in this case, a default group will be in effect, and a non-default group must be specified in the job submission. The home queue does not have a maximum time limit. The home queue will route jobs to specific queues based on the submitter's group membership.
  • condo: The condo queue is exclusive to contributors, but allows jobs to run on nodes in addition to those purchased. This means that a project can use more cores than it contributed. The queue limits the run time to eight hours to allow the node owners to have access per their contracted agreement.
  • glean: The glean queue will allow jobs to run, free of charge, on any idle condo nodes. These jobs will be terminated whenever the other queues receive job submissions that can use the idle nodes. This queue is exclusive to condo participants.

For the hotel, condo, pdafm and home queues, job charges are based on the number of cores allocated. Memory is allocated in proportion to the number of cores on the node.

Queue Usage Policies and Restrictions

All nodes in the system are shared. The number of cores on a given node dictates the maximum number of jobs that can be run on it. For example, a node with 16 cores can run up to 16 jobs.

Jobs are scheduled by priority. Jobs submitted to home queues have higher priority than those submitted to condo or glean. If the system is sufficiently busy that all available processors are in use and both the home and condo queues have jobs waiting, the home jobs will run first.

A maximum of 10 jobs per user are eligible for scheduling per scheduler iteration.

All TSCC nodes have slightly less than the specified amount of memory available, due to system overhead. Jobs that attempt to use more than their proportional share of memory will be killed.
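Since memory is allocated in proportion to cores, a job that needs more memory has to request more cores. The sketch below works through that arithmetic; the function name cores_for_memory is hypothetical, and the node size used in the example (64 GB, 16 cores) is an assumed figure, not a TSCC specification.

```shell
# Sketch: smallest core count whose proportional memory share covers a job.
# Arguments: node memory in GB, cores per node, memory the job needs in GB.
cores_for_memory() {
    node_mem_gb=$1
    node_cores=$2
    need_gb=$3
    # Integer ceiling of need_gb / (node_mem_gb / node_cores)
    echo $(( (need_gb * node_cores + node_mem_gb - 1) / node_mem_gb ))
}
```

On the assumed 64 GB, 16-core node each core carries 4 GB, so cores_for_memory 64 16 10 prints 3. Because the usable memory is slightly below the nominal figure, it is prudent to leave some headroom.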


Submitting a Job

To submit a script to TORQUE:

qsub <batch_script>

The following is an example of a TORQUE batch script for running an MPI job. The script lines are discussed in the comments that follow.

#!/bin/csh
#PBS -q <queue name>
#PBS -N <job name>
#PBS -l nodes=10:ppn=2
#PBS -l walltime=0:50:00
#PBS -o <output file>
#PBS -e <error file>
#PBS -V
#PBS -M <email address list>
#PBS -m abe
#PBS -A <account>
cd /oasis/tscc/scratch/<user name>
mpirun -v -machinefile $PBS_NODEFILE -np 20 <./mpi.out>

Comments for the above script:

#PBS -q <queue name>

Specify the queue to which the job is being submitted, one of:

  • hotel
  • gpu-hotel
  • condo
  • gpu-condo
  • pdafm
  • glean

#PBS -N <job name>

Specify the name of the job.

#PBS -l nodes=10:ppn=2

Request 10 nodes and 2 processors per node.

#PBS -l walltime=0:50:00

Reserve the requested nodes for 50 minutes.

#PBS -o <output file>

Redirect standard output to a file.

#PBS -e <error file>

Redirect standard error to a file.

#PBS -V

Export all user environment variables to the job.

#PBS -M <email address list>

List users, separated by commas, to receive email from this job.

#PBS -m abe

Set of conditions under which the execution server will send email about the job: (a)bort, (b)egin, (e)nd.

#PBS -A <account>

Specify account to be charged for running the job; optional if user has only one account.

If more than one account is available and this line is omitted, job will be charged to default account. To ensure the correct account is charged, it is recommended that the -A option always be used.

cd /oasis/tscc/scratch/<user name>

Change to user's working directory in the Lustre filesystem.

mpirun -v -machinefile $PBS_NODEFILE -np 20 <./mpi.out>

Run as a parallel job, in verbose output mode, using 20 processors, on the nodes listed in the file referenced by $PBS_NODEFILE. Replace <./mpi.out> with the path to your MPI executable.
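The -np value has to match nodes x ppn (10 x 2 = 20 in the script above). One way to avoid keeping the two in sync by hand is to count the lines of the node file, which lists one entry per allocated processor. This is a sketch for Bourne-style job scripts (the example above uses csh), and count_slots is a hypothetical helper name.

```shell
# Sketch: count the processor slots listed in a TORQUE node file.
count_slots() {
    # The node file has one line per allocated processor.
    wc -l < "$1" | tr -d ' '
}
```

Inside a job, the launch line could then read mpirun -machinefile $PBS_NODEFILE -np $(count_slots "$PBS_NODEFILE") followed by the executable.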

To reduce email load on the mailservers, please specify an email address in your TORQUE script. For example:

#PBS -M <your_username@ucsd.edu>

#PBS -m mail_options

or using the command line:

qsub -m mail_options -M <your_username@ucsd.edu>

These mail_options are available:

n: no mail is sent.
a: mail is sent when the job is aborted by the batch system.
b: mail is sent when the job begins execution.
e: mail is sent when the job terminates.

Example Scripts for Applications

SDSC User Services staff have developed sample TORQUE scripts for common applications. They are available in the /projects/builder-group/examples directory on TSCC.


GPU Queue Details

The GPU nodes have their own queues, named gpu-hotel and gpu-condo. The GPUs on a GPU node are allocated in proportion to the total number of cores present. For example, if a GPU node has 12 CPU cores and 4 GPUs, then three cores should be requested to access one GPU.

In general, the number of cores to request per GPU = total number of cores / total number of GPUs.

To use one GPU on a GPU node with 12 cores and 4 GPUs, use a command similar to:

qsub -I -q gpu-hotel -l nodes=1:ppn=3
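The cores-per-GPU rule can be written as a one-line calculation. The sketch below assumes the core count divides evenly by the GPU count, as in the 12-core, 4-GPU example; ppn_for_gpus is a hypothetical helper name.

```shell
# Sketch: ppn value to request for a given number of GPUs.
# Arguments: cores on the node, GPUs on the node, GPUs wanted.
ppn_for_gpus() {
    node_cores=$1
    node_gpus=$2
    want_gpus=$3
    # cores per GPU, times the number of GPUs wanted (assumes even division)
    echo $(( node_cores / node_gpus * want_gpus ))
}
```

For the node described above, ppn_for_gpus 12 4 1 prints 3 and ppn_for_gpus 12 4 2 prints 6.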

Allocated GPUs are referenced by the CUDA_VISIBLE_DEVICES environment variable. Applications using the CUDA libraries will discover GPU allocations through that variable. Users do not need to set the CUDA_VISIBLE_DEVICES variable.

The condo-based GPUs are accessible through the gpu-condo queue, which provides users who have contributed GPU nodes to the cluster with access to each other's nodes. Like the general computing condo queue, jobs submitted to this queue have an 8-hour time limit.

Condo owners can glean cycles on condo GPU node(s) via the general glean queue. To run on GPU nodes in the glean queue, add :gpu to the node resource specification. For example:

qsub -I -q glean -l nodes=1:ppn=3:gpu


Submitting an Interactive Job

In order to run interactive parallel batch jobs on TSCC, use the qsub -I command, which provides a login to the launch node as well as the PBS_NODEFILE file with all nodes assigned to the interactive job.  As with any job, the interactive job will wait in the queue until the specified number of nodes become available. Requesting fewer nodes and shorter wall clock times may reduce the wait time because the job can more easily backfill among larger jobs.

The following is an example of a TORQUE command for running an interactive job with a wall time of 50 minutes using two nodes and ten processors per node.

qsub -I -l nodes=2:ppn=10 -l walltime=0:50:00

 

The showbf command gives information on available time slots. This provides an accurate prediction of when the submitted job will be allowed to run.

The standard input, output, and error streams of the job are connected through qsub to the terminal session in which qsub is running. Use the exit command to end an interactive job.

To use a graphical user interface (GUI) as part of your interactive job, you will need to set up X11 forwarding.

For example, when using XQuartz on a Mac, take the following additional steps to display the GUI:

  1. Set X11 forwarding on the computer that you are connecting from.
  2. Log in to TSCC using ssh -Y username@tscc-login.sdsc.edu
  3. Submit the interactive job using the -X option; for example:

        qsub -X -I -l nodes=2:ppn=10 -l walltime=0:50:00


Examples

To run an interactive job with a wall clock limit of 30 minutes using two nodes and two processors per node:

$ qsub -I -V -l walltime=00:30:00 -l nodes=2:ppn=2
qsub: waiting for job 75.tscc-login.sdsc.edu to start
qsub: job 75.tscc-login.sdsc.edu ready
$ echo $PBS_NODEFILE
/opt/torque/aux/75.tscc-login.sdsc.edu
$ more /opt/torque/aux/75.tscc-login.sdsc.edu
compute-0-31
compute-0-31
compute-0-25
compute-0-25
$ mpirun -machinefile /opt/torque/aux/75.tscc-login.sdsc.edu -np 4 hostname
compute-0-25.local
compute-0-25.local
compute-0-31.local
compute-0-31.local

To run an interactive job with a wall clock limit of 30 minutes using two PDAF nodes and 32 processors per node:

qsub -I -q pdafm -l walltime=00:30:00 -l nodes=2:ppn=32

The following is an example of a TORQUE command for running an interactive job.

qsub -I -l nodes=10:ppn=2 -l walltime=0:50:00



Monitoring the Batch Queues

Useful TORQUE Commands
Command Description
qstat -a Display the status of batch jobs
qdel <pbs_jobid> Delete (cancel) a queued job
qstat -r Show all running jobs on system
qstat -f <pbs_jobid> Show detailed information of the specified job
qstat -q Show all queues on system
qstat -Q Show queue limits for all queues
qstat -B Show quick information of the server
pbsnodes -a Show node status

*View the qstat manpage for more options.

Users can monitor batch queues using the following commands:

qstat: List jobs in the queue

$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
90.tscc-46                PBStest          hocks                  0 R hotel
91.tscc-46                PBStest          hocks                  0 Q hotel
92.tscc-46                PBStest          hocks                  0 Q hotel
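For a quick summary, the qstat listing can be filtered by the state column. The sketch below assumes the six-column layout shown above (state letter in the fifth column, two header lines); count_jobs_by_state is a hypothetical helper name.

```shell
# Sketch: count jobs in a given state from default `qstat` output on stdin.
count_jobs_by_state() {
    # $1 is the state letter, e.g. R (running) or Q (queued).
    # NR > 2 skips the column header and the dashed separator line.
    awk -v s="$1" 'NR > 2 && $5 == s { n++ } END { print n + 0 }'
}
```

With the sample listing above, qstat | count_jobs_by_state Q would print 2 and qstat | count_jobs_by_state R would print 1.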

 

showq: Lists job information

$ showq
active jobs------------------------
JOBID              USERNAME      STATE PROCS   REMAINING            STARTTIME
94                    hocks    Running     8    00:09:53  Fri Apr  3 13:40:43
1 active job               8 of 16 processors in use by local jobs (50.00%)
                            8 of 8 nodes active      (100.00%)

eligible jobs----------------------
JOBID              USERNAME      STATE PROCS     WCLIMIT              QUEUETIME
95                    hocks       Idle     8    00:10:00  Fri Apr  3  13:40:04
96                    hocks       Idle     8    00:10:00  Fri Apr  3  13:40:05
2 eligible jobs

blocked jobs-----------------------
JOBID              USERNAME      STATE PROCS     WCLIMIT             QUEUETIME
0 blocked jobs
Total jobs:  3

showbf: Shows backfill opportunities (Users who are trying to choose parameters that allow their jobs to run more quickly may find this a convenient way to find open nodes and time slots)

$ showbf
Partition     Tasks  Nodes      Duration   StartOffset       StartDate
---------     -----  -----  ------------  ------------  --------------
ALL               8      8      INFINITY      00:00:00  13:45:30_04/03


yqd: Shows job information (diagnose why submitted jobs remain queued)

Usage:  yqd --user=id --queue=queue

$ yqd 13977625
13977625 (xxxxxx home-YXX 1x24 ['haswell'] 0:12:23): 0 nodes free

$ yqd 13976617
13976617 (xxxxx home-abc 1x3 ['gpu1080ti'] 4:48:23): No nodes available

lsjobs:  Report the users and jobs running on each compute node

Usage: lsjobs --property=xxx-node
$ lsjobs --property=xxx-node 
tscc-0-40: 13969396/yyyyy/47:00:44/home-YXXx24 
tscc-0-41: 13969434/yyyyy/50:23:20/home-YXXx24

 

checkjob:  Shows the current status of the jobs

Usage: checkjob job_identifier
$ checkjob 13976617
Reservation '13976617' (9:19:34:24 -> 12:19:34:24  Duration: 3:00:00:00)
PE:  3.00  StartPriority:  203174
job cannot run in partition DEFAULT (idle procs do not meet requirements : 0 of 3 procs found)
idle procs: 3995  feasible procs:   0
....


Usage Monitoring

The following commands will report the current account balances:

gbalance -u <user_name>

or

gbalance -p <group_name>

The following commands will generate a summary of all activity on the account (please be sure to always include the start and end times as parameters):

gstatement -u <user_name> -s 2019-01-01 -e 2019-02-28

gstatement -p <group_name> -s 2019-01-01 -e 2019-02-28

Obtaining Support for TSCC Jobs

For any questions, please send email to tscc-support@ucsd.edu.


Debugging with DDT

The following describes a basic setup of a DDT debugging session. The procedure diverges between steps 9 & 10, where the option to start a job from the queue using the debugger or to attach to an already running process is presented.

Step-by-Step Tutorial

Follow this procedure to learn debugging on TSCC using DDT

  1. Login to TSCC with X11 forwarding turned on (-X option to ssh command)
  2. Run this command to set up your environment:

    module load ddt

    This is equivalent to putting this line in your .bash_profile file:

    export DDT_LICENSE_FILE=/opt/ddt/License.client

    and then running this command to reload the current shell environment:

    . ~/.bash_profile
  3. Make sure your code is compiled with optimization turned off by compiling with -O0 (that is capital letter "O" followed by number zero), and symbol table information enabled by compiling with the -g option
  4. Run this command to start the DDT client:

    /opt/ddt/bin/ddt
  5. To start a debugging session, from the "Session" menu select "New Session" and then "Run" from the submenu.
    Screenshot of New Session->Run menu selection
  6. In the "Run" window, enter the full path to the executable in the "Application" field and any command line arguments in the "Arguments" field.
    Screenshot of Application Path selection
  7. In the "Run" window, click the "Change" button to bring up the Options window. On the "System Settings" tab, select the proper "MPI Implementation", or select "generic" if you encounter a problem while debugging. Select "none" to debug a serial or non-mpi code.
    Screenshot of Options->System tab edit
  8. If you are running an interactive debugging job or plan to attach to a running job, specify a hosts file in the "Attach hosts file" field and add host names to that file, one line per host.
  9. To submit a job for debugging through the queue:
    1. Still in the "Options" window, click on the "Job Submission" tab.
      Screenshot of Options->Job Submission tab edit
    2. Check "Submit job through queue or configure own 'mpirun' command".
    3. To use the predefined template for a PBS (TORQUE) job, click browse (the folder icon) and select the /opt/ddt/templates/pbs.qtf file.
    4. In the "Submit command" field, enter qsub. Leave the "Regexp for job id" field blank. In the "Cancel command" field, enter qdel. In the "Display command" field, enter qstat.
    5. Check the "NUM_NODES_TAG and PROCS_PER_NODE_TAG" box, and enter "8" in the "PROCS_PER_NODE_TAG" field (standard for the batch queue).
    6. Click on the "Edit Queue Submission Parameters..." button to bring up the "Queue Submission Parameters" window.
      Screenshot of Options->Queue Submission Parameters edit
    7. Enter the wall clock time limit (in HH:MM:SS format) and the queue name. If you are using mpi, enter the full path to mpirun.
    8. Click "OK" to return to the Options window, then click OK in the Options window to return to the Run window.
    9. In the Run window, select the number of nodes you want to allocate and then select the "Submit" button.
      Screenshot of Run->Number of Nodes edit
    10. Wait for the jobs to start...
      Screenshot of submitted job wait dialog
    11. Enjoy!
      Screenshot of running debug session
  10. To attach to a running job:
    1. From the "Session" menu, select "New Session" and then "Attach" from the submenu.
    2. In the field "Filter for process names containing" enter the name of the executable (just the name is sufficient, do not enter the full path).
    3. Based on the host names in the hosts file (see step 8), DDT will scan the specified hosts for processes with the given name and attempt to attach to them. If you have submitted a job to the queue, obtain the host list from (for example) checkjob <job number>.

Note:

You can download the DDT User Guide from the Allinea web site.


Bundling Serial Jobs

How to submit multiple serial jobs with a single script

Occasionally, a group of serial jobs needs to be run on TSCC. Rather than submitting each job individually, they can be grouped together and submitted using a single batch script procedure such as the one described below.


Overview

Although it's preferable to run parallel codes whenever possible, sometimes that is not cost-effective, or the tasks are simply not parallelizable. In that case, using a procedure like this can save time and effort by organizing multiple serial jobs into a single input file and submitting them all in one step.

The code for this process is given below in a very simple example that uses basic shell commands as the serial tasks. Your complex serial tasks can easily be substituted for those commands and submitted using a modified version of these scripts and run from your own home directory.

Note that the /home filesystem on TSCC uses autofs. Under autofs, filesystems are not always visible to the ls command. If you cd to the /home/beta directory, for example, it will get mounted and become accessible. You can also reference it explicitly, e.g. ls /home/beta, to verify its availability. Autofs is used to minimize the number of mounts visible to active nodes. All users have their own filesystem for their home directory.

The components used in this example include:

  • submit.qsub (batch script to run to submit the job)
  • my_script.pl (perl script invoked by batch job)
  • jobs-list (input to perl script with names of serial jobs)
  • getid (executable to obtain the processor number, called by perl script)


Example Batch File

The following is an example script that can be modified to suit users with similar needs. This file is named submit.qsub.

#!/bin/sh 
# 
#PBS -q hotel 
#PBS -m e 
#PBS -o outfile 
#PBS -e errfile 
#PBS -V
  
################################################################### 
### Update the below variables with correct values  

### Name your job here again 
#PBS -N jobname  

### Put your node count and time here 
#PBS -l nodes=1:ppn=5
#PBS -l walltime=00:10:00  

### Put your notification E-mail ID here 
#PBS -M username@some.domain  

### Set this to the working directory 
cd /home/beta/scripts/bundling  

####################################################################  

## Run my parallel job 
/opt/openmpi_pgimx/bin/mpirun -machinefile $PBS_NODEFILE -np 5 \
./my_script.pl jobs-list


Example Script and Input Files

The above batch script refers to this script file, named my_script.pl.

#!/usr/bin/perl
#
# This script executes one command from a list file,
# chosen by the current MPI id.
#
# Last modified: Mar/11/2005
#

# call getid to get the MPI id number and the number of processes

($myid,$numprocs) = split(/\s+/,`./getid`);

# MPI ids start at 0, lines in the list file start at 1
$file_id = $myid;
$file_id++;

# open file and execute appropriate command
$file_to_use = $ARGV[0];
open (INPUT_FILE, $file_to_use) or &showhelp;

# read up to line number $file_id; $buf ends up holding that line
for ($i = 1; $i <= $file_id; $i++) {
    $buf = <INPUT_FILE>;
}

close INPUT_FILE;

# run the selected command, if the file had that many lines
system("$buf") if defined $buf;

# print usage and stop; called when the list file cannot be opened
sub showhelp {
        print "\nUsage: my_script.pl <filename>\n\n";
        print "<filename> should contain a list of executables,";
        print " one-per-line, including the path.\n\n";
        exit 1;
}

The batch script refers to this input file, named jobs-list.

hostname; date 
hostname; ls 
hostname; uptime 
uptime 
uptime > line-5
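
The mapping between MPI id and input line can be sketched in shell as follows. This is only an illustration of the selection logic in my_script.pl, not a replacement for it: the function name select_job is hypothetical, and the loop stands in for the five MPI processes that would each obtain their own id from getid.

```shell
#!/bin/sh
# select_job RANK FILE
# Print the command that the process with MPI id RANK would run:
# id 0 gets line 1 of FILE, id 1 gets line 2, and so on.
# (select_job is an illustrative name; the Perl script does this inline.)
select_job() {
    sed -n "$(( $1 + 1 ))p" "$2"
}

# Demonstration with a throwaway copy of the jobs-list shown above.
list=$(mktemp)
cat > "$list" <<'EOF'
hostname; date
hostname; ls
hostname; uptime
uptime
uptime > line-5
EOF

# Show which command each of the five ids would pick (nothing is executed).
rank=0
while [ "$rank" -lt 5 ]; do
    echo "id $rank -> $(select_job "$rank" "$list")"
    rank=$(( rank + 1 ))
done

rm -f "$list"
```

An id beyond the last line of the file selects nothing, which is why the number of lines in jobs-list should match the number of processes requested.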

Sample Output

Running the above batch script writes output like the following to the file outfile. Because the tasks run concurrently and share this one file, their output lines are interleaved rather than grouped by task.

12:20:53 up 3 days, 5:41, 0 users, load average: 0.92, 1.00, 0.99 
compute-0-51.local 
compute-0-51.local 
Wed Aug 19 12:20:53 PDT 2009 
12:20:53 up 3 days, 5:41, 0 users, load average: 0.92, 1.00, 0.99 
compute-0-51.local 
getid getid.c jobs-list line-5 my_script.pl submit.qsub

Line 5 of the jobs-list file redirects its output to the file line-5, so that output does not appear in the shared output file. The file line-5 contains output like this:

12:20:53 up 3 days, 5:41, 0 users, load average: 0.92, 1.00, 0.99

Summary and Potential Other Uses

TSCC User Support (member-only list) can provide a modified version of this procedure that distributes the scripts across the available processors when there are more scripts to run than processors.
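
One way such a distribution could work is a round-robin assignment, sketched below. This is an assumption made for illustration and not the version distributed by TSCC User Support; the function names jobs_for_rank and run_jobs_for_rank are hypothetical.

```shell
#!/bin/sh
# jobs_for_rank RANK NPROCS FILE
# Print every line of FILE assigned to RANK under round-robin distribution:
# line numbers RANK+1, RANK+1+NPROCS, RANK+1+2*NPROCS, ...
jobs_for_rank() {
    awk -v r="$1" -v n="$2" '(NR - 1) % n == r' "$3"
}

# run_jobs_for_rank RANK NPROCS FILE
# Run each assigned line as a shell command, one after another.
# stdin is redirected so a command cannot consume the remaining job lines.
run_jobs_for_rank() {
    jobs_for_rank "$1" "$2" "$3" | while IFS= read -r cmd; do
        if [ -n "$cmd" ]; then
            sh -c "$cmd" </dev/null
        fi
    done
}

# Demonstration: five jobs shared between two processes.
list=$(mktemp)
printf '%s\n' 'echo job1' 'echo job2' 'echo job3' 'echo job4' 'echo job5' > "$list"
echo "id 0 would run:"; jobs_for_rank 0 2 "$list"
echo "id 1 would run:"; jobs_for_rank 1 2 "$list"
rm -f "$list"
```

With this scheme each process runs its assigned commands serially, so the total run time is governed by the process that receives the longest set of tasks.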

It should also be possible to modify this script to run parallel jobs. Feel free to try it or ask support for help through the TSCC Discussion List.

TSCC Software

Installed and Supported Software

TSCC runs CentOS version 6.3. Over 50 additional software applications and libraries are installed on the system, and system administrators regularly work with researchers to extend this set as time and cost allow.

Applications Software

This list is subject to change. Specifics of installed location, version and other details may change as the packages are updated. Please contact tscc-support@ucsd.edu for details.

Package | Topic Area | License Type | Package Home Page | User Install Location | Installed on: (L)ogin, (C)ompute, (B)oth
bbcp | Data Transfer | Private License | bbcp Home Page | /opt/bbcp | B
ATLAS (Automatically Tuned Linear Algebra Software) | Mathematics | BSD | ATLAS Home Page | /opt/atlas | B
AMOS (A Modular, Open-Source whole genome assembler) | Genomics | OSI | AMOS Home Page | /opt/amos | B
Amber (Assisted Model Building with Energy Refinement) | Molecular Dynamics | Amber 12 Software License | Amber Home Page | /opt/amber | C
ABySS (Assembly by Short Sequences) | Genomics | BC Cancer Agency academic | ABySS Home Page | /opt/abyss | B
APBS (Adaptive Poisson-Boltzmann Solver) | Bioinformatics | BSD, MIT | APBS Home Page | /opt/apbs | C
BEAST | Bioinformatics, Phylogenetics | GNU LGPL | BEAST Home Page | /opt/beast | C
BLAT | Bioinformatics, Genetics | Free Non-commercial | BLAT User Guide | /opt/biotools/blat | B
bbFTP | Large Parallel File Transfer | GNU GPL | bbFTP Home Page | /opt/bbftp | B
Boost | C++ Library | Boost Software License | Boost Home Page | /opt/boost | B
Bowtie Short Read Aligner | Bioinformatics | GNU GPL | Bowtie Home Page | /opt/biotools/bowtie | B
Burrows-Wheeler Aligner (BWA) | Bioinformatics | GNU GPL | BWA Home Page | /opt/biotools/bwa | B
Cilk | Parallel Programming | GNU GPL | Cilk Home Page | /opt/cilk | B
CPMD | Molecular Dynamics | Free for noncommercial research | CPMD Home Page | /opt/cpmd/bin | B
DDT | Graphical Parallel Debugger | Licensed | DDT Home Page | /opt/ddt | B
FFTW | General | GNU GPL | FFTW Home Page | /opt/fftw | B
FPMPI | MPI Programming | Licensed | FPMPI Home Page | /opt/mvapich2 and /opt/openmpi | B
FSA (Fast Statistical Alignment) | Genetics | Licensed | FSA Home Page | /opt/fsa | TBD
GAMESS | Chemistry | No-cost Site License | GAMESS Home Page | /opt/gamess | B
Gaussian | Structure Modeling | Commercial | Gaussian Home Page | /opt/gaussian | B
Genome Analysis Toolkit (GATK) | Bioinformatics | BSD Open Source | GATK Home Page | /opt/biotools/GenomeAnalysisTK | B
GROMACS (Groningen Machine for Chemical Simulations) | Molecular Dynamics | GNU GPL | Gromacs Home Page | /opt/gromacs | B
GSL (GNU Scientific Library) | C/C++ Library | GNU GPL | GSL Home Page | /opt/gsl | B
IDL | Visualization | Licensed | IDL Home Page | /opt/idl | B
IPython | Parallel Computing | BSD | IPython Home Page | /opt/ipython | B
JAGS (Just Another Gibbs Sampler) | Statistical Analysis | GNU GPL, MIT | JAGS Home Page | /opt/jags | B
LAPACK (Linear Algebra PACKage) | Mathematics | BSD | LAPACK Home Page | /opt/lapack | B
matplotlib | Python Graphing Library | PSF (Python Software Foundation) | matplotlib Home Page | /opt/scipy/lib64/python2.4/site-packages | B
LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) | Molecular Dynamics Simulator | GPL | LAMMPS Home Page | /opt/lammps | C
MATLAB | Parallel Development Environment | Licensed | MATLAB Home Page | /opt/matlab.2011a and /opt/matlab_server_2012b | B
Mono | Development Framework | MIT, GNU GPL | Mono Home Page | /opt/mono | B
MUMmer | Bioinformatics | Artistic License | MUMmer Home Page | /opt/mummer | B
mxml (Mini-XML) | Bioinformatics | GNU GPL | Mini-XML Home Page | /opt/mxml | B
Octave | Numerical Computation | GNU GPL | Octave Home Page | /opt/octave | B
openmpi | Parallel Library | Generic | Open MPI Home Page | /opt/openmpi/intel/mx or /opt/openmpi/pgi/mx or /opt/openmpi/gnu/mx | B
NAMD | Molecular Dynamics, BioInformatics | Non-Exclusive, Non-Commercial Use | NAMD Home Page | /opt/namd | C
NCO | NetCDF Support | Generic | NCO Home Page | /opt/nco/intel or /opt/nco/pgi or /opt/nco/gnu | B
NetCDF | General | Licensed (free) | NetCDF Home | Environment Module | B
NoSE (Network Simulation Environment) | Networking | GNU GPL | NoSE Home | Perl Module | B
NumPy (Numerical Python) | Scientific Calculation | BSD | NumPy Home | /opt/scipy/lib64/python2.4/site-packages | B
NWChem | Chemistry | EMSL (free) | NWChem Home Page | /opt/nwchem | B
ParMETIS (Parallel Graph Partitioning and Fill-reducing Matrix Ordering) | Numerical Computation | Open | ParMETIS Home Page | /opt/parmetis | B
PDT (Program Database Toolkit) | Software Engineering | Open | PDT Home Page | /opt/pdt | B
PETSc (Portable, Extensible Toolkit for Scientific Computation) | Mathematics | Open | PETSc Home Page | /opt/petsc | B
PyFITS | Astrophysics | BSD | PyFITS Home Page | /opt/scipy/lib64/python2.4/site-packages | B
Python | General Scripting | BSD | Python Home Page | /opt/python/bin | B
pytz | Python TimeZone Module | MIT | PyTZ Home Page | /opt/scipy/lib64/python2.4/site-packages | B
R | Statistical Computing and Graphics | GNU GPL | R Home Page | /opt/R | B
RapidMiner | Data Mining | Open Source under AGPL3 | RapidMiner Home Page | /opt/rapidminer | B
SAMTools (Sequence Alignment/Map) | Bioinformatics | BSD, MIT | SAMtools Home Page | /opt/biotools/samtools | B
ScaLAPACK (Scalable Linear Algebra PACKage) | Mathematics | modified BSD | ScaLAPACK Home Page | /opt/scalapack | B
SciPy (Scientific Python) | Scientific Computing | BSD | SciPy Home Page | /opt/scipy | B
SIESTA | Molecular Dynamics | SIESTA LICENCE for COMPUTER CENTRES | SIESTA Home Page | /opt/siesta | B
SOAPdenovo (Short Oligonucleotide Analysis Package) | Bioinformatics | GNU GPLv3 | SOAPdenovo Home Page | /opt/biotools/soapdenovo | B
SPRNG (The Scalable Parallel Random Number Generators Library) | General Scientific | None | SPRNG Home Page | /opt/sprng | B
SuperLU | Mathematics | Regents of the University of California | SuperLU Home Page | /opt/superlu | B
TAU | Tuning and Analysis Utilities | GNU GPL | TAU Home Page | /opt/tau/intel or /opt/tau/pgi or /opt/tau/gnu | B
Tecplot | Simulation Analytics | Tecplot License | Tecplot Home Page | /opt/tecplot | B
Trilinos | Software Engineering | BSD and LGPL | Trilinos Home Page | /opt/trilinos | B
VASP (Vienna Ab initio Simulation Package) | Molecular Dynamics | University of Vienna | VASP Home Page | /opt/vasp | B
Velvet (Short read de novo assembler using de Bruijn graphs) | Bioinformatics | GNU GPL | Velvet Home Page | /opt/biotools/velvet | B
WEKA (Waikato Environment for Knowledge Analysis) | Data Mining | GNU GPL | WEKA Home Page | /opt/weka | B

System Software

Package | Topic Area | License Type | Package Home Page | User Install Location | Installed on: (L)ogin, (C)ompute, (B)oth
CentOS | Operating System | Open Source | CentOS Home Page | N/A | B
Ganglia | N/A | Open Source | Ganglia Home Page | /opt/ganglia | B
Gold | Allocation Manager | Open Source | Gold Home Page | /opt/gold | L
Hadoop | Distributed Processing Framework | Apache License | Hadoop Home Page | /opt/hadoop | B
HDF4 (Hierarchical Data Format) | Data Management | BSD License | HDF4 Home Page | /opt/hdf4 | B
HDF5 (Hierarchical Data Format) | Data Management | BSD License | HDF5 Home Page | /opt/hdf5 | B
IPM (Integrated Performance Modeling) | Profiling | Free | IPM Home Page | /opt/ipm | B
Lustre | Scalable File System | GNU GPL | Lustre Home Page | /opt/lustre | B
Maui | Workload Scheduler | GNU Lesser GPL | Maui Scheduler SourceForge Page | /opt/maui | L
Modules | Environment Variable Management | GNU GPL | Modules Home Page | /opt/modules | B
mvapich2 | Message Passing Interface | Open Source | MVAPICH2 Home Page | /opt/mvapich2/gnu/ib/bin or /opt/mvapich2/intel/ib/bin or /opt/mvapich2/pgi/ib/bin | B
myHadoop | System Administration | BSD | myHadoop Home Page | /opt/myhadoop | B
Nagios | System Monitor | GNU GPL | Nagios Home Page | /opt/nagios | Management Node Only
Nagios Plugins | System Monitor | GNU GPL | Nagios Plugins Home Page | /opt/nagios | Management Node Only
NSCA (Nagios Service Check Acceptor) | System Monitor | GNU GPL | NSCA Home Page | /opt/nagios | B
PAPI (Performance Application Programming Interface) | Performance Monitor | Unknown | PAPI Home Page | /opt/papi | B
TORQUE | Resource Manager | Open Source | TORQUE Home Page | /opt/torque | B

Compilers

Package | Topic Area | License Type | Package Home Page | User Install Location | Installed on: (L)ogin, (C)ompute, (B)oth
CMake | Cross Platform Makefile Generator | Open | CMake Home Page | /opt/cmake | B
GNU Compilers | C and Fortran Compilers | CentOS Core License | GNU Compiler Collection Home Page | /usr/bin/gcc and /usr/bin/gfortran | B
Intel Compilers | C and Fortran Compilers | Licensed (flexlm) | Intel Compilers Home Page | /opt/intel | B
Java | Compiler | Open | Java Home Page | /usr/bin/javac | B
PGI Compilers | C and Fortran Compilers | Licensed (flexlm) | PGI Compilers Home Page | /opt/pgi | B
UPC (Unified Parallel C) Compiler | Parallel Computing | BSD | UPC Home Page | /opt/upc | B

Requesting Additional Software

Users can install software in their home directories. If interest is shared with other users, requested installations can become part of the core software repository. Please submit new software requests to tscc-support@ucsd.edu.
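
Software installed under a home directory usually has to be added to the search path before it can be run by name. The sketch below assumes a hypothetical per-user install prefix of $HOME/local; both that location and the function name prepend_path are illustrative and are not TSCC conventions.

```shell
#!/bin/sh
# prepend_path DIR
# Put DIR at the front of PATH unless it is already listed there.
# (prepend_path is an illustrative name, not a TSCC-provided command.)
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Hypothetical per-user install prefix; adjust to wherever you installed.
prepend_path "$HOME/local/bin"
```

Placing such a call in a shell startup file, or at the top of a batch script, makes the personal installation visible on both login and compute nodes, since the home directory is shared.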
