TSCC User Guide

Technical Summary

The Triton Shared Computing Cluster (TSCC) provides central colocation and systems administration for your cluster nodes (“condo cluster”), as well as a hotel service for users with temporary or bursty high-performance computing (HPC) needs. TSCC provides three kinds of compute nodes: General Computing Nodes, GPU Nodes, and Petascale Data Analysis Facility (PDAF) Nodes.

TSCC is now open to new purchases with special vendor pricing. See the Purchase Info page.

Also see the TSCC Quick Start Guide.

Interactive Parallel Batch Jobs

Commands

In order to run interactive parallel batch jobs on TSCC, use the qsub -I command, which provides a login on the launch node as well as a PBS_NODEFILE file listing all nodes assigned to the interactive job.

Other qsub options can be used, such as those described by the man qsub command.

As with any job, the interactive job will wait in the queue until the specified number of nodes become available. Requesting fewer nodes and shorter wall clock times may reduce the wait time because the job can more easily backfill among larger jobs.

The showbf command shows the currently available backfill time slots, which can help you estimate when a submitted job will be able to run.

Use the exit command to end an interactive job.

Back to top

Examples

To run an interactive job with a wall clock limit of 30 minutes using two nodes and two processors per node:

$ qsub -I -V -l walltime=00:30:00 -l nodes=2:ppn=2
qsub: waiting for job 75.tscc-login.sdsc.edu to start
qsub: job 75.tscc-login.sdsc.edu ready
$ echo $PBS_NODEFILE
/opt/torque/aux/75.tscc-login.sdsc.edu
$ more /opt/torque/aux/75.tscc-login.sdsc.edu
compute-0-31
compute-0-31
compute-0-25
compute-0-25
$ mpirun -machinefile /opt/torque/aux/75.tscc-login.sdsc.edu -np 4 hostname
compute-0-25.local
compute-0-25.local
compute-0-31.local
compute-0-31.local

To run an interactive job with a wall clock limit of 30 minutes using two PDAF nodes and 32 processors per node:

qsub -I -q pdafm -l walltime=00:30:00 -l nodes=2:ppn=32

Environment Modules

Managing Your Shell Environment

TSCC uses the Environment Modules package to control user environment settings. Below is a brief discussion of its common usage. You can learn more at the Modules home page.

Overview

The Environment Modules package provides for dynamic modification of a shell environment. Module commands set, change, or delete environment variables, typically in support of a particular application. They also let the user choose between different versions of the same software or different combinations of related codes.

For example, if the pgi and openmpi_ib modules are loaded and the user compiles with mpif90, the code is compiled with the Portland Group Fortran 90 compiler and linked against the Open MPI libraries. By unloading the openmpi_ib module, loading the mvapich2_ib module, and recompiling with mpif90, the same Portland compiler is used but the code is linked against MVAPICH2 instead.
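
For example, continuing the scenario above with the pgi module loaded, the MPI stack can be switched before recompiling as follows (myprog.f90 is an assumed file name):

$ module unload openmpi_ib
$ module load mvapich2_ib
$ mpif90 -o myprog myprog.f90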

Default Modules

Several modules that determine the default TSCC environment are loaded at login time. These include the intel and openmpi_ib modules to set the default compiler environment. Default modules are noted below in the complete list.

Back to top

Useful Modules Commands

Here are some common module commands and their descriptions:

  • module list - List the modules that are currently loaded
  • module avail - List the modules that are available
  • module display <module_name> - Show the environment variables used by <module_name> and how they are affected
  • module unload <module_name> - Remove <module_name> from the environment
  • module load <module_name> - Load <module_name> into the environment
  • module switch <module_1_name> <module_2_name> - Replace <module_1_name> with <module_2_name> in the environment

Note that the order in which modules are loaded is significant. For example, if the pgi module is loaded and the intel module is subsequently loaded, the Intel compiler will be used. Also, some modules depend on others, so they may be loaded or unloaded as a consequence of another module command. For example, if the intel and mvapich2_ib modules are both loaded, running module unload intel will automatically unload mvapich2_ib. Subsequently issuing module load intel will not automatically reload mvapich2_ib.
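
As a brief illustration of this dependency behavior, assuming the intel and mvapich2_ib modules are currently loaded:

$ module unload intel        # mvapich2_ib is unloaded automatically as well
$ module load intel          # reloads intel only
$ module load mvapich2_ib    # the MPI module must be reloaded explicitly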

Complete documentation is available in the module(1) and modulefile(4) manpages.

Available Modules

The following modules are available on TSCC:

Modules located in /opt/modulefiles/mpi/.pgi:

  • mvapich2_ib
  • openmpi_ib

Modules located in /opt/modulefiles/applications/.pgi:

  • fftw/2.1.5
  • fftw
  • gsl
  • hdf4
  • hdf5
  • lapack
  • netcdf/3.6.2
  • netcdf
  • parmetis
  • scalapack
  • sprng
  • sundials
  • superlu
  • trilinos

Modules located in /opt/modulefiles/applications:

  • abyss
  • amber
  • bbcp
  • bbftp
  • beast
  • bioroll
  • biotools
  • cpmd
  • ddt
  • fsa
  • gamess
  • gaussian
  • idl
  • jags
  • matlab
  • nwchem
  • octave
  • R
  • rapidminer
  • scipy
  • siesta
  • tecplot
  • vasp/4.6
  • vasp
  • weka

Modules located in /opt/modulefiles/compilers:

  • cilk
  • cmake
  • gnu
  • intel
  • mono
  • pgi
  • upc

Modules located in /usr/share/Modules/modulefiles:

  • dot
  • module-git
  • module-info
  • modules
  • null
  • rocks-openmpi
  • rocks-openmpi_ib
  • use.own

Modules located in /etc/modulefiles:

  • openmpi-x86_64

Back to top

Compiling Codes

Serial and MPI Codes

This page describes how to compile codes for the Triton Shared Computing Cluster nodes, including the Petascale Data Analysis Facility (PDAF) nodes.

Porting Existing Codes

If you have an existing serial or MPI-based parallel application that already runs on a distributed-memory platform, copy your application source files to your $HOME directory or to your Data Oasis scratch area:

/oasis/tscc/scratch/<username>

where <username> is your TSCC login name.

Note that Data Oasis storage is not backed up and files stored on this system may be lost or destroyed without recourse to restore them. Long-term file storage should be maintained in your $HOME directory or project storage.
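
For example, to copy an application directory into your scratch area (my_app is an assumed directory name; substitute your own directory and TSCC login name):

$ cp -r ~/my_app /oasis/tscc/scratch/<username>/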

Compiling MPI Codes

MPI source code should be recompiled for TSCC using the following default compiler commands:

  • mpicc [options] file.c (C and C++ source code)
  • mpif77 [options] file.f (fixed-format Fortran 77 source code)
  • mpif90 [options] file.f90 (free format/dynamic memory allocation/object-oriented Fortran source code)

These wrapper commands use whichever compiler and MPI modules are currently loaded; with the default modules, that is the Intel compilers with openmpi_ib (see Compiling Parallel Codes below).

Other MPI stack/compiler combinations may be obtained by choosing the appropriate modules. The choices include:

InfiniBand
  • PGI compilers + openmpi_ib
  • PGI compilers + mvapich2_ib
  • Intel compilers + openmpi_ib
  • Intel compilers + mvapich2_ib
  • GNU compilers + openmpi_ib
  • GNU compilers + mvapich2_ib
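
For example, to switch from the default Intel compilers with openmpi_ib to the PGI compilers with mvapich2_ib before recompiling (myprog.c is an assumed file name):

$ module switch intel pgi
$ module switch openmpi_ib mvapich2_ib
$ mpicc -o myprog myprog.c
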
Compiling Serial Codes

Serial source code should be recompiled for the TSCC system with the following compiler commands:

Portland Group Compilers
  • pgcc [options] file.c (C and C++)
  • pgf77 [options] file.f (fixed form Fortran source code)
  • pgf90 [options] file.f90 (free format Fortran source code)
Intel Compilers
  • icc [options] file.c (C and C++)
  • ifort [options] file.f (fixed form Fortran source code)
  • ifort [options] file.f90 (free format Fortran source code)
Gnu Compilers
  • gcc [options] file.c (C and C++)
  • gfortran [options] file.f90 (free format Fortran source code)
Compatibility Options for Fortran

  • -Mcpp run the Fortran preprocessor on source files prior to compilation
  • -Dname[=value] specify name as a definition to use with conditional compilation directives or the Fortran preprocessor (-Mcpp)
  • -silent/-w/-Minform=severe suppress messages about use of non-standard Fortran
  • -byteswapio swap bytes from big-endian to little-endian or vice-versa for unformatted files
  • -i8 set size of INTEGER and LOGICAL variables to 8 bytes
  • -i4 set size of INTEGER and LOGICAL variables to 4 bytes
  • -i2 set size of INTEGER variables to 2 bytes
  • -r8 treat REAL and CMPLX types as REAL*8 and DCMPLX
  • -Msave save all local variables between calls (static allocation)
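
As a sketch, several of these options can be combined when rebuilding a legacy fixed-format code with the PGI compiler (myprog.f and the DEBUG macro are assumed names):

$ pgf77 -Mcpp -DDEBUG -byteswapio -r8 -o myprog myprog.f
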
Detecting Programming Errors
  • -g produce symbolic debug information in object file (implies -O0); required for debugging with DDT
  • -C/-Mbounds array bounds checking
  • -Ktrap=option[,option...] specifies behavior on floating point exceptions (used only when compiling the main program):
    • dnorm trap on denormalized (very, very small) operands
    • divz trap on divide by zero
    • fp trap on floating point exceptions
    • inexact trap on inexact result
    • inv trap on invalid operands
    • none (default) disables all traps
    • ovf trap on floating point overflow
    • unf trap on floating point underflow
  • -traceback add debug information
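
For example, to build a Fortran code with debugging symbols, bounds checking, and trapping of floating point exceptions (mycode.f90 is an assumed file name):

$ pgf90 -g -O0 -Mbounds -Ktrap=fp -traceback -o mycode mycode.f90
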
Optimization

Optimization levels of the Portland Group compilers are:

  • -O0 No optimization
  • -O1 (default) Task scheduling within extended basic blocks is performed. Some register allocation; no global optimizations.
  • -O2 all level 1 optimizations and global scalar optimizations (optimization over all blocks)
  • -O3/-O4 all level 1 and 2 optimizations as well as more aggressive optimizations
  • -Mflushz flush denormalized (very, very small) values to zero.

Alternatively, the Portland Compiler User Guide recommends the use of the -fast flag. This option is host dependent and typically has the following effects:

  • set the optimization level at -O2
  • unroll loops (-Munroll=c:1)
  • do not generate code to set up a stack frame pointer for every function (-Mnoframe)
  • enable loop redundancy elimination
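
For example, the single -fast flag can stand in for the explicit options listed above (myapp.c is an assumed file name):

$ pgcc -fast -o myapp myapp.c
$ pgcc -O2 -Munroll=c:1 -Mnoframe -o myapp myapp.c    # roughly equivalent explicit form
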
Numerical Libraries

The Portland Group compilers come with the Optimized ACML library (LAPACK/BLAS/FFT).

To link:

pgf90/pgf77 myprog.f -llapack -lblas

Intel has developed the Math Kernel Library (MKL), which contains many linear algebra, FFT, and other useful numerical routines, including:

  • Basic linear algebra subprograms (BLAS) with additional sparse routines
  • Fast Fourier Transforms (FFT) in 1 and 2 dimensions, complex and real
  • The linear algebra package, LAPACK
  • A C interface to BLAS
  • Vector Math Library (VML)
  • Vector Statistical Library (VSL)
  • Multi-dimensional Discrete Fourier Transforms (DFTs)

Documentation is available in HTML and PDF formats in ${MKL_ROOT}/../Documentation.

MKL Link Libraries

To link the MKL libraries, please refer to the Intel MKL Link Line Advisor Web page. This tool accepts inputs for several variables based on your environment and automatically generates a link line for you.

When using the output generated by this site, substitute the TSCC path of the Intel MKL for the value $MKLPATH in the generated script. That value is ${MKL_ROOT}/lib/em64t.
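
As an illustration only (the Link Line Advisor output for your configuration is authoritative), a sequential LP64 link line using this path might look similar to:

$ ifort myprog.f90 -L${MKL_ROOT}/lib/em64t -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm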

All third-party applications can be found in /opt, or view the complete description of TSCC software packages in the TSCC Quick Start Guide.

Compiling Parallel Codes

The default compiler/mpi stack combination is the Intel compiler (ifort, icc) and openmpi_ib. Other compilers and mpi variants may be accessed by loading the appropriate modules. The commands mpicc, mpiCC, mpif77 and mpif90 will access a particular compiler/mpi combination based on the module choices. Read about Modules to learn how TSCC manages compiler configurations.

A simple MPI C program is given below (mpi_c.c):

Back to top

Parallel Example with C

mpi_c.c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char *argv[])
{
   int myproc, numproc;

   MPI_Init(&argc, &argv);

   MPI_Comm_rank(MPI_COMM_WORLD, &myproc);
   MPI_Comm_size(MPI_COMM_WORLD, &numproc);

   if (myproc == 0)
        printf("NUMPROC %d\n", numproc);
   printf("Process %d running\n", myproc);

   MPI_Finalize();
   return 0;
}

This program can be compiled with the following command:

mpicc -o mpi_c mpi_c.c

The program prints the total number of MPI processes started, followed by a message from each process.
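
The executable can then be launched inside a batch or interactive job using the assigned node list described earlier, for example:

$ mpirun -machinefile $PBS_NODEFILE -np 4 ./mpi_c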

Back to top

Parallel Example with Fortran

mpi_f.f

Following is the Fortran equivalent of the above C program:

program mpi
implicit double precision (a-h,o-z)
include "mpif.h"

call mpi_init(ierror)

call mpi_comm_rank(MPI_COMM_WORLD,myproc,ierror)
call mpi_comm_size(MPI_COMM_WORLD,numproc,ierror)
if(myproc.eq.0) write(6,*) 'NUMPROC ',numproc
write(6,*) 'Process ',myproc,' running'
call mpi_finalize(ierror)
end

and can be compiled with this command:

mpif77 -o mpi_f mpi_f.f

More information about compiling can be found in the Compiling Codes section.

Back to top

Debugging with DDT

The following describes a basic setup of a DDT debugging session. The procedure diverges at steps 9 and 10, where you choose either to submit a job for debugging through the queue or to attach the debugger to an already running process.

Step-by-Step Tutorial

Follow this procedure to learn how to debug code on TSCC using DDT.

  1. Login to TSCC with X11 forwarding turned on (-X option to ssh command)
  2. Run this command to set up your environment:

    module load ddt

    This is equivalent to putting this line in your .bash_profile file:

    export DDT_LICENSE_FILE=/opt/ddt/License.client

    and then running this command to reload the current shell environment:

    . ~/.bash_profile
  3. Make sure your code is compiled with optimization turned off (-O0, that is, capital letter "O" followed by the number zero) and with symbol table information enabled (the -g option); a combined compile example appears after this list
  4. Run this command to start the DDT client:

    /opt/ddt/bin/ddt
  5. To start a debugging session, from the "Session" menu select "New Session" and then "Run" from the submenu.
    Screenshot of New Session->Run menu selection
  6. In the "Run" window, enter the full path to the executable in the "Application" field and any command line arguments in the "Arguments" field.
    Screenshot of Application Path selection
  7. In the "Run" window, click the "Change" button to bring up the Options window. On the "System Settings" tab, select the proper "MPI Implementation", or select "generic" if you encounter a problem while debugging. Select "none" to debug a serial or non-mpi code.
    Screenshot of Options->System tab edit
  8. If you are running an interactive debugging job or plan to attach to a running job, specify a hosts file in the "Attach hosts file" field and add host names to that file, one line per host.
  9. To submit a job for debugging through the queue:
    1. Still in the "Options" window, click on the "Job Submission" tab.
      Screenshot of Options->Job Submission tab edit
    2. Check "Submit job through queue or configure own 'mpirun' command".
    3. To use the predefined template for a PBS (TORQUE) job, click browse (the folder icon) and select the /opt/ddt/templates/pbs.qtf file.
    4. In the "Submit command" field, enter qsub. Leave the "Regexp for job id" field blank. In the "Cancel command" field, enter qdel. In the "Display command" field, enter qstat.
    5. Check the "NUM_NODES_TAG and PROCS_PER_NODE" box, and enter "8" in the "PROC_PER_NODE_TAG" field (standard for the batch queue).
    6. Click on the "Edit Queue Submission Parameters..." button to bring up the "Queue Submission Parameters" window.
      Screenshot of Options->Queue Submission Parameters edit
    7. Enter the wall clock time limit (in HH:MM:SS format) and the queue name. If you are using mpi, enter the full path to mpirun.
    8. Click "OK" to return to the Options window, then click OK in the Options window to return to the Run window.
    9. In the Run window, select the number of nodes you want to allocate, then select the "Submit" button.
      Screenshot of Run->Number of Nodes edit
    10. Wait for the jobs to start...
      Screenshot of submitted job wait dialog
    11. Enjoy!
      Screenshot of running debug session
  10. To attach to a running job:
    1. from the "Session" menu, select "New session" and then "Attach" from the submenu.
    2. In the field "Filter for process names containing" enter the name of the executable (just the name is sufficient, do not enter the full path).
    3. Based on the host names in the hosts file (see step 8), DDT will scan the specified hosts for processes with the given name and attempt to attach to them. If you have submitted a job to the queue, obtain the host list from (for example) checkjob <job number>.
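
As a minimal command-line sketch of steps 2 through 4 above (mycode.c is an assumed source file; substitute your own code and compiler wrapper):

$ module load ddt
$ mpicc -g -O0 -o mycode mycode.c
$ /opt/ddt/bin/ddt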

Note:

You can download the DDT User Guide from the Allinea web site.

Back to top

Bundling Serial Jobs

How to submit multiple serial jobs with a single script

Occasionally, a group of serial jobs needs to be run on TSCC. Rather than submitting each job individually, you can group them together and submit them using a single batch script procedure such as the one described below.

Overview

Although it's preferable to run parallel codes whenever possible, sometimes that is not cost-effective, or the tasks are simply not parallelizable. In that case, using a procedure like this can save time and effort by organizing multiple serial jobs into a single input file and submitting them all in one step.

The code for this process is given below in a very simple example that uses basic shell commands as the serial tasks. Your own serial tasks can easily be substituted for those commands, and a modified version of these scripts can be run from your own home directory.

Note that the /home filesystem on TSCC uses autofs. Under autofs, filesystems are not always visible to the ls command. If you cd to the /home/beta directory, for example, it will get mounted and become accessible. You can also reference it explicitly, e.g. ls /home/beta, to verify its availability. Autofs is used to minimize the number of mounts visible to active nodes. All users have their own filesystem for their home directory.
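
For example, using the /home/beta directory mentioned above:

$ ls /home          # beta may not be listed until its filesystem is mounted
$ cd /home/beta     # accessing the directory triggers the automount
$ ls /home/beta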

The components used in this example include:

  • submit.qsub (batch script to run to submit the job)
  • my_script.pl (perl script invoked by batch job)
  • jobs-list (input to perl script with names of serial jobs)
  • getid (executable to obtain the processor number, called by perl script)

Back to top

Example Batch File

The following is an example script that can be modified to suit users with similar needs. This file is named submit.qsub.

#!/bin/sh 
# 
#PBS -q hotel 
#PBS -m e 
#PBS -o outfile 
#PBS -e errfile 
#PBS -V
  
################################################################### 
### Update the below variables with correct values  

### Name your job here again 
#PBS -N jobname  

### Put your node count and time here 
#PBS -l nodes=1:ppn=5
#PBS -l walltime=00:10:00  

### Put your notification E-mail ID here 
#PBS -M username@some.domain  

### Set this to the working directory 
cd /home/beta/scripts/bundling  

####################################################################  

## Run my parallel job 
/opt/openmpi_pgimx/bin/mpirun -machinefile $PBS_NODEFILE -np 5 \ 
./my_script.pl jobs-list
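
Submit the bundled job with the usual qsub command:

$ qsub submit.qsub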

Back to top

Example Script and Input Files

The above batch script refers to this script file, named my_script.pl.

#!/usr/bin/perl 
# 
# This script executes a command from a list of files, 
# based on the current MPI id. 
# 
# Last modified: Mar/11/2005 
#  

# call getid to get the MPI id number  

($myid,$numprocs) = split(/\s+/,`./getid`); 
$file_id = $myid; 
$file_id++;   

# open file and execute appropriate command  
$file_to_use = $ARGV[0]; 
open (INPUT_FILE, $file_to_use) or &showhelp;  

for ($i = 1; $i <= $file_id; $i++) { 
    $buf = <INPUT_FILE>; 
}  

system("$buf");  

close INPUT_FILE;   

sub showhelp { 
        print "\nUsage: mpiscript.pl <filename>\n\n"; 
        print "<filename> should contain a list of executables,"; 
        print " one-per-line, including the path.\n\n"; 
}

The batch script refers to this input file, named jobs-list.

hostname; date 
hostname; ls 
hostname; uptime 
uptime 
uptime > line-5

Back to top

Sample Output

Running the above script writes output like this to the file outfile (named by the #PBS -o directive). Notice that the output lines from the different processes are interleaved rather than sequential, because they all write to the same file.

12:20:53 up 3 days, 5:41, 0 users, load average: 0.92, 1.00, 0.99 
compute-0-51.local 
compute-0-51.local 
Wed Aug 19 12:20:53 PDT 2009 
12:20:53 up 3 days, 5:41, 0 users, load average: 0.92, 1.00, 0.99 
compute-0-51.local 
getid getid.c jobs-list line-5 my_script.pl submit.qsub

The fifth line of the jobs-list file redirects its output to the file line-5, so that output does not appear in the shared output file. The line-5 file contains output like this:

12:20:53 up 3 days, 5:41, 0 users, load average: 0.92, 1.00, 0.99

Summary and Potential Other Uses

A modification of this procedure, available from TSCC User Support (member-only list), matches the number of scripts to the number of processors when more scripts are being run than processors are available.

It should also be possible to modify this script to run parallel jobs. Feel free to try it or ask support for help through the TSCC Discussion List.

Back to top