
Getting Started on DataStar

If you do not already have an allocation on DataStar, please visit SDSC Allocations to establish an account.

UNIX Login

1. Log on to DataStar via Secure Shell (SSH) using the hostname dslogin.sdsc.edu. For more information on using SSH, see the ssh man pages.

ssh username@dslogin.sdsc.edu
OR
ssh -l username dslogin.sdsc.edu

2. When prompted, enter your UNIX password (originally given in your account packet). You may change your password at any time using https://passive.sdsc.edu. (Note: It is strongly recommended for first-time users to reset their passwords before logging in.)

If you receive an error message when attempting to log in that refers to the version of the protocol, you may be running an old version of SSH. Please see your local administrator to upgrade to SSH version 2.

IMPORTANT NOTE: Please do NOT run compute-intensive programs on the login node, dslogin.sdsc.edu, and do not leave orphaned background processes. This node should be used only for editing, compiling, and submitting your programs, as well as transferring data. Submit batch scripts for production jobs (see Running: Batch Jobs - LoadLeveler). If you need interactive access (for example, to debug your program), log in to dspoe.sdsc.edu and run interactive jobs from there using poe or poe32 (see Running: Interactive). Serial jobs can be run by setting both the number of nodes and tasks per node to 1.

SSH Key Authentication (optional)

If you log into more than one SDSC or TeraGrid machine and prefer a single password, you may want to set up an RSA key pair with SSH2 authentication. Computational grid users may also use GSI-Enabled SSH.


From a UNIX machine:

If you already have a key pair on your local machine, skip to step 3.

1. From your local machine, generate a key pair consisting of a private key (~/.ssh/id_rsa) and a public key (~/.ssh/id_rsa.pub).

ssh-keygen
OR
ssh-keygen -t rsa  (When prompted, press <Enter> to accept the default name (~/.ssh/id_rsa). If you are generating a second key pair, rename this file.)

2. When prompted, enter a new passphrase. This passphrase is your single point of protection in the ssh key authentication method.

Entering a passphrase is required by SDSC security policy. Do not skip this step.
SSH will then encrypt your private key file and save it on your local machine. Do NOT copy your private key to any third-party machine. To connect from a remote host, use ssh-add on the host.

3. Send a copy of your public key to SDSC to be added to DataStar. Repeat this step for every new system you would like added to your list of known hosts.

ssh -i ~/.ssh/id_rsa -l username dslogin.sdsc.edu
OR
Submit a message to SDSC Consulting. Attach or copy your public key into the body of the message along with your contact phone number. SDSC will add it to the list of authorized keys (~/.ssh/authorized_keys) on your DataStar account.
Note: Make sure your permissions are set to:
drwx------ (700) or drwxr-xr-x (755) ~/.ssh
-rwx------ (700) ~/.ssh/id_rsa
-rwxr-xr-x (755) ~/.ssh/id_rsa.pub
-rwx------ (700) ~/.ssh/authorized_keys

4. Log on to DataStar using your SSH passphrase instead of your UNIX password. You may change your passphrase at any time using ssh-keygen -p or follow steps 1-3 to regenerate a new key pair.

For more information on any of these ssh commands, see their respective man pages. If ssh is failing, use ssh -v to find out why.
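
The permissions listed in the note under step 3 can be applied in one pass. The following is a minimal sketch, not an SDSC-provided tool: the function name set_ssh_perms and its optional directory argument are illustrative assumptions, and the modes are taken directly from that note.

```shell
# Apply the permission modes listed in the note under step 3 to an SSH
# directory. The function name and argument are illustrative assumptions.
set_ssh_perms() {
    dir="${1:-$HOME/.ssh}"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    chmod 700 "$dir"
    # Change only the files that exist: id_rsa normally lives on your local
    # machine, authorized_keys on the DataStar side.
    if [ -f "$dir/id_rsa" ]; then chmod 700 "$dir/id_rsa"; fi
    if [ -f "$dir/id_rsa.pub" ]; then chmod 755 "$dir/id_rsa.pub"; fi
    if [ -f "$dir/authorized_keys" ]; then chmod 700 "$dir/authorized_keys"; fi
    # Show the result so the modes can be checked by eye.
    ls -la "$dir"
}
```

Calling set_ssh_perms with no argument operates on ~/.ssh; passing a path lets you try it on a scratch copy first.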


From a PC or Mac machine:

Please visit the OpenSSH page to download an SSH client.

PuTTY is a simple, free SSH client. Click here to download the latest version. (Note: PuTTY uses its own key format, so use PuTTYgen in the same package to generate your key pair and export an OpenSSH version of your public key to SDSC.)

WinSCP is a scp(1) program for Windows, with PuTTY integrated into it.


Moving Files to DataStar

There are several ways to transfer files to DataStar. From UNIX systems, secure copy (scp) is recommended. The following is an example of an scp from a local machine to DataStar:

scp original_file username@dslogin.sdsc.edu:/to_dir/copied_file

Using SCP from Windows

To use secure copy from Windows platforms, download a copy of WinSCP (freeware). WinSCP screenshots before and after login are shown below.

[Figure: WinSCP login screen]

[Figure: WinSCP directories after successful login]

DataStar users can also move entire directory structures from one system to another via HPSS or SRB (see the archival storage section of this guide).

Transferring large files with bbftp

For transferring large data files, typically those bigger than 2GB in size, we have installed bbftp software on our DataStar and IA-64 Cluster machines. This uses multiple transfer streams and compression in order to speed up the transfer process, making transfers up to 10x faster than can be achieved using regular scp.

Installing

Linux/UNIX:
To use bbftp from within a Linux/UNIX environment, please download the latest version of the bbftp-client source code from the bbftp downloads site. Installation instructions can be found here. (You may need the assistance of your local system administrator to complete this step.)

Windows:
To use bbftp from within a Windows environment, a client compiled with Cygwin is available (the file bbftp-client-cygwin-3.2.0.zip from http://doc.in2p3.fr/bbftp/download.html). To use it, you need a working Cygwin installation with ssh installed on your Windows machine. (You may need the assistance of your local system administrator at this stage.)

Usage

Bbftp provides an FTP-like interface for transferring files. For example, in order to download the file output.dat in the directory /gpfs/ux12345/data/ from DataStar and put it in /home/username/output.dat (using 10 streams), one would execute the following command:

bbftp -s -e 'get /gpfs/ux12345/data/output.dat /home/username/output.dat' -u ux12345 -p 10 -V dslogin.sdsc.edu

In order to upload the file input.dat from your local machine to your gpfs directory on DataStar one would execute:

bbftp -s -e 'put input.dat /gpfs/ux12345/input.dat' -u ux12345 -p 10 -V dslogin.sdsc.edu

In both examples replace ux12345 with your SDSC username. More documentation can be found at the bbftp website, or by typing 'man bbftp' on the SDSC machines.
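
If you run these transfers often, it can help to generate the command line from its parts. The sketch below only prints a bbftp command in the form used in the examples above; it does not run bbftp. The function name bbftp_cmd and its argument order are assumptions made for illustration, and the only bbftp options it emits are the ones shown in this section.

```shell
# Print (do not run) a bbftp command in the form used in the examples above.
# Arguments: direction (get|put), source, destination, username, [streams].
# For "get" the source is the DataStar path; for "put" it is the local path.
# The function name and argument order are illustrative assumptions.
bbftp_cmd() {
    direction="$1"
    src="$2"
    dst="$3"
    user="$4"
    streams="${5:-10}"
    case "$direction" in
        get|put) ;;
        *) echo "direction must be get or put" >&2; return 1 ;;
    esac
    printf "bbftp -s -e '%s %s %s' -u %s -p %s -V dslogin.sdsc.edu\n" \
        "$direction" "$src" "$dst" "$user" "$streams"
}
```

For example, bbftp_cmd put input.dat /gpfs/ux12345/input.dat ux12345 prints the upload command with 10 streams; review the printed line before pasting it into your shell.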


Porting MPI Programs to DataStar

If you have an existing MPI-based parallel application program already running on a distributed-memory computer, such as IBM SP3, porting to the DataStar should be straightforward. 

  • Copy your application file(s) to the DataStar local disk space - the $HOME or /gpfs/username/ directory associated with your user account. If this directory does not exist, you will have to create it yourself. Please note that the work areas are not backed up. Files for long-term storage should be on either the user's $HOME directory or on HPSS and/or SRB. For information on how to move file(s) to DataStar, see the previous section "Moving Files to DataStar".
  • Source code should be recompiled for the DataStar's POWER4 architecture and relinked with the appropriate libraries, e.g., MPI. All compiler switches should be set for POWER4 values ( mpcc/mpCC/mpxlf/mpxlf90 are all wrappers which are set up for support of MPI-based parallel applications). Please see Compiling: Message Passing Programs for a list of compilers and suggested options.

If your application used numerical libraries, such as LAPACK, or the IBM ESSL and/or MASS numerical libraries, you must link these explicitly:

mpcc/mpCC/mpxlf/mpxlf90_r options source_file -L/usr/local/apps/MASS -lmass -lessl

If your application has a single-CPU version, it should be compiled and linked to create a new binary. Please see Compiling: Serial Programs for a list of compilers and suggested options.

  • Check program output to ensure correctness 
  • Tune Single-CPU performance. The following references might be useful: Power4 Introduction and Tuning Guide, SG24-5155-00, and a tutorial (Powerpoint) on porting to Power4 processors.
  • There are examples of the use of some compiler options on dslogin.sdsc.edu at /usr/local/apps/examples.


System Configuration

DataStar consists of nodes of two types: 8-way p655+ and 32-way p690+ nodes. The node distribution for the batch queues is tabulated below.

The use of the 8-way p655 nodes is exclusive: users will have exclusive access to any requested 8-way node(s) during job execution, and they will be charged for using 8 cpus per node whether they use 8 cpus per node or not (see Accounting).

The use of the 32-way p690 nodes is shared among users, but subject to CPU and memory constraints specified in the batch script. Performance of your job on these nodes may vary depending on competition from other users' jobs for memory and I/O bandwidth. The nodes are suitable for both shared-memory (e.g., OpenMP or Pthreads) and message-passing (e.g., MPI) programming models, as well as a mixture of the two.

Node Type   # Nodes   Memory per Node   CPUs per Node   CPU Speed   CPU Peak Performance
655+        169       16GB              8               1.5 GHz     6.0 GFlops
655+        96        32GB              8               1.7 GHz     6.8 GFlops
690+        4         128GB             32              1.7 GHz     6.8 GFlops
690+        2         256GB             32              1.7 GHz     6.8 GFlops
690+        2         64GB              32              1.3 GHz     5.2 GFlops


Acceptable Use of DataStar Nodes

DataStar has one login node, dslogin.sdsc.edu (p690, 32-way, 64GB). This node should be used only for auxiliary tasks such as transferring and editing your files; compiling and submitting batch jobs. You may launch batch jobs from the dslogin node to the many p655 and p690 batch nodes (see Running Batch Jobs).

DataStar also has an interactive cluster with p655, 8-way, 16GB nodes. This cluster consists of one interactive login node, dspoe.sdsc.edu , two interactive nodes and four express nodes. Interactive and express runs should be done only by logging into the interactive login node, dspoe (see Running: Interactive Jobs).

For direct interactive runs like debugging, visualization, and post processing data please use dsdirect.sdsc.edu which has 128 GB of memory and 32 CPUs. For jobs requiring large memory please verify that dsdirect.sdsc.edu has adequate free memory before running jobs as the node is a common resource for all users.

Node Name: dslogin
Node Type: p690, 64GB
Node Function: login node
Acceptable Tasks:
  • Editing source files.
  • Submitting jobs to the batch queues.
  • Simple 'make' and compile jobs.
  • Archiving using hsi.
  • Data transfer using scp/sftp/gridftp.
  • NO ssh/rlogin into other nodes.
  • NO resource-intensive jobs.
Max Memory Usage: < 1%
Max CPU Usage: 15 minutes

Node Name: dspoe
Node Type: p655, 16 GB
Node Function: interactive login node
Acceptable Tasks:
  • Debugging, visualization, etc.
  • Submitting jobs to the interactive queues.
  • Large 'make' jobs such as packages and compiler verification suites.
Max Memory Usage: < 5%
Max CPU Usage: 2 hours

Node Name: dsdirect
Node Type: 256 GB
Node Function: interactive
Acceptable Tasks:
  • Debugging.
  • Visualization.
  • Post processing data.

    ATTENTION: The dslogin node limits compute time and memory usage per user. Please do not run compute-intensive or memory-intensive programs on the dslogin node. Such processes may be killed without warning when limits are exceeded.

    Tasks not allowed on ANY node:

    • NO background detached jobs (e.g., gvim)
    • NO Touch scripts

    Note:

    • Home directory quota - 1GB

    How the Nodes are Connected

    The nodes are connected by the Federation interconnect. Each node is directly connected to the GPFS (IBM's parallel file system) through a Fibre Channel link. The observed GPFS I/O performance is about 2.1 GB/s for reads and 1 GB/s for writes.

    Latency and Bandwidth Comparison
                  MPI Latency (usec)   Bandwidth (MB/s)
    Intra-node    2.2                  2404
    Inter-node    6.2                  1428

    The Power4 processors are superscalar (capable of simultaneous execution of multiple instructions), pipelined 64-bit RISC chips with two Floating-Point Units, two Fixed-Point (Integer) Units, a Branch Execution Unit, and a Condition Register Unit. These processors feature out-of-order execution capabilities. They are capable of executing up to 8 instructions per clock cycle and up to four floating-point operations (two fused multiply-adds) per cycle. Each Power4 CPU has a two-way L1 (32 KB) cache and a four-way set-associative L2 (0.75 MB) cache. There is also an 8-way L3 cache on each node (16 MB per processor). The following references might be useful: IBM's Power4 Processor Introduction and Tuning Guide, SG24-5155-00, Porting to Power4 processors (PowerPoint), and Performance Analysis (PowerPoint).

    To see how DataStar is connected to other SDSC Resources, click here.


    Software Environment

    The operating system (OS) on DataStar is currently AIX 5.2, which is IBM's proprietary 64-bit version of the UNIX OS.

    IBM's Parallel Environment (PE) provides the software "glue" that allows users to develop applications which utilize the machine hardware and operating system as a shared resource for effective parallel computing. PE includes such useful tools as the program debugger, program profiler and C/C++ and Fortran90/77 compilers and numerical libraries optimized for the DataStar architecture.


    File Storage: Disk

    WARNING: It is your responsibility to back up critical data! Please maintain your own copy of important data stored on SDSC file systems.

    Each user has several areas of disk space for storing files for immediate use on the IBM. These areas may have size or time limits for how long disk files may stay resident. To request increased disk capabilities (which may mean both larger quotas in home directories and periods without purging in /gpfs) on DataStar please send justifications to consult@sdsc.edu.

    Filesystem Characteristics
    /users /home directories are mounted on NFS (network file system). They are best used for storing source files and scripts. They should not be used for storing large files or output from batch programs. 1 GB quota per user will be enforced soon. For your convenience, the environment variable $HOME is defined in your .login and .cshrc files. Regular backups are also performed.
    /gpfs /gpfs is the GPFS area (IBM's general parallel file system). It is recommended for use with most jobs, especially where large amounts of data are produced or read, where I/O performance is important, or where parallel I/O is performed. To use GPFS, create a subdirectory with your username, such as /gpfs/ux344455. This space should only be used for staging your data for calculations. Such data is considered temporary and is NOT backed up. Files untouched for more than 4 days will be purged. Users are responsible for moving their important long-term data to archival storage.

    If you plan to generate multiple TB of data in a short period of time or you have particular storage needs, contact SDSC Consulting so that special arrangements can be made.

    /scratch /scratch is a local file system on each compute node of approximately 64 GB. It provides the fastest I/O and is best used for storing small data within the timeframe of the job's execution. This space is cleaned up after each job, therefore, it should not be used for anything other than temporary storage. Please do not use /scratch from the login nodes (see System Configuration).

    Note: The /home directory is mainly for account configuration files, and the /gpfs directory is for working files. Safeguard important files from being purged by moving them to archival storage. Do NOT store files in /tmp.

    WARNING: It is your responsibility to archive important files to prevent data loss due to SDSC's automated file purging. Purge times are based on date of last access. After your files are purged, SDSC has no way of retrieving them.
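
Because purging is based on last access time, you can list the files that are getting close to the purge threshold and archive them first. The following is a minimal sketch using the standard find utility; the function name list_purge_candidates and the default of 3 days are illustrative assumptions, not an SDSC tool or policy value.

```shell
# List regular files under DIR that have not been accessed for at least
# MIN_DAYS days (default 3). Name and default are illustrative assumptions.
list_purge_candidates() {
    dir="$1"
    min_days="${2:-3}"
    if [ "$min_days" -lt 1 ]; then
        echo "min_days must be at least 1" >&2
        return 2
    fi
    # "-atime +N" matches files whose age in whole 24-hour periods is
    # greater than N, i.e. at least N+1 days, so subtract one here.
    find "$dir" -type f -atime "+$((min_days - 1))" -print
}
```

For example, list_purge_candidates /gpfs/username 3 prints files last accessed three or more days ago, which leaves some margin before the 4-day purge so they can be moved to archival storage.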


    File Storage: Archival

    WARNING: It is your responsibility to back up critical data! Please maintain your own copy of important data stored in SDSC archival systems. Because of the enormous amount of data involved, SDSC does not back up files in archival storage. Although the SDSC storage systems are very reliable, data can be lost or damaged due to media failures, system software bugs, hardware failures, and user mistakes.

    Archival storage for SDSC production systems is provided through SDSC's High Performance Storage System (HPSS). SDSC's Storage Resource Broker (SRB), a data management tool, may also be used to store large data sets across distributed, heterogeneous storage systems.

    The recommended interface to HPSS is HSI. HSI supports wildcards for local and HPSS pathname pattern matching and provides recursion for many commands, including the ability to store, retrieve, and list entire directory trees, or change permissions on entire trees. It may be used interactively or in batch mode and may be included in UNIX pipes. HSI is also especially useful for SDSC users with accounts on multiple platforms, as it provides an interface to the SDSC HPSS system from most SDSC machines.

    The following example illustrates a directory move from a local machine to DataStar via HPSS:

    1. create a copy of the local directory with tar (the time to do this depends upon the sizes and number of files, etc.):

      tar -cf your_tar_file your_directory

    2. compress tar file with gzip (this step may not be necessary if your tar file is small). This creates a tar file with the name your_tar_file.gz:

      gzip your_tar_file

    3. access HPSS from local machine with HSI (client binaries are available for download at the HSI site):

      hsi

    4. store the compressed file in HPSS:

      put your_tar_file.gz

    5. login to DataStar and access HPSS with HSI:

      ssh dslogin.sdsc.edu
      hsi

    6. download compressed tar file from HPSS:

      get your_tar_file.gz

    7. uncompress tar file:

      gunzip your_tar_file.gz

    8. move tar file to the desired location on DataStar and untar:

      tar -xf your_tar_file
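
The tar and gzip steps at either end of this procedure can each be done in a single pass through a pipe, which avoids keeping an uncompressed tar file on disk. This is a sketch only: the function names pack_dir and unpack_archive are illustrative assumptions, and the HSI put/get steps in the middle are unchanged.

```shell
# Bundle a directory into a gzip-compressed tar file in one pass
# (the tar and gzip steps combined). Function names are illustrative.
pack_dir() {
    src_dir="$1"
    archive="$2"
    # Run tar from the parent directory so the archive stores a relative path.
    (cd "$(dirname "$src_dir")" && tar -cf - "$(basename "$src_dir")") \
        | gzip > "$archive"
}

# Uncompress and extract in one pass (the gunzip and untar steps combined).
unpack_archive() {
    archive="$1"
    dest="$2"
    mkdir -p "$dest"
    gunzip -c "$archive" | (cd "$dest" && tar -xf -)
}
```

For example, pack_dir your_directory your_tar_file.gz on the local machine, followed by unpack_archive your_tar_file.gz /gpfs/username on DataStar after the file has been retrieved from HPSS.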

    For more details on using HPSS, refer to the HPSS User Guide. More information about SRB is available at the SRB Home Page. In addition, Lawrence Livermore National Laboratory has developed the HPSS Tape Archiver (HTAR). HTAR is a file-bundling and storage utility designed to efficiently transfer a very large number of (related) files in the form of a manageable archive or library such as HPSS. For more information, please visit the HTAR Reference Manual.


    Common User Environment

    SDSC provides actively managed .cshrc, .login, and .profile configuration files. All of these files execute commands stored in global configuration files. SDSC staff members monitor and update these global configuration files to provide a convenient working environment. To copy the default .login, .cshrc, and .profile files, issue the following command:

    /usr/local/apps/shellrc/bin/COPYDEFAULTS

    Add your favorite configuration commands at the end of your .cshrc, .login, or .profile files. Alternatively, add configuration commands to files named .cshrc_user, .login_user, or .profile_user. The latter method simplifies restoring or updating your environment for the latest configuration of the system, since your personalized configuration commands will not be changed.


    Accounting

    How Accounts Are Charged

    The following is the algorithm for charging Service Units (SUs) against a user's allocation, where P is the priority factor determined by the job class (normal = 1; high = 2; express = 1.8). For more information on the job class/batch queues available, see the section Running Batch Jobs: Batch Queues on DataStar.

    p655 (8-way) nodes are charged as follows:

    SUs = P x Wallclock_Hours x Num_Nodes x 8

    p690 (32-way) nodes are charged based on the memory and number of CPUs requested by a job:

    SUs = P x 32 x Wallclock_Hours x MAX(Np/32, M/MMax)

    where wall clock time is in hours, MAX is the maximum of the two numbers, Np is the number of processors used by the job, and M is the memory requested by the job. MMax is the maximum memory available on the node.
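
To estimate a charge before submitting a job, the two batch formulas above can be evaluated with a small helper. This is a sketch for estimation only, not the accounting system's own code; the function names p655_sus and p690_sus and their argument order are illustrative assumptions.

```shell
# Estimated charge on the 8-way p655 nodes:
#   SUs = P x Wallclock_Hours x Num_Nodes x 8
# Arguments: P, wallclock hours, number of nodes.
p655_sus() {
    awk -v p="$1" -v hours="$2" -v nodes="$3" \
        'BEGIN { printf "%.2f\n", p * hours * nodes * 8 }'
}

# Estimated charge on the 32-way p690 nodes:
#   SUs = P x 32 x Wallclock_Hours x MAX(Np/32, M/MMax)
# Arguments: P, wallclock hours, Np, M, MMax (M and MMax in the same unit).
p690_sus() {
    awk -v p="$1" -v hours="$2" -v np="$3" -v mem="$4" -v mmax="$5" \
        'BEGIN {
            cpu_frac = np / 32
            mem_frac = mem / mmax
            # Charge for whichever share of the node is larger.
            frac = (cpu_frac > mem_frac) ? cpu_frac : mem_frac
            printf "%.2f\n", p * 32 * hours * frac
        }'
}
```

For example, p655_sus 1 2 4 (normal class, 2 hours, 4 nodes) prints 64.00, and p690_sus 1 10 8 64 128 (normal class, 10 hours, 8 CPUs, 64 GB requested on a 128 GB node) prints 160.00, because the memory share (1/2) exceeds the CPU share (1/4).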

    Interactive jobs on DataStar are charged at the same rate as batch jobs, but based on CPU_Usage instead of Wallclock_Hours for each job, with P=1. Users with more than one account may specify which account they want charged by setting an environment variable:

    setenv POE_ACCOUNT_NO AAA123
    OR
    export POE_ACCOUNT_NO=AAA123

    As a reminder, the interactive nodes should only be used for the purposes of testing and debugging (see Running Interactive Jobs).

    On dsdirect.sdsc.edu users will be charged based on the amount of memory and the number of CPUs their jobs use. The charging is done as follows:

    Number of SUs charged = P x 32 x WH x MAX(Np/32, M/Mmax)

    where P = 2, WH is the wall clock time in hours, MAX is the maximum of the two numbers, Np is the number of processors used by the job, and M is the memory used by the job. Mmax is the maximum memory available on the node and is set to 256GB.
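
As a worked example of the dsdirect formula, the sketch below fixes P = 2 and Mmax = 256 GB as stated above and takes the remaining quantities as arguments. The function name dsdirect_sus and its argument order are illustrative assumptions; treat the result as an estimate only.

```shell
# Estimated charge on dsdirect:
#   SUs = P x 32 x WH x MAX(Np/32, M/Mmax), with P = 2 and Mmax = 256 GB.
# Arguments: wallclock hours, CPUs used, memory used in GB.
dsdirect_sus() {
    awk -v hours="$1" -v np="$2" -v mem_gb="$3" \
        'BEGIN {
            p = 2
            mmax_gb = 256
            cpu_frac = np / 32
            mem_frac = mem_gb / mmax_gb
            frac = (cpu_frac > mem_frac) ? cpu_frac : mem_frac
            printf "%.2f\n", p * 32 * hours * frac
        }'
}
```

For example, dsdirect_sus 3 4 128 prints 96.00: the job uses 1/8 of the CPUs but 1/2 of the memory, so the memory share determines the charge (2 x 32 x 3 x 0.5).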

    How To Check Your Remaining Allocation

    Users can check their remaining allocation using the reslist command (see the example below). Complete information on the usage and options of reslist may be found by typing reslist --help on dslogin.sdsc.edu.

    ds001 % reslist
    Querying database, this may take several seconds ...
    SSTR: All parts of DataStar
                                        SU Hours   SU Hours
      Name       UID  ACID  ACC  PCTG  ALLOCATED       USED  USER
      jdoe     88888   300    U   100     500000    5000.00  DOE, JOHN
    use300             300                500000  450000.00

    To determine the allocation usage for a single user:

    % reslist -u username

    To determine the allocation usage for all users under a given account:

    % reslist -a grp000

    To determine the allocation usage for jobs run within a particular time period:

    % reslist -j -u username -a grp000 --begindate=mm-dd-yyyy --enddate=mm-dd-yyyy

    Reslist will report account information for several CPUs, as shown below.

    CPU      System                     Characteristics
    SSTR     SDSC IBM DataStar Power4   All types of nodes, running in any job class (queue).
    STG690   SDSC TG IBM P690           This allocation is for TeraGrid nodes, and may only be used for running batch jobs in TG* job classes (queues).
    SNP690   SDSC IBM P690              This allocation is for SDSC p690 nodes, and may only be used for running batch jobs under classes normal32, normalL and similarly named express/high classes (queues).
    SNP655   SDSC P655                  This allocation is for SDSC p655 nodes, and may only be used for running batch jobs under classes normal, high and express.
    SRT690   SDSC Roaming TG P690       This is similar to the STG690 CPU, in that it is for TeraGrid users. However, this is a roaming allocation rather than a DataStar-specific one. You may use the TG* job classes (queues).
    SSTRNP   SDSC DataStar Nodes        This is for projects that have a single allocation on all the SDSC nodes. You may use any of the job classes for the other SDSC allocations (SNP690 and SNP655).

    How To Control Allocations (for PIs only)

    Principal investigators (PIs) can control access to their allocations using the resalloc command. Resalloc allows a PI to set a specified amount of allocation for each member of their group (see the example below). Complete information on the usage and options of resalloc may be found by typing resalloc --help on dslogin.sdsc.edu.

    ds001% resalloc --username=jdoe --account=use300 --cpu=SSTR --percentage=10
    Updating database, this may take several seconds ...
                  
    User: jdoe
    Account: use300
    Platform: SSTR
    Percentage: 10%
                
    Update Successful.


    How To Add Users to an Account (for PIs only)

    To request additional login names, the principal investigator can add users to his or her allocation online. For non-TeraGrid allocations, add users via the User Authorization Form on this site. For TeraGrid allocations, use the TeraGrid User Portal.

    The additional users will receive a packet via conventional mail in about one week. Please also ask additional users to review Using Your Account and Managing Your Account for usage guidelines and security.


