Running MATLAB Jobs

A Tutorial Guide: MATLAB Access for Comet and Gordon

Both Comet and Gordon support MATLAB, a development environment from The MathWorks. Here you will find instructions and examples for running MATLAB jobs.

Step 1: Check that you are in the matlab group by running the "groups" command.

For example:

$ groups
use124 matlab-users

If you are not in the matlab group, please send an email to help@xsede.org and ask to be added.
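
The check in Step 1 can also be scripted. The following is a minimal sketch, assuming the group is named matlab-users as in the sample output above (the name on your account may differ). The helper name in_group is hypothetical and not part of any SDSC tooling.

```shell
#!/bin/bash
# Hypothetical helper: succeed if group $1 appears in the
# space-separated group list $2.
in_group() {
    local wanted="$1" list="$2" g
    for g in $list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Assumption: the MATLAB group is called "matlab-users", as in the
# sample "groups" output. "id -nG" prints the current user's groups.
if in_group "matlab-users" "$(id -nG)"; then
    echo "You are in the MATLAB group."
else
    echo "Not in the MATLAB group; email help@xsede.org to be added."
fi
```

The comparison is done on whole group names, so a group such as use124 never matches by accident.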

Step 2: Run MATLAB

MATLAB must be run on the compute nodes of Comet or Gordon. There are three options, described below. Please do *NOT* run MATLAB on the login nodes.

I. Interactive job with non-GUI option

(a) Obtain an interactive compute node.
COMET: (The following gets you one core on one node in the debug partition)
$ srun --partition=debug --pty --nodes=1 --ntasks-per-node=1 -t 00:30:00 --wait=0 --export=ALL /bin/bash

GORDON: (The following gets you one node in the normal queue)
$ qsub -I -lnodes=1:ppn=16:native,walltime=00:30:00 -q normal

(b) Once the command in (a) runs you will be placed on a compute node. You can load the matlab module and run matlab as follows:

module load matlab
matlab -nodisplay

Sample session Comet:

[user@comet-ln3 ~]$ srun --partition=debug --pty --nodes=1 --ntasks-per-node=1 -t 00:30:00 --wait=0 --export=ALL /bin/bash
[user@comet-14-01 ~]$ module load matlab
[user@comet-14-01 ~]$ matlab -nodisplay                            
< M A T L A B (R) >                  
Copyright 1984-2016 The MathWorks, Inc.                   
R2016a (9.0.0.370719) 64-bit (glnxa64)                               
April 13, 2016

To get started, type one of these: helpwin, helpdesk, or demo. For product information, visit www.mathworks.com.        

Academic License

>> A = [1 3 0; 2 4 -1; 4 9 -1]

A =
     1     3     0
     2     4    -1
     4     9    -1
>>
>>
>> B=A'
B =
     1     2     4
     3     4     9
     0    -1    -1
>> exit
[user@comet-14-01 ~]$ exit
exit
[user@comet-ln3 ~]$

II. Interactive job with GUI option

(a) Obtain an interactive compute node and ask for all the cores on it (24 on Comet).

COMET: (The following gets you all 24 cores on one node in the debug partition)
$ srun --partition=debug --pty --nodes=1 --ntasks-per-node=24 -t 00:30:00 --wait=0 --export=ALL /bin/bash

GORDON: (The following gets you one node in the normal queue)
$ qsub -I -lnodes=1:ppn=16:native,walltime=00:30:00 -q normal

(b) Once the command in (a) runs you will be placed on a compute node. Keep this window open, and from a different terminal window ssh directly to the compute node. For example, if step (a) put you on comet-29-01, you can do:

$ ssh -X username@comet-29-01.sdsc.edu

At this point, follow the same steps as in case I to load the matlab module, then launch matlab without any options:

module load matlab
matlab

This will launch the GUI window.

III. Run Matlab via a batch script

COMET: An example SLURM batch script is available in /share/apps/examples/MATLAB. The script requests 1 core in the "shared" partition for 1 hour and 30 minutes.
The script (called matlab.sb) is as follows:
#!/bin/bash
#SBATCH --job-name="matlab"
#SBATCH --output="matlab.%j.%N.out"
#SBATCH --partition=shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --export=ALL
#SBATCH -t 01:30:00
# This job runs on 1 node with 1 core in the shared partition.
module load matlab
matlab -nodisplay -nosplash < UsingTimeSeriestoPredictEquityReturnExample.m

The script matlab.sb can be submitted as follows:

sbatch matlab.sb

This assumes that UsingTimeSeriestoPredictEquityReturnExample.m is in your submit directory; it can be copied from the examples directory.

Once the job runs (use the squeue command to check on its status), the output file will be created in the submit directory.
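
In the --output pattern of matlab.sb, SLURM replaces %j with the job ID and %N with the short hostname of the first allocated node. The sketch below shows how the resulting file name is formed. The function name and the sample job ID and node name are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: expand the pattern "matlab.%j.%N.out" from matlab.sb.
#   $1 = job ID (as reported by sbatch)
#   $2 = short hostname of the first node of the job
matlab_output_name() {
    printf 'matlab.%s.%s.out\n' "$1" "$2"
}

# Example with made-up values: job 1234567 that ran on comet-14-01.
matlab_output_name 1234567 comet-14-01
```

After the job finishes, listing matlab.*.out in the submit directory shows the files that follow this pattern.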

Running Parallel MATLAB Jobs

A Tutorial Guide for Comet and Gordon

Both Comet and Gordon support Parallel MATLAB, a development environment from The MathWorks. Here you will find instructions and examples for running jobs with the MATLAB Parallel Computing Toolbox on a desktop and submitting them to the MATLAB Distributed Computing Server (MDCS).

Note: MATLAB configurations are very similar on both Comet and Gordon. Where this documentation refers to Gordon, you may substitute Comet without loss of accuracy.

Important: Use of MATLAB on Comet and Gordon is limited to users from degree-granting educational institutions. To use MATLAB on Comet or Gordon, request to have your account added to the matlab UNIX group by sending an email to help@xsede.org or by submitting an XSEDE Help Desk ticket.

How to Use MATLAB on Comet and Gordon

MATLAB Architecture

Parallel MATLAB consists of two parts: the Parallel Computing Toolbox (PCT) and the MATLAB Distributed Computing Server (MDCS).

Features of the Parallel Computing Toolbox

The PCT is a module that runs on the MATLAB client. It contains a number of useful capabilities, including:

  1. Parallel for-Loops (parfor)
  2. Distributed Arrays (arrays spread among several worker processes)
  3. spmd blocks (single program, multiple data), which execute code in a manner similar to MPI runs. MATLAB code placed within an spmd block executes simultaneously on the pool of MATLAB processes that the user has allocated, and each process can be identified by its labindex variable
  4. The ability to define several independent tasks to run simultaneously within a single, embarrassingly parallel MATLAB job
  5. Parallel integration of many toolboxes, e.g. bioinformatics, genetic algorithms and optimization
  6. Many general parallel functions

Client-Server Environment

Using the MDCS allows users to run multiprocessor jobs on Comet and Gordon via the batch queue system. The MDCS may be accessed from a user desktop with the PCT installed.

The PCT will automatically submit jobs to the MDCS (see below for details of this procedure).

The versions of MATLAB currently installed on SDSC HPC systems are:

Resource   Version   Location
Comet      2015a     /opt/matlab/2015a
Gordon     2014b     /opt/matlab/2014b
TSCC       2015a     /opt/matlab/2015a

Other versions may be available on some platforms. Contact SDSC Support with questions.

NOTE: The MATLAB version on the desktop must match the MDCS version.

System Requirements

To use the desktop PCT with Comet or Gordon you must have:

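  • MATLAB installed on the desktop, in a version that matches the MDCS version on the cluster (the usage example in this guide refers to 2013b)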
  • An ssh private/public key pair generated on the desktop with the public key installed on Gordon to enable passwordless connections

Linux and Mac OS X have built in key generating programs as part of their default system environments, but Windows does not. One option for Windows users is to download PuTTY and use it to generate the key pairs. See the section Configuring Secure Shell on a Desktop for Use with MATLAB Parallel Computing Toolbox, which describes how to generate key pairs on your desktop and install them on Gordon.

Usage Examples

See some examples of MATLAB usage to better understand the process.

Configuring Secure Shell on a Desktop for Use with MATLAB Parallel Computing Toolbox

MATLAB has unique setup requirements for users with a Parallel Computing Toolbox on their Windows desktop, compared with other desktop platforms. A secure shell is required to access the Comet or Gordon MATLAB server, and this must be installed if not already available. For Linux and Mac OS X, the default system shell will suffice.

When using the Comet- or Gordon-based toolbox and client, this desktop configuration is not required.

Configuring MATLAB for Use with a Desktop Client

To access the Distributed Computing Server on Comet or Gordon from a desktop system (in other words, to use the Parallel Computing Toolbox with a client not installed on Comet or Gordon), you must have a secure shell installed on the desktop. The setup process is different for Windows and for Linux/Mac.

Windows Procedure

  1. Download and install PuTTY for Windows. You can download it from the PuTTY Download Page.
  2. Set the Windows PATH environment variable to the directory where the PuTTY executables are installed:
    1. Using the secondary mouse button, click the My Computer icon on the Desktop or in Windows Explorer
    2. Select Properties from the Context Menu; this brings up the System Properties dialog
    3. Click the Advanced System Settings tab (Vista, Windows 7, or Windows 8) or the Advanced tab (XP)
    4. Click the Environment Variables button
    5. In the User Variables field, click on PATH and select Edit
    6. Add the PuTTY path to the existing path information
    [Image: Windows environment variables dialog]
  3. Generate an ssh key with puttygen
    1. Click the Start button, then click the Run button
    2. In the Run window, type puttygen and click OK
    3. Click on Generate and move the cursor around the Key field
    4. Click on the Conversions menu item and select Export OpenSSH Key
    5. When prompted if you want to save it without a password, click Yes
    6. This is the private key file that is referred to in the example run
    7. Copy the public key to your .ssh directory on Comet or Gordon (this is the same file system for both machines)
    8. On Comet or Gordon, convert the public key to OpenSSH format by running the following in your .ssh directory:
      ssh-keygen -i -f public_key.in > public_key.out
    9. Add the converted public key to your ~/.ssh/authorized_keys file
    [Image: keygen dialog]
    [Image: PuTTY configuration session dialog]
  4. Create a PuTTY profile for Comet or Gordon
    1. Click the Start button, then click the Run button
    2. In the Run window, type putty and click OK
    3. Enter <login>@comet.sdsc.edu or <login>@gordon.sdsc.edu in the Host Name field where <login> is your Comet or Gordon login name
    4. Expand the Category tree on the left by clicking on the plus sign (+) next to SSH
    5. Click on Auth
    6. Click the Browse button and select the private key that you created
    7. Click on Session and enter <login>@comet.sdsc.edu or <login>@gordon-login.sdsc.edu in the Saved Sessions field
    8. Click Save
    [Image: PuTTY configuration auth dialog]

Linux and Mac OS X Procedure

UNIX systems (Linux, Mac OS X, etc.) have native implementations of ssh and scp, so when using a desktop MATLAB client, the only requirements are:

  • The public key from the MATLAB client machine must be installed in your Comet or Gordon account's ~/.ssh/authorized_keys file
  • You must have a public/private keypair established on Comet or Gordon
    • You are prompted for this keypair the first time you log onto Comet or Gordon
    • It must not require a password
  • The line in the public key file on Comet or Gordon (id_rsa.pub) must also be present in the authorized_keys file.

If you do not have an ssh keypair on your desktop, you can generate one as follows:

  1. From the Mac/Linux command line, run ssh-keygen -t rsa
  2. Press the Enter (Return) key at the passphrase prompt (creating a passwordless keypair)
  3. This will create a file named id_rsa.pub in your .ssh directory
  4. Copy this file to Comet or Gordon and insert its contents into the file ~/.ssh/authorized_keys
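
Step 4 can be scripted so that running it twice does not duplicate the key. This is a minimal sketch, assuming the public key has already been copied to the cluster. The function name install_pubkey and the paths in the usage note are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: append the public key stored in file $1 to the
# authorized_keys file $2, unless that exact line is already present.
install_pubkey() {
    local pubkey_file="$1" auth_file="$2"
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -F fixed string, -x whole-line match, -q quiet, -f read pattern from file
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    # sshd commonly refuses authorized_keys files that other users can write to.
    chmod 600 "$auth_file"
}
```

On Comet or Gordon this would be called as, for example, install_pubkey ~/id_rsa.pub ~/.ssh/authorized_keys, where ~/id_rsa.pub is wherever you placed the copied key.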

Usage Example for MATLAB on a Desktop Submitting a Job to the Comet and Gordon MDCS

First create a parallel profile in MATLAB for the Comet or Gordon cluster. To do this:

  1. From the HOME tab in MATLAB select "Manage Cluster Profiles" from the "Parallel" pull down menu
  2. From the "Add" pulldown menu select "Custom"
  3. From the "Custom" pulldown menu select "Generic" ("GenericProfile1" will appear in the "Cluster Profile" list) 
  4. Select "GenericProfile1", and clicking with the right mouse button, select "Set as Default"
    [Image: MATLAB Profile Manager]
  5. Using the scroll bar, scroll down to "SUBMIT FUNCTIONS". In the field to the right of "Function called when submitting independent jobs", enter "independentSubmitFcn"
  6. In the field to the right of "Function called when submitting communicating jobs", enter "communicatingSubmitFcn"
    [Image: MATLAB Profile Manager]
  7. Using the scroll bar, scroll down to "CLUSTER ENVIRONMENT". Change "Use Default" to "unix" from the pull down menu
  8. Change the value "Job storage location is accessible from client and cluster nodes" from "Use default" to "false"
    [Image: MATLAB Profile Manager]

Next, download the file archive which contains these files:

communicatingJobWrapper.sh
communicatingSubmitFcn.m
createSubmitScript.m
deleteJobFcn.m
extractJobId.m
getJobStateFcn.m
getRemoteConnection.m
getSubmitString.m
independentJobWrapper.sh
independentSubmitFcn.m

Copy these files to the toolbox/local directory of your local MATLAB installation. These are modified versions of the MATLAB files that come with the MATLAB release. These files allow you to specify several job parameters.
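
Before submitting a job, it can help to confirm that all ten files listed above actually landed in toolbox/local. The following is a rough sketch of such a check. The function name is hypothetical, and the directory you pass in depends on where MATLAB is installed on your desktop.

```shell
#!/bin/bash
# Hypothetical helper: print the name of every integration file that is
# missing from directory $1. Prints nothing when all ten are present.
missing_mdcs_files() {
    local dir="$1" f
    for f in communicatingJobWrapper.sh communicatingSubmitFcn.m \
             createSubmitScript.m deleteJobFcn.m extractJobId.m \
             getJobStateFcn.m getRemoteConnection.m getSubmitString.m \
             independentJobWrapper.sh independentSubmitFcn.m; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}
```

For example, missing_mdcs_files "/path/to/MATLAB/toolbox/local" (path shown only as a placeholder) lists whatever still needs to be copied.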

If you have used MDCS before on TSCC, Gordon or Trestles and are planning to run it on Comet, please update the files in your MATLAB installation under toolbox/local by downloading the new archive linked above. The new files also work with both TORQUE and SLURM.

Using these modified files, you may also set:

  • number of processors per node
  • account name
  • queue name
  • wall clock time

Other parameters that need to be set include

  • name of the remote cluster (comet.sdsc.edu or gordon.sdsc.edu)
  • directory where temporary data will be stored on Comet or Gordon
  • directory where data will be stored on the desktop
  • path to the desktop ssh private key file
  • path to MATLAB on Gordon

Here is an example MATLAB function that creates a cluster object:

function [ cluster ] = getCluster(username,account,clusterHost,ppn,queue,time,DataLocation, ...
               RemoteDataLocation,keyfile,ClusterMatlabRoot)
     % Start from the generic profile created in the Cluster Profile Manager
     cluster = parcluster('GenericProfile1');
     % The desktop and the cluster do not share a file system
     set(cluster,'HasSharedFilesystem',false);
     set(cluster,'JobStorageLocation',DataLocation);
     set(cluster,'OperatingSystem','unix');
     set(cluster,'ClusterMatlabRoot',ClusterMatlabRoot);
     % Extra arguments passed through to the modified submit functions
     set(cluster,'IndependentSubmitFcn',{@independentSubmitFcn,clusterHost, ...
               RemoteDataLocation,account,username,keyfile,time,queue});
     set(cluster,'CommunicatingSubmitFcn',{@communicatingSubmitFcn,clusterHost, ...
               RemoteDataLocation,account,username,keyfile,time,queue,ppn});
     set(cluster,'GetJobStateFcn',{@getJobStateFcn,username,keyfile});
     set(cluster,'DeleteJobFcn',{@deleteJobFcn,username,keyfile});
end
     

The getCluster function above takes as its arguments all the parameters listed earlier and returns a MATLAB cluster object that will be used to create an MDCS job.

Here is a MATLAB script that uses it in a simple MDCS example job:

       processors=32
       clusterHost='gordon.sdsc.edu'
       ppn=16
       username='jpg'
       account='use300'
       queue='normal'
       time='01:00:00'
       DataLocation='/Users/jpg/Documents/MATLAB/data'
       RemoteDataLocation='/home/jpg/matlab/data'
       keyfile='/Users/jpg/.ssh/id_rsa'
       matlabRoot='/opt/matlab/2013b'
       cluster = getCluster(username,account,clusterHost,ppn,queue,time,DataLocation, ...
                 RemoteDataLocation,keyfile,matlabRoot);
       j = createCommunicatingJob(cluster);
       j.AttachedFiles={'testparfor2.m'};
       set(j,'NumWorkersRange',[1 processors]);
       set(j,'Name','Test');
       t = createTask(j,@testparfor2,1,{processors});
       submit(j);
       wait(j);
       pause(30);
       o=j.fetchOutputs;
       o{:}
       

In this example, a cluster object (cluster) is returned by getCluster(), which is passed to createCommunicatingJob(), which returns a job object. The files that are required on the cluster are defined, as well as the number of processors (32 in this case). A task is created that will call the function testparfor2() which has one output argument and one input argument with the value 32. The job is then submitted, and the output is stored in a MATLAB cell array.

Here is the function that is submitted to run:

    function a = testparfor2(N)
        a = zeros(N,1);
        parfor i = 1:N
            % record the process ID of the worker that runs iteration i
            a(i) = feature('getpid');
        end
    end
     

In this simple example, a column vector of length N is initialized with zeros, and the ID of the process that runs each loop iteration is written to the corresponding element. Since in this case we have passed the number 32 to N and we have asked for 32 processors, we might expect 32 different processes to run the iterations.

However, when we examine our output array:

ans =

       27892
       24571
       27880
       24572
       24576
       24573
       27893
       24578
       24587
       27877
       24583
       24586
       27884
       27876
       27881
       24585
       24581
       24582
       27889
       27887
       27879
       24574
       27883
       27890
       24588
       27886
       24580
       24575
       27888
       27891
       27878
       27880
       

we can see that only 31 worker processes were used to perform the job. Two of the loop passes were performed by the same process (27880). The reason is that MATLAB uses one worker to run the serial code, allocating the remaining workers to perform the parallel functions. Since this only leaves 31 available workers, one of them must handle two loop iterations.
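
The count of 31 can be checked directly from the output array above. The sketch below copies the 32 process IDs from that listing and counts the distinct values; the function names are hypothetical.

```shell
#!/bin/bash
# The 32 process IDs from the output array above, one per loop iteration.
pids="27892 24571 27880 24572 24576 24573 27893 24578
      24587 27877 24583 24586 27884 27876 27881 24585
      24581 24582 27889 27887 27879 24574 27883 27890
      24588 27886 24580 24575 27888 27891 27878 27880"

# Count distinct process IDs in the whitespace-separated list $1.
# $1 is left unquoted on purpose so the shell splits it into words.
count_unique() {
    printf '%s\n' $1 | sort -u | wc -l | tr -d ' '
}

# Print every process ID that appears more than once in the list $1.
repeated() {
    printf '%s\n' $1 | sort | uniq -d
}

echo "iterations:       $(printf '%s\n' $pids | wc -l | tr -d ' ')"
echo "distinct workers: $(count_unique "$pids")"
echo "repeated PIDs:    $(repeated "$pids")"
```

With 32 iterations spread over 31 parfor workers, at least one worker has to run two iterations, which is what the repeated process ID shows.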