Master Control Program (MCP)

MCP is a command line utility that provides automatic resource selection for running a single parallel job on high performance computing resources. MCP uses directives provided by the user in batch submission scripts to submit the job to the queues of multiple resources. As soon as the job starts to run on one of the resources, MCP removes it from the queues of all the other resources.

Automatic resource selection scheduling services are a class of metascheduler. Their objective is to run jobs faster on distributed high performance computing resources by finding the resource with the earliest availability.

MCP requires the application to be compiled on each machine (if it does not already exist there) and the input files to be staged on the remote clusters. The MCP submission is initiated from only one of the machines. For a batch job submission script to run correctly, it must include #MCP directives.

MCP can be used by itself, but the user must then manually create a job script for each machine. Fullauto is a utility that simplifies the usage of MCP by storing frequently used settings and automatically generating job files. The workflow below outlines the use of MCP with manual scripts. (See the Fullauto page for the MCP workflow with Fullauto.)

Locations

Currently MCP is installed on:

  • IA-64 Cluster at NCSA (Mercury): /usr/local/MCP/mcp/mcp.py
  • IA-64 Cluster at SDSC: /usr/local/apps/mcp/mcp.py

MCP Workflow

The following workflow presumes the use of two resources.

  1. User runs grid-proxy-init or myproxy-get-delegation to establish a grid credential.

    grid-proxy-init

    OR

    myproxy-get-delegation *
    * Use myproxy-get-delegation to establish your grid credential using TeraGrid Single Sign-On with your TeraGrid-Wide (TeraGrid User Portal) password.
  2. User constructs a set of appropriate job files, one for each resource. (See example job files.)

    vi jobfile_1
    vi jobfile_2

  3. User runs MCP with the job files as input.

    ./mcp.py [--debug] <submit_script1> [<submit_script2>]
    ./mcp.py --query=<MCP job info file>
    ./mcp.py --resume=<MCP job info file>

  4. MCP scans each of the job submission scripts for lines that specify the resource to which the script is to be submitted. Required lines are:

    #MCP submit_host <head node for the remote cluster>
    #MCP username <local username on the remote cluster>
    #MCP scratch_dir <scratch directory on the remote cluster>

  5. MCP submits the scripts to the queues of the specified resources and then continuously monitors the status of these jobs. As soon as the job begins to run on one of the resources, it is removed from the queues of the other resources.
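
The monitoring behavior in step 5 can be pictured with a minimal Python sketch. This is not MCP's source code: the function name run_first_available, the query_state and cancel callbacks, the polling interval, and the use of "R" as the running state (as PBS reports it) are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict


def run_first_available(
    job_ids: Dict[str, str],
    query_state: Callable[[str, str], str],
    cancel: Callable[[str, str], None],
    poll_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll every queued job; once one is running, cancel all the others.

    job_ids maps a resource name to the job id returned at submission.
    query_state and cancel are hypothetical stand-ins for whatever runs
    each resource's query and cancel commands.
    Returns the name of the resource on which the job started.
    """
    while True:
        for resource, job_id in job_ids.items():
            # Assumption: "R" means running, as in PBS qstat output.
            if query_state(resource, job_id) == "R":
                for other, other_id in job_ids.items():
                    if other != resource:
                        cancel(other, other_id)
                return resource
        sleep(poll_seconds)
```

Passing the sleep function as a parameter keeps the loop easy to exercise with canned queue states instead of live queues.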

MCP directives, examples and explanations

Shortcut to built-in configurations

Specifies use of built-in configs for certain resource managers:

#MCP qtype [pbs|loadleveler|globusws|cobalt]
#MCP qtype pbs

Mandatory directives

Specifies on which machine the job is submitted:

#MCP submit_host [hostname]
#MCP submit_host tg-login1.sdsc.edu

Specifies remote username:

#MCP username [remote username]
#MCP username your_username

Specifies where MCP may stage files:

#MCP scratch_dir [remote scratch directory]
#MCP scratch_dir /gpfs/your_username/test/mcp
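
To check a job file before submitting it, the mandatory directives can be verified with a short script. The following is a sketch only: parse_mcp_directives and missing_required are hypothetical helper names, and the parsing rule (a line starting with #MCP, then a directive name, then the rest of the line as its value) is an assumption based on the examples above, not MCP's own parser.

```python
REQUIRED = ("submit_host", "username", "scratch_dir")


def parse_mcp_directives(script_text):
    """Collect '#MCP <name> <value>' lines from a job script into a dict."""
    directives = {}
    for line in script_text.splitlines():
        # Split into at most three parts so the value keeps its inner spaces.
        parts = line.strip().split(None, 2)
        if len(parts) == 3 and parts[0] == "#MCP":
            directives[parts[1]] = parts[2]
    return directives


def missing_required(directives):
    """Return the mandatory directive names that the script does not set."""
    return [name for name in REQUIRED if name not in directives]
```

For example, a script that sets submit_host and username but no scratch_dir would yield ["scratch_dir"] from missing_required.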

Directives mandatory for qtype globusws

Specifies whether the job submit command is run on the same machine as mcp.py (as with Globus GRAM jobs) or on the submit_host:

#MCP submit_mode [local|remote]
#MCP submit_mode local

Specifies how to contact Globus web service:

#MCP globus_factory [globus factory string]
#MCP globus_factory https://tg-login1.sdsc.teragrid.org:8443/wsrf/services/ManagedJobFactoryService

Specifies type of factory to contact through Globus:

#MCP globus_factory_type [Globus factory type]
#MCP globus_factory_type PBS

If #MCP qtype is specified, the following directives are optional

Specifies job submit command:

#MCP submit_command [command to submit job]
#MCP submit_command qsub

Allows the user to modify parsing of the returned text from the submit command:

#MCP submit_return_pattern [Python regular expression]
#MCP submit_return_pattern ^(?P<job_id>\d+).[-.\w]+\s*$
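
A custom submit_return_pattern can be tried out in Python before it goes into a job file. The sketch below applies the example pattern above with the standard re module; extract_job_id is a hypothetical helper, and the use of re.match is an assumption about how the pattern is applied. Note that the "." after the job id is unescaped in the example, so it matches any single character, not only a literal dot.

```python
import re

# The example pattern from the directive above, unchanged.
SUBMIT_RETURN_PATTERN = r"^(?P<job_id>\d+).[-.\w]+\s*$"


def extract_job_id(submit_output):
    """Return the job id captured by the named group, or None if no match."""
    match = re.match(SUBMIT_RETURN_PATTERN, submit_output)
    return match.group("job_id") if match else None
```

A string of the form "<digits>.<server name>", which is the usual shape of a PBS qsub reply, yields the digits as the job id; any other text yields None.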

Specifies job query command:

#MCP queue_line_command [command to query job]
#MCP queue_line_command qstat

Allows the user to modify parsing of the returned text from the query command:

#MCP queue_line_pattern [Python regular expression]
#MCP queue_line_pattern ^\d+\.\S*\s+\S+\s+\S+\s+(\d\d:\d\d:\d\d|\d)*\s+(?P<state>\w)\s+\S+
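
The queue_line_pattern can be checked in the same way. The sketch below applies the example pattern (written as one string) to each line of the query output and returns the captured state. The helper name job_state and the line-by-line use of re.match are assumptions for illustration; the expected input is the default PBS qstat layout of job id, name, user, time used, state and queue.

```python
import re

# The example pattern from the directive above, joined into a single string.
QUEUE_LINE_PATTERN = (
    r"^\d+\.\S*\s+\S+\s+\S+\s+"
    r"(\d\d:\d\d:\d\d|\d)*\s+(?P<state>\w)\s+\S+"
)


def job_state(query_output):
    """Return the state letter of the first job line that matches, else None."""
    for line in query_output.splitlines():
        match = re.match(QUEUE_LINE_PATTERN, line)
        if match:
            return match.group("state")
    return None
```

Header and separator lines do not begin with a digit, so they are skipped and only job lines contribute a state.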

Specifies job cancel command:

#MCP kill_command [command to cancel job]
#MCP kill_command qdel

Notes

  • Unprompted remote commands make MCP run more smoothly than prompted commands. To use unprompted commands, the user can either set up passwordless ssh with an ssh-agent, or use grid-proxy-init and gsissh (where available). To specify the paths to ssh and scp commands that do not require passwords, set the environment variables MCPSSH and MCPSCP.

    export MCPSSH=/usr/bin/gsissh
    export MCPSCP=/usr/bin/gsiscp
  • Since there may be a long delay before the job starts, use the screen utility to run the MCP session in the background (See example commands).
  • The qtype directive specifies the commands to be used for job submission, monitoring and cancellation. Recognized types are pbs, loadleveler, cobalt and globusws. The user can create new qtypes and write them to the MCPResource.py file (See Scenarios 1 and 2). Alternatively, interface information can be specified manually in the job submission script (See Scenario 3).
    Other resource manager commands to be used by MCP are specified in the MCPResource.py file.
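
As an illustration of the MCPSSH and MCPSCP note above, the following sketch shows one way a tool could pick up those variables and build a remote command line. It is not MCP's code: the function names are hypothetical, and falling back to plain ssh and scp when the variables are unset is an assumption.

```python
import os


def remote_command_paths(environ=None):
    """Return the (ssh, scp) executables, honoring MCPSSH and MCPSCP if set."""
    env = os.environ if environ is None else environ
    # Assumption: default to plain ssh/scp when the variables are not set.
    return env.get("MCPSSH", "ssh"), env.get("MCPSCP", "scp")


def build_remote_command(username, submit_host, command, environ=None):
    """Build an argument list that runs a command on the submit host."""
    ssh_path, _ = remote_command_paths(environ)
    return [ssh_path, f"{username}@{submit_host}", command]
```

With MCPSSH set to /usr/bin/gsissh, build_remote_command("your_username", "tg-login1.sdsc.edu", "qstat") would produce an argument list beginning with /usr/bin/gsissh.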

