Sample session:

su to a user in the CAT_LOCK_GROUP in the /catalina.config file.

Go to the Catalina home directory:

    cd

Start the scheduler:

    ./start.ksh

You can also do

    ./set_config --key_value=state:running
    ./catalina_schedule_jobs --iterate

and this will print to STDOUT, staying in foreground.

Display the queue:

    ./show_q | more

Show reservations:

    ./show_res | more

Create a reservation (this will take a long time, since it needs to wait for a write-lock on the database; the latest time considered for scheduling is set with SCHEDULING_WINDOW in catalina.config, in units of seconds from present time):

    ./set_res --user_list=testdemo,diegella --feature_list=CPUs8,MEM32 --start=00:00_06/25/2007 --end=01:00_06/25/2007 --resource_amount=32 --mode=real

Note the res_id given for the reservation:

    ...
    reservation 996775412 created on 32 nodes with start_time 1001376000.0 for duration 3600.0
    ...

Check the reservation, using res_id:

    ./show_res --res=996775412 --readable --start --end --duration --job_rest --node_list

Create a system reservation, for 4 days from now, lasting 8 hours on all configured nodes:

    ./create_system_res --offset=345600 --duration=28800

Create a system reservation, for 3:31AM Jan 15, 2002 TZ time, lasting 8 hours on all configured nodes:

    ./create_system_res --start=03:31_01/15/2002 --duration=28800

System reservation on 100 nodes:

    ./create_system_res --offset=345600 --duration=28800 --resource_amount=100

System reservation on nodes tf226i, tf227i, tf228i:

    ./create_system_res --offset=345600 --duration=28800 --node_list=tf226i,tf227i,tf228i

Cancel a reservation:

    ./cancel_res --res_id=

Create the Interactive Standing reservation (this may be too long to fit on a single command line):

    ./create_standing_res --start_spec='0 14 * * *' --depth=5 --duration=64800 --resource_amount=8 \
        --job_restriction="if input_tuple[0]['job_class'] == 'interactive' \
        and input_tuple[0]['wall_clock_limit'] <= 900 : result = 0" --mode=real

Check interactive standing reservation:
    ./show_standing_res --job_rest

Create Standing res instances:

    ./update_standing_reservations

Check for standing reservation instances:

    ./show_res --purpose --readable --start --end --job_rest | grep interactive

Delete a standing reservation:

    ./cancel_standing_res --res=

Create a Shortpool Standing reservation (this may be too long to fit on a single command line):

    ./create_standing_res --start_spec='0 14 * * *' --depth=5 --duration=64800 --resource_amount=8 \
        --job_restriction="if input_tuple[0]['job_class'] == 'interactive' \
        and input_tuple[0]['wall_clock_limit'] <= 900 : result = 0" --mode=real \
        --latency=3600 --overlap_running=1

Set system priority for a job:

    ./update_system_priority --job=tf173i.54638.0 --system_priority=10

To unset system priority for a job:

    ./update_system_priority --job=tf173i.54638.0 --system_priority=0

Show backfill windows for DEFAULT_JOB_CLASS (from catalina.config) class job:

    ./show_bf

Show backfill windows for 'legion' user, 'normal' class, 'met200' group, 'met200' account, '6' QOS job:

    ./show_bf --username=legion --class=normal --group=met200 \
        --account=met200 --qos=6

Show reservation overlaps:

    ./show_res --overlap --purpose --comment --readable --start --end

Show all reservations on tf228i:

    ./show_res --nodegrep --purpose --comment --readable --start --end | grep tf228i

Stop scheduling:

    ./stop.py

For nonprivileged users (the allocation account for the user must be configured in catalina.config):

Create a user reservation:

    ./user_set_res --account= --nodes= \
        --duration= \
        --earliest_start= \
        --latest_end= \
        --email= \
        [--sharedmap=<1#type:node_shared#cpu:1+memory:1>] \
        [--featurelist=] \
        [--qoslist=]

Cancel their own user reservation:

    ./user_cancel_res

Bind the reservation to a job:

    ./user_bind_res

Unbind the reservation from a job:

    ./user_unbind_res

Tell a job to only run within a specific list of reservations:

(For LoadLeveler) In the job command file, use the comment line

    #@ comment = Catalina_res_bind=;

(For PBS) In the job
command file, set the environment variable:

    Catalina_res_bind =

There are several other options that may be used with user_set_res:

    --maxnodes=
        Use this instead of --nodes, if any number up to a maximum is acceptable.
    --featurelist=
        Use this if only nodes with certain features are desired.
    --sharedmap=<1#type:node_shared#cpu:1+memory:1>
        Use this if one CPU and one MB of memory are desired on a shared node.

Commands

show_q

This displays information on jobs in the queue. The first section displays running jobs, ordered by TimeRemaining. The second section shows Eligible jobs, ordered by priority. The last section shows Ineligible jobs.

Options:
    [--?]           Display available options
    [--help]        Display available options
    [--full]        Display the full set of default info.
    [--job]         Display job IDs
    [--state]       Display job state
    [--class]       Display job class
    [--limit]       Display wall clock limit
    [--remaining]   Display time remaining for Running job
    [--startt]      Display start time for Running job
    [--resstartt]   Display start time for Idle job
    [--qos]         Display QOS
    [--user]        Display owner of job
    [--group]       Display group of job
    [--account]     Display account of job
    [--nodes]       Display number of nodes requested
    [--taskmap]     Display tasks for each node of job
    [--systemqt]    Display system queue time for job
    [--submitt]     Display submit time for job
    [--reason]      Display ineligible reason

show_res

This displays information on currently active reservations, ordered by start time.

Options:
    [--?]           Display available options
    [--help]        Display available options
    [--full]        Display the full set of default info.
    [--overlap]     Display reservations that overlap other reservations.
    [--nodegrep]    Display all reservations on each node.
    [--readable]    Display dates in a readable form instead of epoch time.
    [--res=]        Display the information for a specific reservation id.
    [--start]       Display start date.
    [--end]         Display end date.
    [--relstart]    Display start time, relative to now, in hours.
    [--relend]      Display end time, relative to now, in hours.
    [--duration]    Display duration, in hours.
    [--nodes]       Display number of nodes reserved.
    [--purpose]     Display the type of reservation (job, standing, etc.).
    [--comment]     Display any comment associated with the reservation.
    [--node_list]   Display list of nodes reserved.
    [--job_rest]    Display Python code for allowed jobs.

Without options, show_res defaults to the following:

    show_res --readable --relstart --relend --duration --nodes --runID --purpose

Options not described here:
    [--affinity_calculation]
    [--latency]
    [--node_rest]
    [--job]
    [--privacy]
    [--start_spec]

query_priority

Show priority calculation details for each job. For each element, provides a three-item string, colon-delimited:

    ::

set_res

Set a user reservation.

Options:
    --start=
        Start time
    --end=
        End time
    --resource_amount=
        Number of nodes to reserve
    [--ignore_list=[,[,]...]]
        Reservation list to ignore
    [--node_list=[,[,]...]]
        Node list
    [--user_list=[,[,]...]]
        User list
    [--group_list=[,[,]...]]
        Group list
    [--account_list=[,[,]...]]
        Account list
    [--mode=]
        If 'real', create the reservation; if 'lookahead', just check to see if the reservation is possible.
    [--affinity_calculation=]
        Described under create_res. The default for set_res is positive affinity, so that qualifying jobs should use nodes in the reservation before going to outside nodes.
    [--node_restriction=] | [--node_list=] | [--nodestate_list= --feature_list=]

    --node_restriction=
        Python code fragment for screening nodes for use in the reservation. 'result=0' means consider all nodes, regardless of state or configuration. Described under create_standing_res.
    --node_list=
        A comma-delimited list of nodes can be used to specify exactly where the reservation should be made.
    --nodestate_list= --feature_list=
        A comma-delimited list of acceptable node states can be specified (for example, Idle,Running,Busy). These are Catalina node states, not PBS or LoadLeveler node states.
        A comma-delimited list of acceptable node features can be specified (for example, batch,CPUs8).

create_system_res

Set a system reservation. No jobs can run in it, and it cuts across all other reservations.

Options:
    --offset= | --start=
    --duration=
    [--resource_amount= | --node_list=]

create_standing_res

Create a Standing Reservation instance.

Options:
    --depth=
        Number of instances.
    --start_spec=
        Cron-like specification for the start of each instance.
    --duration=
        Duration of each instance
    --resource_amount=
        Number of nodes
    --job_restriction=
        Python code to select jobs to run in the reservation.
    [--mode=]
    [--node_restriction=]
        A Python code fragment used to filter nodes to be used in the reservation. Arbitrary node attributes can be used to qualify a node for the reservation. input_tuple[0] is the node to be checked. If the result is set to 0, then the node is accepted. For example:

            --node_restriction="if input_tuple[0]['ConfiguredClasses'] != None and re.search('normal', input_tuple[0]['ConfiguredClasses']) : result = 0"

        will cause only nodes with normal class to be used. Without a node restriction, any node that is not Down, Drain, Drained, or None and whose Max_Starters is not equal to 0 may be used. So, WH nodes would show up in an Interactive reservation.

Options not described here:
    [--affinity_calculation]
    [--node_sort_policy]
    [--comment]

update_standing_reservations

Create standing reservation instances.

cancel_standing_res

Cancel a standing reservation.

Options:
    --res=

show_bf

Show backfill windows for running a job.

Options:
    [--?]           Display available options
    [--help]        Display available options
    [--username=]   Username of job (defaults to current user)
    [--account=]    Account of job (defaults to none)
    [--group=]      Group of job (defaults to none)
    [--class=]      Class of job (defaults to normal)
    [--qos=]        QOS of job (defaults to 0)
    [--duration=]   Wall Clock (sec) of job (defaults to 1 sec)

update_system_priority

Set or unset system priority for a job.
Options:
    --job=
        Step id for job
    --system_priority=
        Level of system priority. 0 = no system priority.

show_events

Displays the event log in reverse chronological order. Currently, job start and cancel attempts are logged. The old event logs are archived in ARCHIVE_DIR as events.

    [--archive=]

create_res

This is the most powerful, least friendly command. It can be used to create a reservation.

    --duration=
    --resource_amount= [--earliest_start=] | [--resource_amount=] --earliest_start=
        If only resource_amount is provided, the earliest reservation for that amount of resource will be created. If only earliest_start is provided, the largest reservation starting at that time or later will be created.
    [--latest_end=]
        The reservation can end no later than this time; otherwise the default limit of three months from now is used.
    [--del_res_id=]
        Replace the old reservation with the newly created one.
    [--copy_res_id=]
        Take default values from the existing reservation. This is a little tricky for some attributes, since they get changed during reservation creation. The start, end, and resource_amount values get generated from earliest_start, latest_end, and resource_amount depending on the other reservations in the system. When these defaults are applied in a new context, the resulting reservation may have different start, end, and resource_amount values. To ensure that these values are appropriate, you may specify them with --earliest_start, --latest_end, and --resource_amount.
    [--job_restriction= | --job_restriction_file=]
        Python code, either on the command line or in a file, to be used to filter jobs for running within the reservation. input_tuple[0] is the job under consideration. Set 'result = 0' if the job is approved to run within the reservation. By default, 'result = 1', and no job may run in the reservation.
    [--node_restriction= | --node_restriction_file=]
        Python code, either on the command line or in a file, to be used to filter nodes for allocation to the reservation.
        input_tuple[0] is the node under consideration. Set 'result = 0' if the node is approved for allocation within the reservation. By default, nodes in Down, Drain, Drained, None, Unknown, or with Max_Starters == 0 are rejected. All other nodes are accepted.
    [--conflict_policy= | --conflict_policy_file=]
        Python code, either on the command line or in a file, to be used to return open time windows for each node.
        input_tuple[0] is a list of accepted nodes.
        input_tuple[1] is the new reservation, with attributes 'earliest_start_float' (epoch time) and 'duration_float' (seconds).
        input_tuple[2] is the list of existing reservations.
        Return a dictionary. Node names are the keys. Each value is a list of tuples containing (float epoch start of window, float epoch end of window, node name).
        By default, only time windows that do not conflict with the existing reservations are returned.
    [--affinity_calculation= | --affinity_calculation_file=]
        Python code, either on the command line or in a file, to be used to set the affinity of a job for a reservation. This determines the tendency to schedule jobs towards or away from the reservation. input_tuple[0] is the job under consideration. If result is set to a negative number, the job will avoid the nodes in the reservation unless there is no alternative. If result is set to a positive number, the job will use the nodes in the reservation first, before considering other nodes. For overlapping reservations, the affinities for a job are added. This is useful where you want to put jobs on user reservations before using up Idle nodes (positive affinity), or where you want to preserve free nodes in a reservation for as long as possible (negative affinity). Specifying affinity as Python code instead of a number allows different affinities for different job attributes. By default, reservations are given positive affinity, so jobs will run on reserved nodes first.
    [--ignore_list=[,[,]...]]
        Reservation list to ignore.
        Use 'ALL' to specify that all known reservations are to be ignored.
    [--mode=]
        Determines whether reservation is really made (real) or just tested (lookahead).

Objects:

jobs

LoadLeveler job steps. Partial list of attributes:
    - 'name'                  Name of job step
    - 'QOS'
    - 'user'                  owner of job
    - 'state'                 LoadL state of job
    - 'job_class'             LoadL class of job
    - 'wall_clock_limit'      seconds
    - 'account'               LoadL account_no for job
    - 'group'                 LoadL group for job
    - 'adapter'               Adapter for job
    - 'requirements'          LoadL requirements line
    - 'allocated_hosts'       (for running jobs)
    - 'fromhost'              LoadL fromhost
    - 'cluster'               LoadL cluster
    - 'proc'                  LoadL proc
    - 'Dispatch_Time'         (for running jobs)
    - 'SubmitTime'            epoch submitted
    - 'completion_time'       epoch completed
    - 'comment'               LoadL comment line
    - 'reservation_binding'   list of reservations in which to run
    - 'initiatormap'          tasks for each node requested
    - 'resource_amount_int'   number of nodes requested
    - 'resource_list'         names of suitable nodes
    - 'priority'              priority of Idle job
    - 'system_queue_time'     Time when job became eligible to run
    - 'ineligible_reason'     Reason job is ineligible to run

reservations

Partial list of attributes:
    - 'name'                  name of reservation, based on time
    - 'job_restriction'       Python code to filter jobs
    - 'duration_float'        Seconds of reservation duration
    - 'resource_amount_int'   Number of nodes in reservation
    - 'start_time_float'      Actual start of reservation
    - 'end_time_float'        Actual end of reservation
    - 'node_list'             Names of nodes in reservation

resources

Partial list of attributes:
    - 'name'                  Name of node
    - 'Disk'                  Amount of Disk
    - 'Feature'               LoadL Features
    - 'Machine'               LoadL Machine name
    - 'Arch'                  Architecture
    - 'OpSys'                 Operating system
    - 'ConfiguredClasses_list'
    - 'AvailableClasses'
    - 'Adapter'
    - 'Max_Starters'
    - 'Memory'
    - 'Cpus'
    - 'State'

Databases:
    CONFIGURATION_DB
    CONFIGURED_RESOURCES_DB
    EVENTS_DB
    JOBS_DB
    OLD_JOBS_DB
    OLD_RESERVATIONS_DB
    RESERVATIONS_DB
    RESOURCE_DB
    STANDING_RESERVATIONS_DB
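
Trying out restriction fragments:

The --job_restriction and --node_restriction fragments described under create_res and create_standing_res share one convention: input_tuple[0] is the job or node under consideration, and the fragment sets result = 0 to accept it. The Python 3 sketch below imitates that convention so a fragment can be checked against sample records before it is passed on the command line. It is a minimal sketch under assumptions, not Catalina's own evaluation code: the is_accepted helper and the sample job and node records are hypothetical, the starting value result = 1 is taken from the documented job_restriction default and assumed for nodes as well, and only the two fragments are copied from the examples in this document.

```python
import re

# Fragments copied from the create_standing_res examples in this document.
JOB_RESTRICTION = (
    "if input_tuple[0]['job_class'] == 'interactive' "
    "and input_tuple[0]['wall_clock_limit'] <= 900 : result = 0"
)
NODE_RESTRICTION = (
    "if input_tuple[0]['ConfiguredClasses'] != None "
    "and re.search('normal', input_tuple[0]['ConfiguredClasses']) : result = 0"
)


def is_accepted(fragment, candidate):
    """Return True if the fragment sets result to 0 for this job or node.

    Assumption: the fragment sees `input_tuple`, the `re` module, and a
    `result` that starts at 1 (rejected). This only imitates the
    documented convention for offline testing.
    """
    namespace = {"input_tuple": (candidate,), "re": re, "result": 1}
    exec(fragment, namespace)  # the fragment may rebind namespace["result"]
    return namespace["result"] == 0


# Hypothetical records; the keys follow the attribute names used in the
# examples above, the values are made up for illustration.
short_job = {"name": "demo.1.0", "job_class": "interactive", "wall_clock_limit": 600}
long_job = {"name": "demo.2.0", "job_class": "interactive", "wall_clock_limit": 3600}
normal_node = {"name": "nodeA", "ConfiguredClasses": "normal interactive"}
bare_node = {"name": "nodeB", "ConfiguredClasses": None}

if __name__ == "__main__":
    for job in (short_job, long_job):
        print(job["name"], is_accepted(JOB_RESTRICTION, job))
    for node in (normal_node, bare_node):
        print(node["name"], is_accepted(NODE_RESTRICTION, node))
```

With these sample records, the 600-second interactive job and the node whose ConfiguredClasses contains 'normal' are accepted, while the 3600-second job and the node with no configured classes are rejected. Because the fragment is executed as ordinary Python, a syntax error or a misspelled attribute name shows up here as an exception rather than as a reservation that silently accepts nothing.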