SDSC Thread, Issue 5, April 2006






User Services Director:
Anke Kamrath

Editor:
Subhashini Sivagnanam

Graphics Designer:
Diana Diehl

Application Designer:
Fariba Fana


Help Desk: User Questions

Frequently asked questions from our users

—Krishna Muriki


Dear SDSC Consulting,

I have ported my code to DataStar for the first time and am still debugging it, i.e., making my first runs on the machine. I submitted my job to the batch queue on DataStar and waited a long time, only to find that the job failed because of a small bug in my code. How can I get around the long queue wait time and debug my job quickly?

Answer:

There is a good solution to this problem. Whenever you are debugging your code or making your first runs on a machine, use the interactive queues (or run in interactive mode), where the queue wait time is minimal. Scale your job down to a smaller number of processors and a run time of less than 2 hours, and use the interactive queues. Once you are confident that your code runs correctly, move on to the batch queues for your production runs.
For example, to use the interactive queue on DataStar, log in to the dspoe.sdsc.edu node (NOT dslogin.sdsc.edu), change the name of the queue to interactive (#@class = interactive) in your job submission script, and use 'llsubmit' to submit your job as usual.
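
The queue change described above can also be scripted. The following is a minimal Python sketch, not an SDSC-provided tool: the function name set_queue_class, the sample script contents, and the starting queue name "normal" are assumptions made for illustration. It only rewrites the class directive; you would still submit the resulting script with 'llsubmit' from dspoe.sdsc.edu.

```python
import re

# Matches a LoadLeveler class directive such as "#@class = normal"
# or "# @ class = normal". Group 1 is everything up to the queue name,
# group 2 is whatever follows it on the line.
CLASS_DIRECTIVE = re.compile(r"^(\s*#\s*@\s*class\s*=\s*)\S+(.*)$", re.IGNORECASE)


def set_queue_class(script_text, queue="interactive"):
    """Return script_text with every class directive pointing at `queue`.

    Hypothetical helper for illustration. Note that a trailing newline
    in script_text is not preserved.
    """
    lines = []
    for line in script_text.splitlines():
        match = CLASS_DIRECTIVE.match(line)
        if match:
            line = f"{match.group(1)}{queue}{match.group(2)}"
        lines.append(line)
    return "\n".join(lines)


# Hypothetical job script: the queue name "normal" and the program
# "./a.out" are placeholders, not taken from the article.
example = "#!/bin/ksh\n#@class = normal\n#@queue\n./a.out"
print(set_queue_class(example))
```

Only the class line changes; every other line of the script is passed through untouched, so the same script can be switched back to a batch queue for production runs by calling the function with a different queue name.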

Dear SDSC Consulting,

I was told to use the showq command on SDSC machines to estimate the queue wait time after I submit my job. But after I submit my job using llsubmit (on DataStar) or qsub (on the IA-64 machine), it doesn't show up in the output of showq. Why not?

Answer:

SDSC machines use an in-house job scheduling tool called 'catalina' alongside the standard job management software, namely LoadLeveler on DataStar and BlueGene, and PBS on the IA-64 machine. Catalina and the standard tools maintain separate tables of queue information. The 'showq' command shows the queue information maintained by catalina, while commands such as llq and qstat show the queue information maintained by the standard tools. Ideally both tables would contain the same information at any instant, but catalina updates its database tables more slowly than LoadLeveler or PBS. So if you have just submitted a new job or cancelled a job in the queue, llq and qstat will show the change immediately, whereas showq will show it only after a short, variable delay.
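
If you want to wait for a newly submitted job to appear in showq rather than re-running the command by hand, a small polling loop is one option. The following Python sketch is only an illustration under stated assumptions: the helper names command_output and wait_until_listed are hypothetical, the job ID is a placeholder, and it assumes the job ID appears as plain text somewhere in the command's output.

```python
import subprocess
import time


def command_output(command):
    """Run a queue command (for example ["showq"]) and return its standard output."""
    return subprocess.run(command, capture_output=True, text=True, check=False).stdout


def wait_until_listed(job_id, fetch, attempts=10, delay_seconds=30.0):
    """Poll fetch() until job_id appears in its output.

    Returns the number of the attempt on which the job was first seen,
    or None if it never appeared. The check is a simple substring match,
    which is an assumption about the output format.
    """
    for attempt in range(1, attempts + 1):
        if job_id in fetch():
            return attempt
        if attempt < attempts:
            time.sleep(delay_seconds)
    return None


# Usage on a login node (the job ID below is a placeholder):
#   wait_until_listed("ds001.12345", lambda: command_output(["showq"]))
```

Passing the output source in as a function keeps the loop independent of any one scheduler command, so the same sketch could poll llq or qstat instead. Because the match is a plain substring test, a short job ID could also match a longer one that contains it.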

Krishna Muriki is reachable via email at kmuriki@sdsc.edu.

Did you know...?

Log in to the DSPOE node to run jobs in the DataStar "express" queue. Four nodes are set up to run up to 64 tasks, 24/7. -Eva Hocks