[Figure: current load of the Odyssey cluster at Harvard Research Computing]

The Portable Batch System (PBS) and Load Sharing Facility (LSF) are popular job schedulers for batch environments. Their goal is to distribute computing jobs among the available computing resources. The figure on the right shows the current load of the Odyssey cluster at Harvard Research Computing; it demonstrates how jobs submitted by hundreds of researchers can be organized. I have some experience using both systems: the new Cray machine Lindgren at PDC and all clusters at Lunarc use PBS, while Odyssey at Harvard Research Computing uses LSF. This post summarizes the most useful commands on these systems, together with some sample submission scripts.

PBS: I will describe PBS first. Similar write-ups can be found here for Lindgren, while Lunarc provides both a quick-start guide and a more detailed reference. Of course, PBSworks has a detailed manual, but we will keep things simple here. The three most important commands for PBS are qsub, qdel, and qstat. The first and second allow you to submit and delete jobs; the last lets you check your job status.

Suppose you have ssh-ed into a system using PBS, say Lindgren in our case; typing `qstat` lists all the submitted and running jobs:

user@lindgren:~$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
25647.nid01532            ihd,K=32         ckc             01:24:03 R gpu
25642.nid01532            ihd,K=64         ckc             01:24:06 R gpu
25637.nid01532            ihd,K=128        ckc             01:24:03 R gpu
26145.nid01532            ihd,K=256        ckc                    0 Q gpu
26146.nid01532            ihd,K=512        ckc                    0 Q gpu
26147.nid01532            ihd,K=1024       ckc                    0 Q gpu
26341.nid01532            gpen1            user            00:10:07 R batch
26343.nid01532            gpen2            user                   0 Q batch
...
27627.nid01532            sila-mri-ideal   mep             00:20:02 R batch
27628.nid01532            sila-mri         mep             00:20:00 R batch

The first column is the job id; you will need it in order to delete a job, e.g., qdel 25647. The second column is the name you can specify for a job. The third column is the user name. The fourth column, if the job has already started, lists the time used by the job. The fifth “S” column shows the status: “Q” stands for queuing, “R” for running, and “C” for completed. Finally, the last column is the name of the queue. On Lindgren there is only one queue, “batch”, but on other systems such as Platon in Lunarc, you can submit to the “gpu” queue in order to run on GPUs.
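On systems with more than one queue, you can list the queues and pick one at submission time. A minimal sketch (the queue name gpu and the script name job.pbs follow the examples in this post; these commands require a PBS installation):

```shell
# List the available queues and their limits
qstat -q

# Submit a script to a specific queue, e.g. the gpu queue on Platon
qsub -q gpu job.pbs
```

The -q flag on the command line plays the same role as a #PBS -q line inside the submission script.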

Suppose you want to submit a new job; you will need to prepare a submission script. A submission script for PBS is a simple bash script:

#!/bin/sh

#PBS -N slimdisk
#PBS -l mppwidth=16
#PBS -l walltime=24:00:00
#PBS -e err.pbs
#PBS -o out.pbs

module add hdf5
cd $PBS_O_WORKDIR
aprun -n 16 ./zeusmp.x > out 2> err

The first line #!/bin/sh is not necessary. Lines starting with #PBS are options that will be passed to qsub. For example, the line #PBS -N slimdisk is equivalent to using `qsub -N slimdisk ...`. Because of this, there cannot be any spaces in the arguments. The meanings of these options can be found in the manual `man pbs_job_attributes`. The important ones are

  • -N: the name assigned to the job by the qsub or qalter command. Format: string up to 15 characters, first character must be alphabetic; default value: the base name of the job script or STDIN.
  • -l: resource list, a set of name=value strings. The most commonly used names are
    • mppwidth=number of processing elements
    • mppdepth=number of threads per processor
    • mppnppn=processing elements per node
    • walltime=maximum amount of real time, in format hh:mm:ss

    In the above example, we need 16 MPI cores, so we simply use #PBS -l mppwidth=16. The wall time is set to one day: #PBS -l walltime=24:00:00.

  • -e, -o: error and output paths that will contain the job’s standard error and output streams. One can also use -j to join the error and output streams together.
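These resource options can be combined. Below is a minimal sketch of a hybrid MPI+OpenMP request; the node geometry, the executable name my_hybrid.x, and the aprun flags are assumptions for a Cray system like Lindgren, so check your site’s documentation before using it:

```shell
#!/bin/sh
#PBS -N hybrid-job
#PBS -l mppwidth=8         # 8 MPI processing elements in total
#PBS -l mppdepth=4         # 4 threads per processing element
#PBS -l mppnppn=2          # 2 processing elements per node
#PBS -l walltime=01:00:00
#PBS -j oe                 # join stderr into stdout

cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=4   # match mppdepth
# aprun: -n total PEs, -d threads per PE, -N PEs per node
aprun -n 8 -d 4 -N 2 ./my_hybrid.x > out
```

With 2 processing elements per node and 8 in total, this request spans 4 nodes, each running 2 MPI ranks with 4 threads apiece.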

After the #PBS options, we have a couple of lines of simple bash script to start the job. `module add hdf5` loads the hdf5 module, i.e., sets the correct search paths. `cd $PBS_O_WORKDIR` moves us to the directory from which the job was submitted. Finally, `aprun -n 16 ./zeusmp.x > out 2> err` launches a 16-core MPI job, redirecting the standard output to the file out and the standard error to the file err.
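The redirection at the end is plain shell and behaves the same outside the scheduler. A quick local sketch, with echo commands standing in for the solver:

```shell
# Stand-in for zeusmp.x: write one line to stdout and one to stderr,
# then redirect them to separate files as in the job script.
{ echo "simulation output"; echo "simulation warning" >&2; } > out 2> err

cat out   # -> simulation output
cat err   # -> simulation warning
```

Using `> out 2>&1` instead would merge both streams into a single file, which is what the -j option does at the scheduler level.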

Once the submission script is ready, you can now submit it to the system.

user@lindgren:~$ qsub job.pbs
27645.nid01532
user@lindgren:~$ qstat -u user

nid01532:
                                                              Req'd  Req'd   Elap
Job ID               Username Queue    Jobname        ... TSK Memory Time  S Time
-------------------- -------- -------- -------------- ... --- ------ ----- - -----
27645.nid01532       user     batch    slimdisk       ...  --    --  24:00 Q   --

The option -u user makes qstat print only the jobs submitted by user. If you want to cancel the job for any reason, you can simply use

user@lindgren:~$ qdel 27645
user@lindgren:~$ qstat -u user

nid01532:
                                                              Req'd  Req'd   Elap
Job ID               Username Queue    Jobname        ... TSK Memory Time  S Time
-------------------- -------- -------- -------------- ... --- ------ ----- - -----
27645.nid01532       user     batch    slimdisk       ...  --    --  24:00 C 00:03

Note that the status column has switched from “Q” (queuing) to “C” (completed, since the job was canceled). The job should disappear from the list after a few seconds.

LSF: I will write a similar summary for LSF later.

 
