Difference between revisions of "Useful Slurm Commands"

From Rizzo_Lab

Revision as of 15:14, 22 February 2024

Sample qsub script

#!/bin/csh
#PBS -l nodes=1:ppn=2
#PBS -l walltime=24:00:00
#PBS -N my_job_name
#PBS -M user@ic.sunysb.edu
#PBS -j oe
#PBS -o pbs.out  

cd /nfs/user03/sudipto
ls
echo "Hello World"
echo "My first seawulf job"

Basic PBS commands

  • qsub <script> - submits your script to the queue
  • qstat - lists the jobs currently in the queue, both running (R) and queued (Q)
  • qdel <jobid> - removes the job from the queue

Sample qstat output:

Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
514722.nagling            STDIN            liuxt12         1657:13: R batch
514723.nagling            STDIN            liuxt12         3366:45: R batch
514724.nagling            jet3dKn0.002PR20 wli             874:42:4 R batch
514725.nagling            ...dKn0.002PR100 wli             855:59:2 R batch
514803.nagling            jet3dKn0.002PR10 wli             596:48:0 R batch
514809.nagling            SAM              mriessen        542:37:3 R batch
514811.nagling            STDIN            justin          00:00:32 R batch
514815.nagling            latency_test     penzhang               0 Q batch
514822.nagling            flex_1.dock.csh  xinyu           34:26:46 R batch
514839.nagling            flex_2.dock.csh  xinyu           31:22:33 R batch
514856.nagling            flex_5.dock.csh  xinyu           26:33:35 R batch
514859.nagling            ....1_2.dock.csh xinyu           06:13:43 R batch
514862.nagling            p0.10            hjli            326:10:5 R batch
514863.nagling            p0.4             hjli            325:49:5 R batch
514864.nagling            p0.7             hjli            332:11:4 R batch
514884.nagling            ...1_p0.0_11.csh xinyu           17:16:57 R batch
514920.nagling            STDIN            lli             00:34:43 R batch
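
A listing like the one above can be summarized per user with a short awk filter. The following is a minimal sketch in POSIX sh (not csh), assuming the default six-column qstat layout shown above (job id, name, user, time, state, queue). The function name count_jobs_by_user is hypothetical and is not part of PBS.

```shell
# Summarize default-format qstat output (read from stdin) per user.
# Assumption: job lines start with "<number>." and the state (R, Q, ...)
# is the second-to-last field, as in the sample listing.
count_jobs_by_user() {
    awk '
        /^[0-9]+\./ {
            user = $3
            state = $(NF - 1)
            seen[user] = 1
            if (state == "R") running[user]++
            else if (state == "Q") queued[user]++
        }
        END {
            for (u in seen)
                printf "%s running=%d queued=%d\n", u, running[u], queued[u]
        }
    ' | sort
}
```

On a machine with PBS installed it would be used as: qstat | count_jobs_by_user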

On Seawulf, the 'nodes' command shows how many nodes are available, allocated, and down:

       avail   alloc   down
nodes:  35      165     25
wulfie: 35      165     25

Use the man or info command in Unix to get more details on the usage of these commands. You can also put PBS directives (lines starting with #PBS) inside your script. They are usually optional, but can be useful. For an example script, see NAMD on Seawulf.

#PBS -l nodes=8:ppn=2
#PBS -l walltime=08:00:00
#PBS -N my_job_name
#PBS -M user@ic.sunysb.edu
#PBS -o namd.md01.out
#PBS -e namd.md01.err
#PBS -V

These PBS directives set the job name, the output and error file names, and so on.

#PBS -j oe
#PBS -o pbs.out    

These two directives join the output and error streams into a single output file. $PBS_O_WORKDIR is an environment variable that holds the directory from which the script was submitted. It is usually better to define a specific working directory and cd into it than to rely on this variable.
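
One way to follow this advice is to prefer an explicitly chosen directory and fall back to $PBS_O_WORKDIR only when none is set. The sketch below is written in POSIX sh rather than csh. JOB_WORKDIR and pick_workdir are hypothetical names introduced for illustration; only PBS_O_WORKDIR comes from PBS.

```shell
# Print the directory a job should run in.
# JOB_WORKDIR is a hypothetical variable that you set yourself;
# PBS_O_WORKDIR is set by PBS to the directory qsub was run from.
pick_workdir() {
    if [ -n "${JOB_WORKDIR:-}" ]; then
        printf '%s\n' "$JOB_WORKDIR"
    elif [ -n "${PBS_O_WORKDIR:-}" ]; then
        printf '%s\n' "$PBS_O_WORKDIR"
    else
        pwd    # neither is set, e.g. when testing outside the queue
    fi
}
```

Inside a job script this could be used as: cd "$(pick_workdir)" || exit 1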

Advanced Tricks

Delete all your jobs (either command should work; which one applies depends on your version of PBS)

qstat -u sudipto | awk -F. '/sudipto/{print $1}' | xargs qdel
qstat | grep sudipto | awk -F. '{print $1}' | xargs qdel 

Delete only your queued jobs. All running jobs are left alone.

qstat -u sudipto | awk -F. '/00 Q /{printf $1" "}' | xargs qdel
qstat -u sudipto | awk -F. '/ Q   --/{printf $1" " }' | xargs qdel
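
Because these one-liners delete jobs immediately, it can help to list the matching job ids first. The following is a minimal sketch in POSIX sh, assuming the default six-column qstat layout (not the qstat -u layout that the patterns above match). The function name queued_job_ids is hypothetical.

```shell
# Print the numeric ids of one user's queued (state Q) jobs, one per line.
# Reads default-format qstat output on stdin; nothing is deleted here.
# Usage: queued_job_ids <username>
queued_job_ids() {
    awk -v user="$1" '
        /^[0-9]+\./ && $3 == user && $(NF - 1) == "Q" {
            split($1, parts, ".")    # "514815.nagling" -> "514815"
            print parts[1]
        }
    '
}
```

Run qstat | queued_job_ids sudipto to review the list; appending | xargs qdel would then delete those jobs.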

List all free nodes, then submit a job to each of them

pbsnodes | egrep '^node|state =' | grep -B 1 'state = free' | grep ^node
set NODELIST=`pbsnodes | egrep '^node|state =' | grep -B 1 'state = free' | grep ^node`
foreach node ($NODELIST)
    /usr/local/torque-2.1.6/bin/qsub -l nodes=${node} ${HOME}/get.nodes.stats.csh
end
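
The egrep/grep chain above can also be written as a single awk filter. This is a minimal sketch in POSIX sh, assuming pbsnodes prints each node name at the start of a line followed by indented "key = value" attribute lines, which is the layout the pipeline above relies on. The function name free_nodes is hypothetical. Unlike the original pipeline, it does not require node names to begin with "node", and it matches only nodes whose state is exactly "free".

```shell
# Print the names of nodes whose state is exactly "free".
# Reads pbsnodes-style output on stdin.
free_nodes() {
    awk '
        /^[^ \t]/ { node = $1 }    # unindented, non-empty line: a node name
        $1 == "state" && $2 == "=" && $3 == "free" { print node }
    '
}
```

On a machine with Torque installed it would be used as: pbsnodes | free_nodes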

Run on a particular type of node

 qsub -l nodes=1:beta:ppn=1 ${script}