Cluster Job Submission

Much computational research is done on a computing cluster, which can be many times more powerful than your local laptop or desktop. You will rarely have physical access to a computing cluster, so you have to log in remotely instead. To do this, see the section on SSH.

When you log onto a cluster, you log into the head node. The head node controls job submission and communication between the cluster nodes. You do not want to run code on the head node itself, though you can do simple tasks there such as viewing files or editing text.

On a cluster you can’t just run code in a terminal as you do on your local machine, because the cluster needs to balance the needs of the many people who may be running programs at the same time. Instead, you put the commands you want to run in a script and submit it as a job to a queue, which executes the job when the resources are available. If you just want to work in the terminal, you can open an interactive job with

qsub -I

If you want the job to run while you go off and do something else, you can write a script telling the cluster what to do. Once your script is ready, you can submit it with

qsub example.pbs
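Since qsub prints the identifier of the newly queued job, it can be handy to capture that identifier so you can check on the job later. The following is a minimal sketch, assuming a PBS/Torque-style scheduler where qstat and qdel are also available; submit_job is a hypothetical helper name used only for illustration, not part of PBS.

```shell
#!/bin/sh
# submit_job is a hypothetical helper (not part of PBS): it submits a script
# with qsub and prints whatever job ID qsub reports.
submit_job() {
    script="$1"
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found; are you logged in to the cluster?" >&2
        return 1
    fi
    # qsub prints the ID of the newly queued job on standard output
    qsub "$script"
}

# Usage on the cluster (example.pbs as in the text):
#   jobid=$(submit_job example.pbs)
#   qstat "$jobid"    # show the job's state in the queue
#   qdel "$jobid"     # remove the job if it is no longer needed
```

The usage lines are left as comments so the sketch does nothing until you call the helper yourself.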

Your script should look something like this:

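#!/bin/sh
# Job name
#PBS -N jobname
# Resources: 1 node, 1 processor per node
#PBS -l nodes=1:ppn=1
# Merge standard error into standard output
#PBS -j oe

echo "Starting job"
# Change to the directory the job was submitted from
cd "$PBS_O_WORKDIR" || exit 1
command
echo "Job finished"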

where command is the command you want to execute, typed just as you would on the command line. You can execute several commands if you’d like. To use multiple processors, set nodes to the number of nodes you need and ppn to the number of processors per node; the total number of processors is nodes times ppn. If you are using MPI for parallelization, you’ll need to replace command with

mpiexec -n 8 command

where the number after -n is the total number of processors you want to use.
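To make the relationship between nodes, ppn, and the mpiexec count concrete, here is a sketch of a job script that requests 2 nodes with 4 processors each and derives the total of 8 rather than hard-coding it. It assumes a Torque-style PBS that writes one line per allocated processor to the file named by $PBS_NODEFILE; count_slots is a hypothetical helper and ./my_program is a placeholder for your own MPI executable.

```shell
#!/bin/sh
#PBS -N mpi_job
#PBS -l nodes=2:ppn=4
#PBS -j oe

# count_slots is a hypothetical helper: it counts the lines in a node file.
# Assumption: the scheduler writes one line per allocated processor to the
# file named by $PBS_NODEFILE, so nodes=2:ppn=4 should give 8 lines.
count_slots() {
    wc -l < "$1" | tr -d ' '
}

# Only launch the MPI program when running inside a PBS job.
if [ -n "${PBS_NODEFILE:-}" ] && [ -f "$PBS_NODEFILE" ]; then
    cd "$PBS_O_WORKDIR" || exit 1
    nprocs=$(count_slots "$PBS_NODEFILE")
    # ./my_program is a placeholder for your own executable
    mpiexec -n "$nprocs" ./my_program
fi
```

Deriving the count this way keeps the mpiexec line in step with the resource request if you later change nodes or ppn. If your cluster formats the node file differently, pass the number explicitly as shown above.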

Modules

The cluster can provide several versions of commonly used programs through the Environment Modules system. With it you can load the particular version you need, for example when compiling certain programs. To see what modules are available, use:

module avail

To see which modules are currently loaded, use:

module list

To load a module, unload a module, or unload all loaded modules, use:

module load my_module
module unload my_module
module purge
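Depending on how the cluster is configured, modules loaded in your login session may not be loaded inside a batch job, so it can help to load what you need in the job script itself. The sketch below wraps the commands above in a small function; load_modules is a hypothetical helper name, and it assumes the module command has already been made available to the shell running the script.

```shell
#!/bin/sh
# load_modules is a hypothetical helper: start from a clean environment,
# load each module named on the command line, then list what is loaded.
# It stops early if a module load command reports a failure.
load_modules() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not available in this shell" >&2
        return 1
    fi
    module purge
    for m in "$@"; do
        module load "$m" || return 1
    done
    module list
}

# Usage (my_module is the placeholder name from the text):
#   load_modules my_module
```

Starting with module purge means the job sees the same modules no matter what you had loaded when you submitted it.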
