Running jobs with Slurm
Slurm is a workload manager for clusters. It allocates resources to users for some duration of time, handles the execution of their jobs and prioritizes them by managing a queue of pending work.
What you need:
- your data transferred to your home directory, if needed
- a program that will do the work
- a script to prepare and run the above program with its parameters
Testing your environment
Before writing a script and submitting to Slurm, it is advised to test the environment in which your code will run with an interactive session.
You can obtain an interactive session this way:
[demo@datamaster ~]$ srun --partition debug -n 2 --mem 2G --pty /bin/bash
It is important to let Slurm know which resources are in use. If you use a partition other than debug, don't forget to set the time limit with -t 2:00:00 (2 hours).
If a GPU is required, add this argument: --gres="gpu:1"
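For example, a full interactive request on the gpu partition with one GPU, 4 GB of memory and a 2-hour limit could look like this (the memory and task values are only illustrative, adjust them to your needs):
[demo@datamaster ~]$ srun --partition gpu -n 2 --mem 4G --gres="gpu:1" -t 2:00:00 --pty /bin/bash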
How to submit a job
Setting up a job script
To submit a job, you must write a bash script that will prepare the environment and launch your software with its parameters.
In the following script, run-tf-multigpu_cnn.bash:
- we set the reservation options with #SBATCH comments
- we load an environment module that gives us access to Anaconda and TensorFlow
- finally, we execute the sample application multigpu_cnn.py from https://github.com/aymericdamien/TensorFlow-Examples
This example may be obsolete and needs to be checked.
#!/bin/bash
#SBATCH --job-name=multigpu_cnn (1)
#SBATCH --partition=gpu (2)
#SBATCH -N 1 (3)
#SBATCH -n 4 (4)
#SBATCH --mem=5G (5)
#SBATCH --gres="gpu:2" (6)
#SBATCH -t 1:00:00 (7)
#SBATCH --mail-user=your.name@umons.ac.be (8)
#SBATCH --mail-type=ALL (9)
# Loading Anaconda module
module load anaconda3
# Loading an Anaconda environment
source activate tensorflow-gpu-1.8
echo "DATE : $(date)"
echo "_____________________________________________"
echo " HOSTNAME : $HOSTNAME"
echo "_____________________________________________"
echo " CUDA_DEVICE_ORDER : $CUDA_DEVICE_ORDER"
echo "_____________________________________________"
echo " CUDA_VISIBLE_DEVICES : $CUDA_VISIBLE_DEVICES"
echo "_____________________________________________"
nvidia-smi -L
echo "_____________________________________________"
# Starting the Python program and printing the time it took to complete
time python3 $HOME/multigpu_cnn.py
1 | a name for the job
2 | the partition to use
3 | the number of nodes (servers) to use
4 | the number of CPUs (tasks)
5 | the maximum memory the job will need
6 | the GPU reservation if you need one; here we reserve 2 of them
7 | a 1-hour time limit
8 | the mail address for notifications
9 | the type of notifications that will trigger an email
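By default, sbatch writes the job output to a file named slurm-<jobid>.out in the directory where the job was submitted (this is the file we will read later on this page). If you prefer explicit file names, you could also add options such as the following to the script, where %j is replaced by the job ID:
#SBATCH --output=multigpu_cnn-%j.out
#SBATCH --error=multigpu_cnn-%j.err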
This bash script must be executable, so we set the permission with the chmod command:
$ chmod a+x run-tf-multigpu_cnn.bash
Checking available resources
You can display information about the available partitions (queues) with sinfo:
[demo@datamaster ~]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
days up 2-00:00:00 6 idle hpc[1-6]
week up 7-00:00:00 6 idle hpc[1-6]
month up 31-00:00:0 6 idle hpc[1-6]
gpu up 1-00:00:00 3 idle deep[1-2],simu1
lgpu up 7-00:00:00 3 idle deep[1-2],simu1
debug up 4:00:00 9 idle deep[1-2],hpc[1-6],simu1
Meaning:
For the "days" partition, the job time limit is at most 2 days.
It contains 6 nodes, hpc1 to hpc6, all in the idle state, waiting for submissions.
The "gpu" and "lgpu" partitions are available for running short and long jobs on servers with GPUs for Deep Learning.
Finally, "debug" is a partition with a short time limit that can be used for testing.
Submitting your job
Since we have set the sbatch options in run-tf-multigpu_cnn.bash, we don't have to specify them on the command line:
[demo@datamaster ~]$ sbatch run-tf-multigpu_cnn.bash
Submitted batch job 5336
Otherwise, we could have done:
[demo@datamaster ~]$ sbatch --partition=gpu -N 1 -n 4 --mem=5G --gres="gpu:2" -t 1:00:00 run-tf-multigpu_cnn.bash
Submitted batch job 5336
For more information on the available options:
[demo@datamaster ~]$ man sbatch
Verifying its state
[demo@datamaster ~]$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
5336 gpu run-tf-m demo R 0:10 1 simu1
The job has been running (state R) for 10 seconds on node simu1 in the gpu partition.
Another possible state is PD, for pending.
If (Resources) is shown in the NODELIST(REASON) column, your job is waiting for resources to be freed.
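To list only your own jobs, or to get the full details of a specific job, you can use for example the following commands; and should you need to cancel a job, scancel does it by job ID:
[demo@datamaster ~]$ squeue -u demo
[demo@datamaster ~]$ scontrol show job 5336
[demo@datamaster ~]$ scancel 5336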
We can view the output of the job by printing the content of the slurm-5336.out file created in the current directory.
[demo@datamaster ~]$ cat slurm-5336.out
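While the job is still running, you can also follow the output file as it is being written, for example:
[demo@datamaster ~]$ tail -f slurm-5336.out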