Interactive Jobs¶
Sometimes you may need to do pre- or postprocessing tasks that require interactive access. Testing and debugging your code is another example of work that may not be suitable for the shared login nodes.
Here we describe how to allocate and use computing resources interactively using SLURM. Specific information on visualization and other graphical use cases can be found here.
Note
Interactive jobs should generally be short, as they tend to leave most of the allocated resources idle.
Specifying a short run time also leads to shorter waiting times on all our partitions.
For interactive GPU jobs, please use the gpu-a100-small partition whenever possible.
The GPUs there have 10GB of on-device memory, but there are many more of them than on a regular GPU node,
so your job will be scheduled much faster.
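For example, a minimal interactive session on this partition could look as follows (a sketch only: the flags are modelled on the GPU example further below, and the account name is a placeholder you must replace with your own):
srun --job-name="int_gpu_job" --partition=gpu-a100-small --time=01:00:00 --ntasks=1 --cpus-per-task=1 --gpus-per-task=1 --mem-per-cpu=4G --account=research-faculty-department --pty /bin/bash -il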
Interactive jobs using SLURM¶
If you need to use a node of DelftBlue interactively, e.g. for running a debugger, you can issue the following command to get a terminal on one of the nodes:
srun <your-sbatch-commands> --pty bash
where <your-sbatch-commands> are the #SBATCH lines from a regular submission script, passed as command-line options to srun.
For example, to run an interactive (multi-threaded) job with eight cores for 30 minutes on a CPU node:
srun --job-name="int_job" --partition=compute --time=00:30:00 --ntasks=1 --cpus-per-task=8 --mem-per-cpu=1GB --pty bash
Accessing allocated nodes¶
A common use case of interactive sessions is logging in on the nodes where your batch job is running, e.g. to check the resource usage using top, or to attach a debugger. This is possible by adding the --overlap and --jobid=<JOBID> flags to the above srun command, where <JOBID> can be found in the first column of squeue --me.
In order to log in on a specific node of the allocation, also add the --nodelist=<node> option.
For example, if my job with JOBID=11111 is running on node cmp111 of DelftBlue, I can connect to this node by issuing the following command:
srun --overlap --jobid=11111 --pty bash
This should bring me to the node cmp111, where my original job is running.
If my job with JOBID=22222 is running on several nodes of DelftBlue, for example cmp111 and cmp112, I can connect to a specific node by issuing the following command:
srun --overlap --jobid=22222 --nodelist=cmp112 --pty bash
This should bring me to the node cmp112, one of the two nodes where my original job is running.
Interactive GPU jobs¶
To run an interactive job on a GPU node:
srun --mpi=pmix --job-name="int_gpu_job" --partition=gpu --time=01:00:00 --ntasks=1 --cpus-per-task=1 --gpus-per-task=1 --mem-per-cpu=4G --account=research-faculty-department --pty /bin/bash -il
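Inside the session, you can check that the allocated GPU is visible, e.g. with nvidia-smi (assuming an NVIDIA GPU node):
nvidia-smi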
Note
If you want to access the running GPU job using --overlap as described above, you have two options (see the sketch after this list):
- Do not use srun to start your application in the original job script (meaning that you can only start a single task), and add --gpus-per-task=1 to the srun --overlap command. Your interactive job will then "see" the GPU, and you can inspect what is going on on it.
- Do not request a GPU in the secondary srun command. The interactive session will then be on the GPU node, but it will not see the device(s) your job is using.
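A minimal sketch of the two options, using a hypothetical job with JOBID=33333 (pick one; they are mutually exclusive):
# Option 1: the job script starts the application without srun;
# request the GPU so the overlap session can "see" it.
srun --overlap --jobid=33333 --gpus-per-task=1 --pty bash
# Option 2: no GPU requested; the session lands on the GPU node
# but does not see the device(s) the job is using.
srun --overlap --jobid=33333 --pty bash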
Interactive MPI jobs¶
To run an interactive MPI job on 16 cores (possibly scheduled across multiple nodes), first request an allocation, for example:
salloc --job-name="int_mpi" --partition=compute --time=00:30:00 --ntasks=16 --mem-per-cpu=1GB
and then start your application using
srun --mpi=pmix <your-mpi-application>