Interactive Jobs

Sometimes you may need to do pre- or postprocessing tasks that require interactive access. Testing and debugging your code is another example of work that may not be suitable for the shared login nodes.

Here we describe how to allocate and use computing resources interactively using SLURM. Specific information on visualization and other graphical use cases is covered in a separate section of this documentation.

Note

Interactive jobs should generally be short, as they tend to make very little use of the allocated resources. Specifying a short run time also leads to shorter waiting times on all our partitions. For interactive GPU jobs, please use the gpu-a100-small partition as much as possible. The GPUs there have 10GB of on-device memory, but there are many more of them than on a regular GPU node, so your job will be scheduled much faster.
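
For example (the srun syntax used here is explained below), an interactive GPU session on that partition might be requested as follows; treat the resource values as placeholders to adapt to your needs:

srun --partition=gpu-a100-small --time=00:30:00 --ntasks=1 --gpus-per-task=1 --mem-per-cpu=4G --pty bash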

Interactive jobs using SLURM

If you need to use a node of DelftBlue interactively, e.g. for running a debugger, you can issue the following command to get a terminal on one of the nodes:

srun <your-sbatch-commands> --pty bash

where <your-sbatch-commands> are the options from the #SBATCH lines of a regular submission script, passed directly as command-line flags (i.e., without the #SBATCH prefix).

For example, to run an interactive (multi-threaded) job with eight cores for 30 minutes on a CPU node:

srun --job-name="int_job" --partition=compute --time=00:30:00 --ntasks=1 --cpus-per-task=8 --mem-per-cpu=1GB --pty bash

Accessing allocated nodes

A common use case for interactive sessions is logging in to the nodes where your batch job is running, e.g., to check resource usage with top or to attach a debugger. This is possible by adding the --overlap and --jobid=<JOBID> flags to the above srun command, where <JOBID> can be found in the first column of the output of squeue --me. To log in to a specific node of the allocation, also add the --nodelist=<node> option.

For example, if my job with JOBID=11111 is running on node cmp111 of DelftBlue, I can connect to this node by issuing the following command:

srun --pty --ntasks=1 --time=00:30:00 --overlap --jobid=11111 bash

This should bring me to the node cmp111, where my original job is running.

If my job with JOBID=22222 is running on several nodes of DelftBlue, for example cmp111 and cmp112, I can connect to a specific node by issuing the following command:

srun --pty --ntasks=1 --time=00:30:00 --overlap --jobid=22222 --nodelist=cmp112 bash

This should bring me to the node cmp112, one of the two nodes where my original job is running.

Interactive GPU jobs

To run an interactive job on a GPU node:

srun --mpi=pmix --job-name="int_gpu_job" --partition=gpu --time=01:00:00 --ntasks=1 --cpus-per-task=1 --gpus-per-task=1 --mem-per-cpu=4G --account=research-faculty-department --pty /bin/bash -il
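
Once the session starts, you can verify that the GPU has been allocated, for example with nvidia-smi (assuming the NVIDIA driver utilities are on the node's default path, as is typical for GPU nodes):

nvidia-smi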

Note

If you want to access the running GPU job using --overlap as described above, you have two options:

  1. Do not use srun to start your application in the original job script (meaning that you can only start a single task), and add --gpus-per-task=1 to the srun --overlap command. Your interactive session will then "see" the GPU and you can inspect what is going on on it (see the sketch after this list).
  2. Do not request a GPU in the secondary srun command. The interactive session will then run on the GPU node but will not see the device(s) your job is using.
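
For option 1, the secondary srun command could look like the following sketch, where <JOBID> is the ID of your running GPU job:

srun --pty --ntasks=1 --time=00:30:00 --overlap --jobid=<JOBID> --gpus-per-task=1 bash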

Interactive MPI jobs

To run an interactive MPI job on 16 cores (possibly scheduled across multiple nodes):

srun --job-name="int_mpi_job" --partition=compute --time=00:30:00 --ntasks=16 --cpus-per-task=1 --mem-per-cpu=1GB --account=research-faculty-department --pty bash -il
and then start your application using

srun --overlap <executable>
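
For example, assuming a hypothetical MPI executable ./hello_mpi in the working directory:

srun --overlap ./hello_mpi

Since the interactive session inherits the allocation's environment (including SLURM_NTASKS), this should launch the executable on all 16 allocated tasks.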