PyTorch
PyTorch with GPU support is available in the DHPC software stacks (for example 2024r1).
To enable PyTorch, load the software stack module together with the openmpi and py-torch modules, as shown in the batch script below.
You can check whether PyTorch can detect a GPU with the following Python script:
howmanygpus.py
import torch

cuda_avail = torch.cuda.is_available()
if cuda_avail:
    print("Torch CUDA is available")
    num_of_devices = torch.cuda.device_count()
    if num_of_devices:
        print("Number of CUDA devices: {}".format(num_of_devices))
        # torch.cuda.current_device() returns the index (id) of the
        # currently selected device.
        current_device = torch.cuda.current_device()
        current_device_name = torch.cuda.get_device_name(current_device)
        print("Current device id: {}".format(current_device))
        print("Current device name: {}".format(current_device_name))
    else:
        print("No CUDA devices!")
else:
    print("Torch CUDA is not available!")
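Beyond checking availability, you will usually want to place tensors and models on the GPU. A minimal device-agnostic sketch (not part of the script above; it falls back to the CPU when no GPU is visible):

```python
import torch

# Pick the GPU if PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Allocate a tensor directly on the chosen device and run a small computation.
x = torch.ones(3, 3, device=device)
y = x @ x  # executed on the GPU when one is available

print("Computed on:", y.device)
```

The same `device` object can be passed to `model.to(device)` to move a whole model, so the code runs unchanged on GPU and CPU nodes.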
The howmanygpus.py script can be submitted with sbatch as follows:
howmanygpus.slurm
#!/bin/bash
#SBATCH --job-name="pytorch/howmanygpus"
#SBATCH --output=howmanygpus.out
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --gpus-per-task=1
#SBATCH --partition=gpu-a100-small
#SBATCH --mem-per-cpu=1G
# make sure to add your account!
##SBATCH --account=<what>-<faculty>-<group>
module load 2024r1
module load openmpi
module load py-torch
srun python howmanygpus.py
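Assuming both files above are in the current directory, the job can be submitted and its output inspected with the usual Slurm commands:

```shell
# Submit the batch script to the scheduler.
sbatch howmanygpus.slurm

# Check the state of your own queued and running jobs.
squeue -u $USER

# Once the job has finished, inspect the output file
# named in the --output directive above.
cat howmanygpus.out
```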
If you have a more illustrative example that you would like to share, please post it on Mattermost or send it to info-DHPC@tudelft.nl.