FAQ

  1. I am unhappy with my disk quota, job time limitations, the number of GPU nodes, etc. These policies are not meant to annoy users, but to keep the system useful for as many users as possible. They are not set in stone, though, and your feedback is very welcome so that we can keep improving the system. For more information on why we made these policies, see [here](DHPC-Policies.md).
  2. Is DelftBlue the right machine for me? It depends on what exactly you are trying to do. TU Delft provides a range of different services, including [personal workstations](https://tudelft.topdesk.net/tas/public/ssp/content/detail/service?unid=ea6fcf970be647b5a46d2a85cc1377ec), [physical or virtual machine servers](https://tudelft.topdesk.net/tas/public/ssp/content/detail/service?unid=71ba4c9678e041fd99dad8e7e11dd0e2), [cloud services](https://tudelft.topdesk.net/tas/public/ssp/content/detail/service?unid=339750cbd8b0419d8ad2cba53284480c), [faculty-based local servers](https://hpcwiki.tudelft.nl/index.php/Introduction), and [many others](https://www.tudelft.nl/ict-innovation/articles/how-can-the-tu-delft-support-your-research-computing-needs). If you are looking for an IT solution but are not sure which option suits you best, talk to your [faculty's IT manager](https://intranet.tudelft.nl/en/-/faculty-it-manager)!
  3. I am new to Linux. Help! The basics of working with Linux and remote systems are relatively easy to self-learn with online materials. Start by checking out the relevant Software Carpentry courses: [Linux command line (basics)](https://swcarpentry.github.io/shell-novice/), [Linux command line (more advanced)](https://carpentries-incubator.github.io/shell-extras/), and [Introduction to High-Performance Computing](https://carpentries-incubator.github.io/hpc-intro/). We have also prepared a short self-learning page specific to DelftBlue for novices: [Crash-course on DelftBlue for absolute beginners](https://doc.dhpc.tudelft.nl/delftblue/crash-course). If you prefer a more formal, in-class education, we periodically organize a one-day [Linux Command Line Basics](https://www.tudelft.nl/cse/education/courses/linux-cli-101) course. Check the [TU Delft Institute for Computational Science and Engineering (DCSE)](https://www.tudelft.nl/cse) web page for upcoming events, or [subscribe to the DCSE newsletter](https://www.tudelft.nl/cse/contact/dcse-newsletter).
  4. How do I get access? By default, every employee with a `netid` is able to log in to DelftBlue and run jobs as a guest. However, your jobs will run with relatively low priority, and if the machine is busy, it may take a long time until they get scheduled. To be able to run jobs with higher priority, you can request access to your [faculty's share](Accounting-and-shares.md). To get full access, please use the TU Delft [TOPdesk request form](https://tudelft.topdesk.net/tas/public/ssp/content/detail/service?unid=b7e2b7b46ac94cf688c21761aa324fc1).
  5. Where do I get help? For technical issues, please use the DelftBlue category under "Research support" in the [TU Delft Self-Service Portal](https://tudelft.topdesk.net/tas/public/ssp). For discussions about specific software, compilers, etc., as well as exchanging knowledge with other users, please use the [DHPC Mattermost](https://mattermost.tudelft.nl/dhpc/) forum.
  6. I cannot log in Are you a student? If yes, you have to explicitly request access [here](https://tudelft.topdesk.net/tas/public/ssp/content/detail/service?unid=b7e2b7b46ac94cf688c21761aa324fc1). Are you trying to access the right machine? DelftBlue is TU Delft's largest supercomputer; however, individual faculties and departments might still operate their own, local [servers and clusters](https://hpcwiki.tudelft.nl/index.php/Introduction). For example, the INSY department operates its own machine, known as the [HPC cluster (formerly known as INSY cluster)](https://login.hpc.tudelft.nl/). Make sure you are trying to connect to the right system! Check the correct address of DelftBlue (`login.delftblue.tudelft.nl`), the correct port (`22`), and the correct username (your `netid`). If you are using an SSH config file, check that it has been set up correctly. If you cannot make a connection at all (no prompt for a username or password), check the [Remote access to DelftBlue](Remote-access-to-DelftBlue.md) page for ways to access DelftBlue from the outside world. If you are outside of campus, use EduVPN! If you receive an authentication failure after entering your password, check that you are using the correct `netid` and password; for example, try to log in to the [TU Delft Webmail](https://webmail.tudelft.nl/). Finally, if all the above points are resolved, check the status of DelftBlue on the web page of the [Delft High Performance Computing Centre](https://www.tudelft.nl/dhpc) and make sure the system is up and there are no interruptions or planned maintenance.
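    For reference, a minimal `~/.ssh/config` entry on your own computer could look like the sketch below (the host alias `delftblue` is only an example; replace `<netid>` with your own NetID):
    # ~/.ssh/config on your local machine (illustrative)
    Host delftblue
        HostName login.delftblue.tudelft.nl
        Port 22
        User <netid>

    You can then connect with `ssh delftblue`.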
  7. I can log in, but I don't have a `/home` folder
  8. I can log in, but I cannot submit a job You should be able to submit jobs to the queue using the `sbatch` command (see the [Slurm scheduler page](Slurm-scheduler.md)). If there are any specific errors in your submission script, they should be displayed either in your terminal window or in the `slurm-XXX.out` file. Please note that the OpenOnDemand web interface does not have a working job template at the moment! If you use the web interface, make sure to copy the correct `#SBATCH` commands from one of the [examples on the Wiki](Slurm-scheduler.md)!
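    As a starting point, a minimal submission script could look like the sketch below (the job name, resource values, and partition are only illustrative; take the `#SBATCH` lines that fit your use case from the [Wiki examples](Slurm-scheduler.md)):
    #!/bin/bash
    #SBATCH --job-name=myjob           # illustrative job name
    #SBATCH --time=00:10:00            # wall-clock limit (hh:mm:ss)
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=1
    #SBATCH --mem-per-cpu=1G           # see the out-of-memory question below
    #SBATCH --partition=compute        # assumed partition name; check the Wiki examples for the right one
    module load 2023r1
    srun ./my_program                  # hypothetical executable

    Save it as, for example, `myjob.sh`, submit it with `sbatch myjob.sh`, and monitor it with `squeue --me`.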
  9. Why does my job not get scheduled? Every user with a `netid` is able to log in to DelftBlue. However, guest jobs run with very low priority, and if the machine is busy, they may not be scheduled at all. You can request full access via [this TOPdesk form](https://tudelft.topdesk.net/tas/public/ssp/content/detail/service?unid=b7e2b7b46ac94cf688c21761aa324fc1). If you are a registered user and your jobs still do not get scheduled within a day or so, the machine may either be too busy, or your [faculty's share](Accounting-and-shares.md) of computing time may be depleted for now, in which case all jobs from the faculty get a lower priority.
  10. Can I log in to the node(s) on which my job is running? Sometimes you may want to check how your job is running using commands like `top`, `nvidia-smi`, etc. In order to get a console "inside your slurm job", first find the job's ID using
    squeue --me
    
    and then start an interactive bash session as follows (here for 30 minutes):
    srun --jobid=<ID> --overlap --pty -t 00:30:00 bash
    
    See the [SLURM page](Slurm-scheduler.md#accessing-allocated-nodes) for details on interactive jobs.
  11. I cannot find and/or load software modules We provide two versions of the software stack; you should typically pick the latest one (`2023r1` in this example):
    [<netid>@login03 ~]$ module avail
    
    ------------------------------------- /apps/noarch/modulefiles -------------------------------------
       2022r2
       2023r1
    
    To make the contents of the software stack 2023r1 available, use:
    module load 2023r1
    
    This system is based on [lmod](https://lmod.readthedocs.io/en/latest/). The module organisation is hierarchical. This means that the modules you see depend on the ones you have loaded: for example, if you don't load `openmpi`, you don't see `hdf5`, etc. To find modules in the hierarchy, you can use the `module spider` command, as in the sketch below. Check out the [DHPC modules](DHPC-modules.md#delftblue-module-system-beta-phase) page for more info.
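    For example, to locate and load a module that sits deeper in the hierarchy (the module names and dependency chain here are only illustrative):
    module load 2023r1
    module spider hdf5        # shows which modules (e.g. a compiler and openmpi) must be loaded first
    module load openmpi
    module avail hdf5         # hdf5 should now be visible and loadable
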
  12. After loading certain modules, the `nano` (editor) command does not work anymore This is caused by incompatible libraries in the search path. You can avoid it by putting the following line in your `.bashrc` file and running `exec bash` to refresh the active session:
    alias nano='LD_PRELOAD=/lib64/libz.so.1:/lib64/libncursesw.so.6:/lib64/libtinfo.so.6 /usr/bin/nano'
    
  13. I compiled my code, but it only runs on some nodes Please be aware that `compute` nodes use Intel CPUs, while `GPU` nodes use AMD CPUs (see the [DHPC hardware page](DHPC-hardware.md#compute-nodes) for details). Depending on your application, you might need to compile two different versions of your program to run it on the respective node types. Please also note that the available software modules for `compute` and `GPU` nodes might differ.
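    One way to handle this (a sketch; the compiler, flags, and file names are only illustrative) is to either build one portable binary without CPU-specific instructions, or build one optimized binary per node type, each compiled on the node type it will run on:
    # Option A: one portable binary (no CPU-specific instruction sets)
    gcc -O2 -o my_program my_program.c
    # Option B: one optimized binary per node type, each built on that node type
    gcc -O2 -march=native -o my_program.compute my_program.c   # build and run on a compute node
    gcc -O2 -march=native -o my_program.gpu     my_program.c   # build and run on a GPU node
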
  14. I have difficulties with compiling a specific software package Many software packages are already available on DelftBlue as [software modules](DHPC-modules.md). Instructions for certain frequently used packages can be found in the [Howto Guides](howtos.md).
  15. Intel compiler takes forever or does not work at all TU Delft's license for the Intel compiler suite allows 5 users to compile their software simultaneously! While not a problem during normal use, this could cause some disruption during the beta phase. If your `ifort` compilation takes longer than expected, or does not start at all, try again a little later. If it still does not work, contact us! [Click here for more information.](howtos/Intel-compilers.md)
  16. Slurm fails with out-of-memory (OOM) error You might encounter the following error when submitting jobs via `slurm`:
    slurmstepd: error: Detected 2 oom-kill event(s) in StepId=1170.0. Some of your processes may have been killed by the cgroup out-of-memory handler.
    
    You need to set the `--mem-per-cpu` value in the submission script. This value is the amount of memory in MB that `slurm` allocates per allocated CPU; it defaults to just 1 MB. If your job's memory use exceeds this, the job gets killed with an OOM error message. Set this value to a reasonable amount, i.e. the expected memory use per CPU plus a little headroom. Example: add the following line to the submission script:
    #SBATCH --mem-per-cpu=1G
    
    This allocates 1 GB per CPU. [Click here for more information.](Slurm-scheduler.md)
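    To choose a sensible value, you can check how much memory a finished job actually used, for example with the standard Slurm accounting command `sacct` (the job ID is a placeholder):
    sacct -j <jobid> --format=JobID,MaxRSS,ReqMem,State
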
  17. My Intel-compiled code runs normally on the login node, but is super slow upon submission to the queue For `Intel MPI` to operate correctly, it must be configured to work together with `Slurm`; otherwise Intel MPI may bind all requested threads to the same CPU. To configure Intel MPI to work with Slurm, set it to use the `Slurm PMI` interface and use `srun` instead of `mpirun`. Use the following in your submission script:
    export I_MPI_PMI_LIBRARY=/cm/shared/apps/slurm/current/lib64/libpmi2.so
    
    Then invoke your binary with `srun` (so **not** with `mpirun`!)
    srun binary > output
    
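    Putting these together, the relevant part of a submission script could look like this sketch (the module name and task count are illustrative; load whichever Intel compiler and MPI modules you actually use):
    #!/bin/bash
    #SBATCH --ntasks=48
    #SBATCH --time=01:00:00
    module load 2023r1                 # plus your Intel compiler and MPI modules
    export I_MPI_PMI_LIBRARY=/cm/shared/apps/slurm/current/lib64/libpmi2.so
    srun ./binary > output
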
  18. I cannot access my files on the TU Delft home, bulk, or umbrella drives On the login and transfer nodes, these drives are mounted under `/tudelft.net`. On all other nodes, they are not directly accessible. If you get the error message `permission denied` when accessing your directories under `/tudelft.net` on the login or transfer nodes, this is likely caused by an expired Kerberos ticket. You can refresh your Kerberos ticket by typing `kinit` and entering your password when prompted. Details can be found [here](Data-transfer-to-DelftBlue.md).
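    A minimal check and refresh looks like this (both are standard Kerberos commands):
    klist    # show your current ticket and its expiry time
    kinit    # request a new ticket; enter your TU Delft password when prompted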