What is Slurm? 

Many research software applications these days consume large amounts of computing resources: CPU, GPU, or RAM. At SCI, there are a number of high-performance machines that provide substantial resources. We use a job scheduling infrastructure, based on Slurm, to manage access to those resources so that they can be used consistently and efficiently.

The general approach when using Slurm is that instead of manually starting processes that consume large amounts of resources, you submit a “job” to the scheduling infrastructure (Slurm), along with a specification of the resources needed to run the job (e.g., GPU, CPU, RAM). That infrastructure is then responsible for finding appropriate computer resources as soon as they become available and running the job on your behalf. This prevents computers from becoming overloaded and bogged down and eliminates the need to hunt around for free resources.

How does it work? 

While there are several computers and services that make up the Slurm infrastructure, there are really only two components that you need to be aware of when submitting jobs through Slurm: 

User Node – Instead of logging directly into individual computers to run jobs, you work from one central machine, which at SCI is compute.sci.utah.edu, and the job scheduler is orchestrated from there. When logged into this machine (via SSH), you have access to all of the Slurm commands, such as sinfo, sbatch, etc. This is where you go to do your work and manage your jobs. 

Compute Server – The compute servers are the machines that have the capacity to run jobs, i.e. large amounts of CPU/RAM, GPUs, etc.; these are the nodes that actually do all of the computing. As a user, you should be less concerned about which specific machine you want to run on and instead think about what resources you need to get your work done; Slurm figures out the best resources to use given the stated needs and allocates/reserves those resources for your job and your job alone. When those resources are available, your job is run. 

By orchestrating all of the jobs across all of the available resources, Slurm is able to make efficient use of the available compute power, particularly when dealing with long-running jobs that require a lot of capacity. 

How do I use it?

Any powerful tool requires a bit of learning, understanding, and familiarity to use well—and Slurm is no exception. There are many resources available online, but a basic quick-start guide is given here.

Overview

Once you SSH into compute, which is just a normal interactive Linux computer, you’ll have access to a number of commands that allow you to start and stop jobs, monitor running jobs and resource usage, and more. When a scheduled job completes, your program’s output will be logged to a file.
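
For example, logging in looks something like this (replace <username> with your own SCI username):

ssh <username>@compute.sci.utah.edu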

A few basic pieces of Slurm terminology: 

Node – an individual computer that provides some compute resources such as CPU, GPU and/or RAM
Partition – a group of nodes with similar sets of resources across which Slurm can schedule jobs
Job – a submitted task or set of tasks given to Slurm to be scheduled and run whenever resources are available; job steps are discrete (and possibly parallelized) tasks that make up a job
Queue – the list of jobs waiting to be scheduled

Basic Commands

There are many commands and options for using Slurm, which can feel overwhelming at first. However, the basics are relatively straightforward.

One of the first things you may want to do after logging into compute.sci.utah.edu is to see a list of available resources. This is done with the sinfo command:

Output of sinfo command

As you can see here, a number of partitions are displayed; each is made up of several nodes, and each node has its own current state (e.g. idle, fully or partially allocated, or down/unavailable). Note that you will only see partitions that you have access to, so your results may vary from what is shown here. You can get detailed information about a node by running scontrol show node <nodename>.
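
If you want to dig a little deeper, a few standard variations of these commands are (the node name is a placeholder):

sinfo                          # summary of partitions and the state of their nodes
sinfo -N -l                    # one line per node, with CPU, memory, and state details
scontrol show node <nodename>  # full details for a single node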

To see the list of currently running jobs, use the squeue command:

Here, you can see that there are two jobs currently running—one on the node chimera and one on the node pegasus.
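
A few common ways to invoke squeue (these are standard options):

squeue             # all jobs currently queued or running
squeue -u $USER    # only your own jobs
squeue -j <jobid>  # a specific job, by job ID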

To run a job, use the sbatch command. You’ll need to write a file that is essentially a bash script with some special #SBATCH directives that tell Slurm how to run your job. While Slurm has many features, here is a simple example script that:

  • Prints the computer hostname and start time

  • Waits for 5 seconds

  • Prints the end time

  • Requests 1 GB of RAM and one GPU

#!/bin/bash
# Job name and task layout
#SBATCH --job-name=chadtest
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
# Resources: one GPU, a 5-minute time limit, and 1 GB of RAM
#SBATCH --gres=gpu:1
#SBATCH --time=5
#SBATCH --mem=1G
# Where to run: pick a partition (and, optionally, a specific node) you have access to
#SBATCH --partition=titanrtx-24
#SBATCH --nodelist=atlas
# Log file for the job's output; %J is replaced by the job ID
#SBATCH --output=log_%J.txt
#SBATCH --mail-user=
# Each srun below launches a job step within the allocated resources
srun echo "Running on:" $( hostname )
srun echo "Time start:" $( date )
srun sleep 5
srun echo "Time end  :" $( date )

 

And here is what running the job looks like:

As you can see, Slurm ran the job on the requested partition and node, as verified by the output of the hostname command. Note how the --output directive logs the output to a file named log_%J.txt (%J is replaced by the Job ID assigned by Slurm).
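
Submitting the script and checking on it looks something like this (the script filename is just an example name):

sbatch chadtest.sh   # Slurm responds with "Submitted batch job <jobid>"
squeue -u $USER      # confirm the job is pending or running
cat log_<jobid>.txt  # view the output once the job has finished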

What resources do I have access to?

The compute resources available at SCI change over time, as SCI IT or different research groups purchase or upgrade equipment. However, there are two partitions that are open to all SCI users: 

Wormulon – These machines are meant for learning and testing Slurm only. While they have general CPU/RAM/GPU resources, they are not very powerful and are not intended for anything computationally intensive; they are just a playground to try things out, debug, etc., where resource contention should be limited. As such, the limits on how long jobs are allowed to run are low. 

Spartacus – This collection of compute nodes is provided for general SCI use, so any user may use these nodes at any time. 

There are other collections of compute resources at SCI that may be dedicated to a specific purpose or research group, and are therefore not open to all SCI users; please consult with your research group to learn what other resources may be available to you and how to get access to them. 

How do I interact with my running job?

When running commands from a command line, you have direct access to the running process: its output is written to your terminal, you can pause or terminate it, and so on. Since Slurm sits as an orchestration layer between the user (i.e. you) and the compute resources, the way you interact with running jobs is a bit different. The STDOUT of your processes gets written to a log file (see the --output option for sbatch), and you use Slurm commands instead, such as squeue, which shows all currently running jobs, or scancel, which stops a job.
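
For example, to stop a job or watch its output while it runs (the job ID and log filename are placeholders):

scancel <jobid>          # cancel a queued or running job
tail -f log_<jobid>.txt  # follow the job's log file as it is written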

Another difference is that direct login access (i.e. via ssh) to compute nodes is restricted; it is important that the resources Slurm is configured to use remain free to be scheduled and assigned to jobs, so to ensure that conflicts don’t occur, users are not allowed to run anything computationally intensive directly on compute nodes.

There are, however, a few other ways to interact with your jobs that you may find helpful, giving you the same (or at least similar) kinds of interaction with running processes.

Connecting to a running job

Once a job has been scheduled and is happily chugging along on its assigned resources, you can open up an interactive shell on the machine that the job is running on by using srun with the --overlap argument, which fires up a new job step that uses the same resources as the running job. For example:

Here you can see that I fired up a long-running job using sbatch, and then started a bash shell on the first (or head) node of the job. If you want to connect to a specific node your job is running on, specify its hostname using the -w argument (and notice the use of sacct to display the hosts my job is running on).
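
A sketch of what those commands might look like (the script name, job ID, and node name are placeholders):

sbatch long_job.sh                                       # submit a long-running job
squeue -u $USER                                          # note its job ID
srun --jobid=<jobid> --overlap --pty bash                # shell sharing the job's allocation
srun --jobid=<jobid> --overlap -w <nodename> --pty bash  # shell on a specific node of the job
sacct -j <jobid> --format=JobID,JobName,NodeList         # show which nodes the job is using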

Interactive jobs

Another way to have direct access to your running code is to create an interactive job, which is basically a job where the running process is a shell. To do so, you still go through the Slurm system, so that the appropriate resources are allocated and dedicated just to your use, but you have a shell to work from, like you would on a non-Slurm machine. For example:

Here you can see that I started up a job using srun for a reserved time of 5 minutes, and the --pty bash argument says to run bash and hook its terminal up to the one I am currently in; the result is that I am given a shell (bash) that I can use to do whatever I like. From that shell you can start processes by hand, call them using srun to parallelize work across your allocated nodes, and so on. This may be particularly helpful for shorter runs to debug your code; longer-running jobs may be easier to deal with using tmux or screen.
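
A minimal sketch of requesting an interactive session (the partition is just an example; pick one you have access to):

srun --partition=wormulon-cpu --time=5 --ntasks=1 --mem=1G --pty bash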

Additional Resources

  • Official Slurm User Documentation, especially the Quick Start User Guide

  • CHPC’s Slurm Documentation, including YouTube videos on Slurm Basics and Slurm Batch Scripting

  • CHPC also offers hands-on training for Slurm scripting. While the SCI and CHPC environments differ, the material is transferable between them.