
I have to schedule jobs on a very busy GPU cluster. I don't really care about nodes, only about GPUs. The way my code is structured, each task uses a single GPU, and the tasks then communicate with each other to make use of multiple GPUs. The way we generally schedule something like this is gpus_per_task=1, ntasks_per_node=8, nodes=<number of GPUs you want / 8>, since each node has 8 GPUs.
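For reference, the kind of batch script I mean looks roughly like this (job name, partition, and the launcher script are just placeholders):

    #!/bin/bash
    #SBATCH --job-name=train          # placeholder name
    #SBATCH --partition=gpu           # placeholder partition
    #SBATCH --nodes=4                 # 32 GPUs / 8 GPUs per node
    #SBATCH --ntasks-per-node=8       # one task per GPU on each node
    #SBATCH --gpus-per-task=1         # each task drives a single GPU

    srun ./my_job.sh                  # placeholder launcher script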

Since not everyone needs 8 GPUs, there are often nodes with a few (<8) GPUs lying idle, which my parameters can't make use of. Since I don't care about nodes, is there a way to tell Slurm I want 32 tasks and I don't care how many nodes it uses to run them?

For example, it could give me 2 tasks on a machine that has 2 GPUs left and split the remaining 30 across completely free nodes, or any other feasible arrangement that makes better use of the cluster.

I know there's an ntasks parameter which may do this, but the documentation is kind of confusing about it. It states:

The default is one task per node, but note that the --cpus-per-task option will change this default.

What does cpus_per_task have to do with this?

I also saw

If used with the --ntasks option, the --ntasks option will take precedence and the --ntasks-per-node will be treated as a maximum count of tasks per node

but I'm also confused about this interaction. Does this mean that if I ask for --ntasks=32 --ntasks-per-node=8 it will put at most 8 tasks on a single machine, but could put fewer if it decides to? (That is basically what I want.)
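In other words, would a combination like the following give me 32 tasks with at most 8 per node? (This is just my reading of the docs, not something I've verified.)

    #SBATCH --ntasks=32            # total tasks, spread over however many nodes
    #SBATCH --ntasks-per-node=8    # with --ntasks set, this acts as an upper bound per node
    #SBATCH --gpus-per-task=1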

1 Answer

Try --gpus-per-task=1 and --ntasks=32, with no tasks per node or number of nodes specified. This allows Slurm to distribute the tasks across the nodes however it wants and to use leftover GPUs on nodes that are not fully utilized. It also won't place more than 8 tasks on a single node, as there are no more than 8 GPUs available per node.
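As a minimal sketch (job name, partition, and the launcher script are placeholders, adjust to your setup):

    #!/bin/bash
    #SBATCH --job-name=train        # placeholder name
    #SBATCH --partition=gpu         # placeholder partition
    #SBATCH --ntasks=32             # total number of tasks; nodes left unspecified
    #SBATCH --gpus-per-task=1       # one GPU per task

    srun ./my_job.sh                # placeholder launcher script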

Regarding ntasks vs cpus-per-task: this should not matter in your case. By default a task gets one CPU. If you use --cpus-per-task=x, it is guaranteed that the x CPUs are on one node. That is not the case if you just use --ntasks, where the tasks are spread however Slurm decides. There is an example of this in the documentation.
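For illustration (the numbers are made up), the per-task CPU grouping looks like this:

    # each of the 32 tasks gets 4 CPUs, and those 4 CPUs are guaranteed
    # to sit on the same node as the task itself
    #SBATCH --ntasks=32
    #SBATCH --cpus-per-task=4
    #SBATCH --gpus-per-task=1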

Caveat: this requires a Slurm version >= 19.05, as the --gpus-per-* options were added in that release.


2 Comments

I do use cpus_per_gpu; do you know if that will mess anything up?
It shouldn't, as long as there are enough free CPUs on a machine. If, however, a node has 8 free CPUs and 4 free GPUs and you use cpus_per_gpu=4, only two tasks will be started there, so you'll only get 2 of its GPUs. Depending on the Slurm configuration you may even get fewer, as certain cores can be bound to certain GPUs, but I'm not quite sure about that.
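To illustrate with made-up numbers: with a request like the one below, a node that has 4 free GPUs but only 8 free CPUs can host at most 2 of the tasks, since each task claims 4 CPUs alongside its GPU.

    #SBATCH --ntasks=32
    #SBATCH --gpus-per-task=1
    #SBATCH --cpus-per-gpu=4    # each allocated GPU must come with 4 CPUs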
