
I am trying to run a benchmark on some family of algorithms.

I have multiple algorithms, each of them with one hyperparameter, and I want to test them with multiple data sizes. Each run takes ~60 seconds, but there is a high cardinality, hence the need for a cluster.

At the moment, I am submitting one job with one task for each benchmark run, but I don't know if that is a good practice. The number of runs is way higher than the number of jobs I can have currently on the queue.

Perhaps I should submit multiple "runs" in one job even if they have different hyperparameters? Should I do it then as multiple tasks in that job?

1 Answer

60 seconds per run is very short-lived for a job; you should probably "pack" benchmark runs together in a single submission, for instance one job per algorithm, with a submission script like this (4 CPUs used for each benchmark):

#!/bin/bash

#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=...
#SBATCH ...

module load ...

algorithm="thealgorithm"
hyperparametersvalues=(0.1 1 10 1000)
files=(data/*)

for hyper in "${hyperparametersvalues[@]}"
do
  for file in "${files[@]}"; do
      ./benchmark_script $algorithm --hyperparameter=$hyper $file
  done
done

If you have access to GNU parallel, you can rewrite it like this, which makes it easy to run benchmarks for the same algorithm in parallel (on a single node):

#!/bin/bash

#SBATCH --ntasks=10
#SBATCH --cpus-per-task=4
#SBATCH --nodes=1-1
#SBATCH --mem-per-cpu=...
#SBATCH ...

module load ...

algorithm="thealgorithm"
hyperparametersvalues=(0.1 1 10 1000)
files=(data/*)

parallel -P $SLURM_NTASKS ./benchmark_script $algorithm --hyperparameter={1} {2} ::: "${hyperparametersvalues[@]}" ::: "${files[@]}"

If you do not have parallel, you can achieve the same with & and wait in the loop.
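
Here is a rough sketch of that approach, reusing the same arrays and benchmark_script as above; the throttling with jobs -rp is my own addition to keep at most $SLURM_NTASKS runs in flight at once:

#!/bin/bash

#SBATCH --ntasks=10
#SBATCH --cpus-per-task=4
#SBATCH --nodes=1-1
#SBATCH --mem-per-cpu=...
#SBATCH ...

module load ...

algorithm="thealgorithm"
hyperparametersvalues=(0.1 1 10 1000)
files=(data/*)

for hyper in "${hyperparametersvalues[@]}"; do
  for file in "${files[@]}"; do
      # launch each benchmark in the background
      ./benchmark_script $algorithm --hyperparameter=$hyper $file &
      # do not start more than $SLURM_NTASKS benchmarks at once
      while [ "$(jobs -rp | wc -l)" -ge "$SLURM_NTASKS" ]; do
          sleep 1
      done
  done
done

# block until the remaining background runs have finished
wait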

You can also use multiple nodes and drop the --nodes=1-1 constraint by inserting srun --exact ... in the command passed to parallel.
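
For instance, a multi-node variant could look like the sketch below; the exact srun flags (-N1 -n1 -c "$SLURM_CPUS_PER_TASK") are an assumption and may need adjusting to your cluster:

#!/bin/bash

#SBATCH --ntasks=10
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=...
#SBATCH ...

module load ...

algorithm="thealgorithm"
hyperparametersvalues=(0.1 1 10 1000)
files=(data/*)

# each parallel slot starts one job step; --exact restricts the step to
# the resources it requests so the steps can run concurrently
parallel -P $SLURM_NTASKS \
    srun --exact -N1 -n1 -c "$SLURM_CPUS_PER_TASK" \
    ./benchmark_script $algorithm --hyperparameter={1} {2} \
    ::: "${hyperparametersvalues[@]}" ::: "${files[@]}"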

You can also create a job array with the algorithm as parameter:

#!/bin/bash

#SBATCH --ntasks=10
#SBATCH --cpus-per-task=4
#SBATCH --nodes=1-1
#SBATCH --mem-per-cpu=...
#SBATCH ...
#SBATCH --array=0-2

module load ...

algorithms=(thealgorithm thesecondalgorithm thethirdalgorithm)
algorithm=${algorithms[$SLURM_ARRAY_TASK_ID]}
hyperparametersvalues=(0.1 1 10 1000)
files=(data/*)

parallel -P $SLURM_NTASKS ./benchmark_script $algorithm --hyperparameter={1} {2} ::: "${hyperparametersvalues[@]}" ::: "${files[@]}"

2 Comments

Thank you! I think this is what I was looking for. I tried a job array before, but each array task counts as a separate job toward the "jobs in the queue" limit.
You are welcome. Feel free to accept my answer then so others know this question is answered.
