Advice
0 votes
3 replies
41 views

I am using an HPC system in which the folder /usr/ is not shared via NFS. Therefore, the libraries installed on the master node do not seem to be available on the compute nodes; that is, if I ssh to a compute node ...
mancolric's user avatar
  • 111
1 vote
1 answer
117 views

I'm building a SLURM pipeline where each stage is a bash wrapper script that generates and submits SLURM jobs. Currently I'm doing complex job ID extraction which feels clunky: # Current approach ...
desert_ranger's user avatar
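
A minimal sketch of how the job ID extraction mentioned above could be simplified, not the asker's code (which is truncated in the excerpt). It assumes the default sbatch message `Submitted batch job <id>` and the `--parsable` form `<id>` or `<id>;<cluster>`; the helper name `parse_job_id` is hypothetical.

```python
import re


def parse_job_id(sbatch_output: str) -> str:
    """Return the job ID found in sbatch output (hypothetical helper)."""
    text = sbatch_output.strip()

    # `sbatch --parsable` prints "12345" or "12345;clustername".
    parsable = re.fullmatch(r"(\d+)(?:;\S+)?", text)
    if parsable:
        return parsable.group(1)

    # Default sbatch output: "Submitted batch job 12345".
    default = re.search(r"Submitted batch job (\d+)", text)
    if default:
        return default.group(1)

    raise ValueError(f"no job ID found in: {text!r}")
```

A wrapper stage could then hand the returned ID to the next submission, for example through `--dependency=afterok:<id>`.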
1 vote
0 answers
45 views

I'm trying to run the Neo4j Docker container using Singularity on an HPC system. The container starts successfully, but it shuts down automatically when I try to add data to the database (e.g., via ...
prasad's user avatar
  • 13
1 vote
1 answer
64 views

I have an mpi4py program, which runs well with mpiexec -np 30 python3 -O myscript.py at 100% CPU usage on each of the 30 CPUs. Now I am launching 8 instances with mpiexec -np 16 python3 -O myscript.py. ...
j13r's user avatar
  • 2,741
1 vote
0 answers
104 views

I’ve been using salloc to allocate compute nodes without issues before. Recently, after switching to another user account (same .bashrc config, only the conda path changed), salloc stopped working. I ...
Calculus007's user avatar
0 votes
0 answers
50 views

I need to migrate my work for geospatial processing (using mainly QGIS processing and PostGIS functions from Python scripts) to an HPC cluster. As neither QGIS nor PostGIS is installed on the HPC I ...
Felix_geospatial's user avatar
0 votes
1 answer
196 views

I'm using Spack on Linux Mint to manage scientific libraries, including Armadillo. I have installed Armadillo and its dependencies via Spack in an environment. Problem: When I run spack load armadillo, ...
jorge isaac rubiano's user avatar
0 votes
0 answers
58 views

I tried to use the sbatch file from this link (Running WindNinja on an HPC Cluster) to run the WindNinja software (WindNinja introduction) installed on HPC. However, it always produces the "...
Kaiyuan Zheng's user avatar
0 votes
0 answers
84 views

When users request 1-2 GPUs via sbatch --gres=gpu:1, Slurm locks the entire 8-GPU node. This fragments our cluster: Multiple small requests spread across nodes (e.g., four 1-GPU jobs occupy four ...
train-server's user avatar
0 votes
1 answer
60 views

I program in Fortran with the Intel oneAPI compiler ifx and the MKL packages. I want to calculate the scalar product between a sparse matrix of massive dimension and a vector. When the dimension of the sparse matrix could be ...
River Chandler's user avatar
0 votes
1 answer
84 views

I love snakemake and have used it locally as well as on HPC with SLURM! However, now we have a particular setup where it is not as easy to use snakemake as we have done before: We need to run some ...
Sebastian Beyer's user avatar
0 votes
0 answers
54 views

I'm learning UCX by creating a basic wrapper for both the client and server. I am using AM communication. When I run my client, I get the error below: [1749297901.816001] [prateek:19822:0] ...
Prateek Joshi's user avatar
0 votes
0 answers
95 views

I'm trying to read different subsets of non-contiguous data from a file to different processes. I.e., I have a file with the data: a b c d e f g h i j and two processes that want to read the data from ...
Subject303's user avatar
1 vote
2 answers
100 views

I'm setting up IO for a large-scale CFD code using the MPI library, and the file IO is starting to eat into computation time as my problems scale. As far as I can find the "done" thing in the ...
Subject303's user avatar
0 votes
0 answers
52 views

I have a single computation node with 32 CPUs. I have defined two different partitions that both use this node. If I for example send two jobs on partition A requesting 20 CPUs and 25 CPUs, the second ...
Daniel's user avatar
  • 1
0 votes
1 answer
76 views

I want to run a pipeline on a cluster where the names of the jobs are of the form: smk-{config["simulation"]}-{rule}-{wildcards}. Can I just do: snakemake --profile slurm --configfile ...
Kiffikiffe's user avatar
1 vote
1 answer
104 views

When running snakemake on a cluster, if we don't have specific requirements for some rules about the number of cores/memory, what is the difference between: Using the classic way, i.e. calling ...
Kiffikiffe's user avatar
0 votes
1 answer
97 views

I have a 12-core laptop (6 physical cores with hyperthreading) running Slurm for local job scheduling. When I submit job arrays requesting all 12 cores to be used simultaneously, Slurm consistently ...
desert_ranger's user avatar
1 vote
0 answers
88 views

I want to automate resource allocation on an HPC server, node forwarding, and opening JupyterLab on the same node. Individually I have to go through the following steps: user@login1>salloc -A ...
Ep1c1aN's user avatar
  • 753
0 votes
0 answers
22 views

I got an error when trying to bind the program with Intel MPI. #define _GNU_SOURCE #include <stdio.h> #include <unistd.h> #include <string.h> #include <sched.h> #include <...
user26958921's user avatar
0 votes
0 answers
86 views

I'm trying to make R scripts run on an HPC cluster (with the SLURM workload manager). They need a specific package that I installed in a personal directory since I can't install packages in the server-...
legabgob's user avatar
0 votes
1 answer
41 views

I want to call genetic variants with DeepVariant on an HPC for about 1000 cereal lines. I successfully ran DV for one line with the docker image they provide using Apptainer/Singularity, but for the ...
skranz's user avatar
  • 65
6 votes
3 answers
260 views

I want to use the inclusive scan operation in OpenMP to implement an algorithm. What follows is a description of my attempt at doing so, and failing to get more than a tepid speedup. The inclusive ...
smilingbuddha's user avatar
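
For readers unfamiliar with the operation, here is a rough serial sketch of the blocked (three-phase) approach commonly used to parallelize an inclusive scan. It is not the asker's OpenMP code, which is truncated in the excerpt; the function name `blocked_inclusive_scan` and the `num_blocks` parameter are hypothetical, and the elements are assumed to be numbers combined by addition.

```python
from itertools import accumulate


def blocked_inclusive_scan(values, num_blocks):
    """Inclusive prefix sum computed block by block (illustrative sketch)."""
    n = len(values)
    if n == 0:
        return []

    # Clamp the block count and split the input into contiguous blocks.
    num_blocks = max(1, min(num_blocks, n))
    size = -(-n // num_blocks)  # ceiling division
    blocks = [values[i:i + size] for i in range(0, n, size)]

    # Phase 1: scan each block independently (one block per thread
    # in a parallel version).
    local = [list(accumulate(block)) for block in blocks]

    # Phase 2: exclusive scan over the block totals gives each block's offset.
    offsets = [0]
    for part in local[:-1]:
        offsets.append(offsets[-1] + part[-1])

    # Phase 3: add each block's offset to its local results (also
    # independent per block).
    return [x + off for part, off in zip(local, offsets) for x in part]
```

Phases 1 and 3 each touch every element once, so this scheme performs roughly twice the additions of a plain serial scan.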
0 votes
0 answers
148 views

I have Ollama version 0.5.13 installed on my university's HPC cluster. Because of lack of sudo access, I have a custom script that runs ollama for me. I am reproducing it below: # Set the custom ...
Ryan Hendricks's user avatar
0 votes
0 answers
47 views

I'm creating a complete HPC architecture on AWS using the AWS PCS service. In my CloudFormation template literally all resource creation is successful except AWS PCS. Cluster: Type: AWS::PCS::Cluster ...
parthraj panchal's user avatar
0 votes
0 answers
76 views

I have a large .h5 file of high resolution images (~300MB each, 200 images per .h5 file) and need to load samples in python. The current setup uses a separate dataset for each sample. data_group....
gekrone's user avatar
  • 179
0 votes
1 answer
156 views

I'm trying to implement parallelization in a flow solver code for my PhD. I've inherited a subroutine that is sending data between predefined subdomains. The subroutine is sending data through the ...
Subject303's user avatar
0 votes
0 answers
95 views

Hi, I'm trying to compile and run a .f90 code using the Intel Fortran compiler (ifx) and the Intel MPI library on a Linux HPC. I'm invoking the compiler through a .sh script with the following lines: ...
Subject303's user avatar
0 votes
0 answers
73 views

I am trying to solve a nonlinear optimization problem in AMPL. It is quite large but not ridiculously so. I solved a similar problem on my home PC (about 1 order of magnitude less in size though). I ...
apg's user avatar
  • 101
0 votes
0 answers
52 views

I have some software (AMPL) installed on my home folder on a Grid Engine based HPC cluster at a university. I'm looking just to source AMPL properly when I run my jobscript in the queue. I need to run ...
apg's user avatar
  • 101
0 votes
0 answers
45 views

I'm brand new to Linux / Slurm / HPC so apologies if this seems trivial. I have access to a node of an HPC, consisting of 4 GPUs. I have a job that runs out of memory when running on a single GPU, so ...
Paul's user avatar
  • 41
0 votes
1 answer
56 views

Backstory: We are submitting an HPC job using the Microsoft HPC Pack 2019 SP3 SDK. HPC doesn't natively support Active Directory gMSA accounts, so we obtain the gMSA account password via AD. The MSA ...
Jon Barker's user avatar
  • 1,838
0 votes
0 answers
21 views

I have conducted experiments running the MLP (Multi-Layer Perceptron) algorithm on a PC cluster with Apache Spark, with configurations ranging from small data to large ...
Syahel Razaba's user avatar
0 votes
0 answers
67 views

Without an IDE, I can log in to an HPC interactive node by first sshing in to the server using: ssh servername Then I request an interactive node using qrsh # Sun Grid Engine # OR qsub -I # Slurm ...
David LeBauer's user avatar
0 votes
1 answer
128 views

Question I am trying to develop a clear mental model for using SLURM to request resources on HPC systems for hybrid MPI/OpenMP jobs. In thinking about it more, I realized there are some gaps in my ...
Jared's user avatar
  • 714
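
As a small aid to the mental model asked about above, the sketch below does the core-count arithmetic for a hybrid request. It assumes the job is described with Slurm's `--nodes`, `--ntasks-per-node`, and `--cpus-per-task` options, that one MPI rank runs per task, and that each rank uses as many OpenMP threads as CPUs per task; the function name `hybrid_layout` is hypothetical.

```python
def hybrid_layout(nodes: int, ntasks_per_node: int, cpus_per_task: int) -> dict:
    """Summarize a hybrid MPI/OpenMP request (illustrative arithmetic only)."""
    mpi_ranks = nodes * ntasks_per_node
    return {
        # One MPI rank per Slurm task (assumption).
        "mpi_ranks": mpi_ranks,
        # Assumes OMP_NUM_THREADS is set equal to --cpus-per-task.
        "omp_threads_per_rank": cpus_per_task,
        "cores_per_node": ntasks_per_node * cpus_per_task,
        "total_cores": mpi_ranks * cpus_per_task,
    }
```

For instance, 2 nodes with 4 tasks per node and 8 CPUs per task corresponds to 8 ranks of 8 threads each, using 32 cores on each node.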
0 votes
0 answers
52 views

I am attempting to implement a method in MPI for a well established particle simulation program that involves image processing. The program runs a loop for millions of iterations that performs a ...
William Betancourt's user avatar
0 votes
1 answer
84 views

Background Let's say I have a complex MPI program with multiple message passing events and computations. The communication pattern is that of bidirectional ring messaging as shown in the figure below. ...
Nitin Malapally's user avatar
1 vote
0 answers
78 views

The following code example simply calls MPI_Barrier in a loop. On a 2 computer cluster of Intel machines, it runs correctly. When run from an Intel machine, with an AMD machine, it completes the first ...
Jeffrey Faust's user avatar
3 votes
1 answer
330 views

I have a Python script that processes approximately 10,000 FITS files one by one. For each file, the script generates an output in the same directory as the input files and creates a single CSV file ...
Falco Peregrinus's user avatar
1 vote
2 answers
145 views

I have recently been learning some HPC topics and learned that modern C/C++ compilers are able to detect places where optimization is applicable and carry it out using corresponding techniques such as SIMD, ...
PkDrew's user avatar
  • 2,301
1 vote
0 answers
99 views

I am facing issues with getting a free port in the DDP setup block of PyTorch for parallelizing my deep learning training job across multiple GPUs on a Linux HPC cluster. I am trying to submit a deep ...
Shataneek Banerjee's user avatar
1 vote
0 answers
72 views

I'm trying to use Hypre to solve a system of linear equations: #include <stdio.h> #include <stdlib.h> #include <string.h> #include <math.h> #include "HYPRE_krylov.h" #...
Huy Hoàng Nguyễn's user avatar
1 vote
0 answers
36 views

Half of the jobs I submit to my HPC return the following error message in the out file, which ends my job: /sw/rl8/zen/app/NetLogo/6.4.0-64/netlogo-headless.sh: line 34: 111089 Killed "$JAVA" &...
Bart de Bruin's user avatar
7 votes
3 answers
241 views

I am new to OpenMP programming and I have a question regarding task parallelism in recursion. Let's consider this demo C code: #include <stdio.h> #include <stdlib.h> #include <sys/time....
hpc_beginner's user avatar
0 votes
1 answer
556 views

Very simple question. I have access to a multi-node machine and I have to do some NCCL tests. In the readme it says If CUDA is not installed in /usr/local/cuda, you may specify CUDA_HOME. Similarly, ...
KansaiRobot's user avatar
  • 10.6k
0 votes
1 answer
69 views

This question is somewhat similar to this one, Slurm: Use cores from multiple nodes for R parallelization, but it is for Python. I have a Python program which can use multiple cores on a PC, it does ...
Quantum Monte Carlo's user avatar
1 vote
0 answers
48 views

I am running an MPI application on 32 processes. The stdout of the rank 0 process gets sent to a separate file for startup error logging (we will call this file STARTUP_ERROR), while the stdout of all ...
Defcon97's user avatar
  • 121
1 vote
0 answers
71 views

I have a particle simulation in C which is split over 4 MPI processes and running fast (compared to serial). However, one region of my implementation is N^2 complexity, where I need to compare each ...
Luna Morrow's user avatar
0 votes
0 answers
172 views

I have a small problem with a program that is run using MPI. I need to run this program on an HPC and I need to run it this way. The program needs to have 1 MPI process per node, use the number of ...
GRaiolo's user avatar
0 votes
1 answer
124 views

I’m new to Dask. I’m currently working in an HPC managed by SLURM with some compute nodes (those that execute the jobs) and the login node (which I access through SSH to send the SLURM jobs). I’m ...
Joseph Pena's user avatar
