Request for information on GPU and CUDA usage

Hi @team.software,

When analyzing Hi-C data with Juicer, I need to use a GPU to run HiCCUPS and ArrowHead.

In the script I'm using, I've set the GPU usage as follows:

	#SBATCH --partition=gpu
	#SBATCH --account=humancells_hmga_ko_dynamics_hi_c
	#SBATCH --gres=gpu:3g.20gb:1

When I run my script and HiCCUPS is called, I get this error message:

load: spack load cuda@8.0.61 arch= && CUDA_VISIBLE_DEVICES=0,1,2,3
Tue 13 May 2025 12:05:15 PM CEST

HiCCUPS:

GPUs are not installed so HiCCUPs cannot be run

(-: Postprocessing successfully completed, maps too sparse to annotate or GPUs unavailable (-:
Tue 13 May 2025 12:05:15 PM CEST

I also see this message in another HiCCUPS log file:

nvcc: command not found

So I have a few questions:

  • Do my GPU usage settings seem correct to you?
  • Could the fact that nvcc is not found explain why the GPUs are reported as not installed? HiCCUPS and ArrowHead do seem to run on the GPU partition, judging by the output of the squeue command:

	JOBID      STATE     NAME                          NODELIST(REASON)
	47431621   RUNNING   a1747130685_arrowhead_wrap    gpu-node-02
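
In case it helps, here is a quick check I could add inside the job script (a sketch on my side; it assumes nvidia-smi is installed on the GPU nodes):

	nvidia-smi                    # should list the allocated 3g.20gb MIG slice
	echo "$CUDA_VISIBLE_DEVICES"  # device indices Slurm exposed to the job
	which nvcc || echo "nvcc not on PATH"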

Thank you in advance.

Hi @ssafistibler,

(We can speak French or English, up to you)

  • Which module or software are you using?
  • CUDA_VISIBLE_DEVICES=0,1,2,3 is strange, since you reserved only one card (gpu:3g.20gb:1)
    • Can you set the device number in your software parameters?
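
For instance, something like this inside the job script (just a sketch, not tested on your setup):

	# With --gres=gpu:3g.20gb:1 the job gets a single device,
	# so restricting CUDA to index 0 should be enough:
	export CUDA_VISIBLE_DEVICES=0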

Hi @gildaslecorguille,

We can continue in English as it might be useful for other users.

I need a GPU to run at least HiCCUPS (HiCCUPS · aidenlab/juicer Wiki · GitHub); its script is here: juicer/SLURM/scripts/juicer_hiccups.sh at main · aidenlab/juicer · GitHub

To run this script from within the juicer script, there is a call to nvcc, the CUDA Compiler Driver (1. Introduction — NVIDIA CUDA Compiler Driver 12.9 documentation).
But it seems this command is not available in cudatoolkit, or maybe it is just named differently?
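
To check that on my side, I could try something like this (a sketch; the cudatoolkit module setting CUDA_HOME is an assumption of mine):

	# nvcc may be shipped with the toolkit but simply not on PATH:
	echo "$CUDA_HOME"
	ls "$CUDA_HOME/bin/nvcc" && export PATH="$CUDA_HOME/bin:$PATH"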

I can give you all the modules I use, but HiCCUPS (and maybe ArrowHead too) is the only one that needs a GPU to work.

It is not possible to set the device number in my software parameters. I modified the juicer.sh script to add the following lines to the SLURM job parameter settings:

#SBATCH --partition=gpu
#SBATCH --account=humancells_hmga_ko_dynamics_hi_c
#SBATCH --gres=gpu:3g.20gb:1

as in the settings for the HiCCUPS/ArrowHead call:

	#!/bin/bash -l
	#SBATCH -p $queue
	#SBATCH --mem-per-cpu=16G
	#SBATCH -o $debugdir/arrowhead_wrap-%j.out
	#SBATCH -e $debugdir/arrowhead_wrap-%j.err
	#SBATCH -t $queue_time
	#SBATCH --ntasks=1
	#SBATCH --partition=gpu
	#SBATCH --account=humancells_hmga_ko_dynamics_hi_c
	#SBATCH --gres=gpu:3g.20gb:1
	#SBATCH -J "${groupname}_arrowhead_wrap"
	${sbatch_wait}

Apparently it is also possible to set the visible devices with this parameter line:

	#SBATCH --export=CUDA_VISIBLE_DEVICES=[number_of_device]
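
If I use it, my understanding of the Slurm documentation (to be confirmed) is that ALL should be kept in the list, otherwise only that variable is propagated to the job:

	#SBATCH --export=ALL,CUDA_VISIBLE_DEVICES=0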

If I am right, the line

load: spack load cuda@8.0.61 arch= && CUDA_VISIBLE_DEVICES=0,1,2,3

is used in a part of the script referred to as the "Aiden Lab specific check" (at line 121 in the juicer.sh script), so I am not sure it needs to be taken into account.

The "Aiden Lab specific check" needs to use spack to be ran but this command/module is not disponible in the Core Cluster environment apparently. Anyway, I do not really need it for the script to work.