Hello,
I launched a Dorado basecaller run (Dorado_duplex_sup.sh) that waited two days for resources to become available and then stopped after one minute.
Below is the error message, which I have trouble deciphering (out of memory?).
How should I reserve memory on the GPU, and how much is it reasonable to request, given that the input dataset to basecall is 1.5 TB?
Thank you for your feedback.
Have a nice day,
C
slurm-40390319.out =
Loading dorado/0.7.2
Loading requirement: singularity
[2024-06-28 08:01:41.743] [info] Running: "duplex" "--emit-fastq" "sup" "/shared/ifbstor1/projects/poacembly/Dacryo/Dacryodes_ONT.pod5"
[2024-06-28 08:01:42.502] [info] > No duplex pairs file provided, pairing will be performed automatically
[2024-06-28 08:01:42.502] [info] - Note: FASTQ output is not recommended as not all data can be preserved.
[2024-06-28 08:01:43.918] [info] - downloading dna_r10.4.1_e8.2_400bps_sup@v5.0.0 with httplib
[2024-06-28 08:01:48.228] [info] - downloading dna_r10.4.1_e8.2_5khz_stereo@v1.3 with httplib
[2024-06-28 08:01:53.709] [info] cuda:0 using chunk size 12288, batch size 192
[2024-06-28 08:01:55.709] [info] cuda:0 using chunk size 10000, batch size 1088
[2024-06-28 08:01:56.260] [info] > Starting Stereo Duplex pipeline
[2024-06-28 08:01:56.281] [info] > Reading read channel info
[2024-06-28 08:02:31.007] [info] > Processed read channel info
/shared/ifbstor1/software/singularity/wrappers/dorado/0.7.2/dorado: line 2: 34934 Killed singularity exec --nv /shared/ifbstor1/software/singularity/images/dorado-0.7.2.sif dorado $@
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=40390319.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.
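Note: the `oom-kill` line from `slurmstepd` is reported by the job's cgroup, so it concerns the host RAM requested with `--mem`, not the memory on the GPU itself. A rough sketch of how the log could be checked and the job's memory accounting looked up, assuming Slurm accounting (`sacct`) is enabled on the cluster; the helper name `count_oom_events` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: count cgroup OOM-kill reports in a Slurm output file.
count_oom_events() {
    grep -c 'oom-kill event' "$1"
}

# Inspect the log of the failed job if it is in the current directory.
log=slurm-40390319.out
if [ -f "$log" ]; then
    echo "OOM-kill lines in $log: $(count_oom_events "$log")"
fi

# Compare the requested host memory (ReqMem) with the recorded peak (MaxRSS).
# Assumption: sacct is available and accounting data is kept for this job.
if command -v sacct >/dev/null 2>&1; then
    sacct -j 40390319 --format=JobID,State,ReqMem,MaxRSS,Elapsed
fi
```

`MaxRSS` is sampled periodically, so for a job killed after one minute it may under-report the real peak; it is only a starting point for choosing a larger `--mem`.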
Dorado.sh =
#!/bin/sh
#SBATCH -p gpu
#SBATCH --gres=gpu:3g.20gb:1
#SBATCH --cpus-per-task=1
#SBATCH --mem=16GB
#SBATCH --account=poacembly
## load dorado module
module load dorado
dorado duplex --emit-fastq sup \
    /shared/ifbstor1/projects/poacembly/Dacryo/Dacryodes_ONT.pod5 \
    > /shared/ifbstor1/projects/poacembly/Dacryo/Dacryodes_ONT_dorado_sup.fastq
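For discussion, a possible variant of the submission script that requests more host memory and a few more CPU cores. This is only a sketch: the values `64G` and `8` are assumptions, not measured requirements, and would need tuning against what the cluster's accounting reports and what the `gpu` partition allows. The GPU slice and the Dorado command are unchanged.

```shell
#!/bin/sh
#SBATCH -p gpu
#SBATCH --gres=gpu:3g.20gb:1
## assumption: a few cores for reading the pod5 file and pairing reads
#SBATCH --cpus-per-task=8
## assumption: host RAM (not GPU memory); placeholder value to be tuned
#SBATCH --mem=64G
#SBATCH --account=poacembly

## load dorado module
module load dorado

## dorado writes reads to stdout, so the output file is a shell redirect
dorado duplex --emit-fastq sup \
    /shared/ifbstor1/projects/poacembly/Dacryo/Dacryodes_ONT.pod5 \
    > /shared/ifbstor1/projects/poacembly/Dacryo/Dacryodes_ONT_dorado_sup.fastq
```

The 20 GB in `3g.20gb` is the GPU memory of the requested slice and is fixed by `--gres`; only `--mem` changes the limit enforced by the cgroup out-of-memory handler.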