Memory allocation failure with AlphaFold

Hi,

I get the following error while running AlphaFold 2.3.2:

2023-06-17 03:43:50.734422: W external/org_tensorflow/tensorflow/tsl/framework/bfc_allocator.cc:290] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.12GiB with freed_by_count=0.

Please find below my script:

#!/bin/bash

#SBATCH -p gpu
#SBATCH --gres=gpu:1g.5gb:11
#SBATCH --cpus-per-task=45
#SBATCH --mem=450G
#SBATCH -A protein_structure_prediction_abca4
#SBATCH --time 3-00:00:00

module load alphafold/2.3.2

mkdir -p /tmp/apereira_alphafold

srun run_alphafold.sh --fasta_paths=/shared/projects/protein_structure_prediction_abca4/sequences/hABCA4_201_AA_WT.fasta \
  --output_dir=/shared/projects/protein_structure_prediction_abca4/output \
  --model_preset=monomer \
  --db_preset=full_dbs \
  --data_dir=/shared/bank/alphafold2/current/2.3/ \
  --uniref90_database_path=/shared/bank/alphafold2/current/2.3/uniref90/uniref90.fasta \
  --mgnify_database_path=/shared/bank/alphafold2/current/2.3/mgnify/mgy_clusters_2022_05.fa \
  --pdb70_database_path=/shared/bank/alphafold2/current/2.3/pdb70/pdb70 \
  --template_mmcif_dir=/shared/bank/alphafold2/current/2.3/pdb_mmcif/mmcif_files \
  --max_template_date=2020-12-31 \
  --obsolete_pdbs_path=/shared/bank/alphafold2/current/2.3/pdb_mmcif/obsolete.dat \
  --bfd_database_path=/shared/bank/alphafold2/current/2.3/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
  --uniclust30_database_path=/shared/bank/alphafold2/current/2.3/uniref30/UniRef30_2021_03 \
  --use_gpu_relax=false

I cannot work out where the error comes from. Is the memory I requested insufficient, or am I not using the GPU to its full capacity?

Thanks a lot for your help.

Best wishes,

Allwyn

Hello,

Indeed, I think the GPU memory is insufficient for your job (the log says "ran out of memory").

You are using the 1g.5gb profile, which has 5 GB of GPU memory. Try the 3g.20gb or 7g.40gb profile instead.
More info: https://ifb-elixirfr.gitlab.io/cluster/doc/slurm/slurm_at/#gpu-instance-profiles
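
For example, keeping the gpu:profile:count syntax already used in your script, the request could look something like the sketch below. I am assuming here that a single instance is what you want: as far as I know, a process can only use one MIG instance at a time, so requesting 11 small instances does not add their memory together. The nvidia-smi line is only an optional check of which GPU instance the job actually sees.

```bash
#!/bin/bash
#SBATCH -p gpu
# One larger MIG instance (about 20 GB of GPU memory) instead of eleven 1g.5gb instances.
# Assumption: same gres syntax as in the original script; switch to 7g.40gb if 20 GB is still not enough.
#SBATCH --gres=gpu:3g.20gb:1

# Optional check: list the GPU / MIG devices visible to the job.
srun nvidia-smi -L
```

The other #SBATCH lines and the run_alphafold.sh command can stay as they are.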

Have a nice day