Allocation - AlphaFold 2

gb59 · February 22, 2022, 8:51

Hi,

During a modeling run, I ran out of memory on the GPU:

"W external/org_tensorflow/tensorflow/core/common_runtime/bfc_allocator.cc:457] Allocator (GPU_0_bfc) ran out of memory trying to allocate 24.27GiB (rounded to 26061494272)requested by op"

Is it possible to use a full GPU? And is it possible to access the other GPUs, not only the 3rd one?

Best regards

jhaessig · February 28, 2022, 2:38

Hi,
I repartitioned the GPUs on node 03 to have one full card per partition (40 GB).
It should be possible to use the two GPUs on the machine, but I doubt AlphaFold can spread across several nodes. For the time being, only node 03 is dedicated to AlphaFold.

Have a nice day,
JCH

gb59 · March 1, 2022, 9:16

Hi,

Thanks!

What should I set in the --gres parameter to use one full card? I tried --gres=3g:40gb:1 but got a node configuration error. And what should I set to use two GPUs?

Best regards

dbenaben · March 10, 2022, 3:01

Hello,

The GRES values now available are:

1g.5gb
7g.40gb

So if you want to use one full GPU card:

#SBATCH --gres=gpu:7g.40gb:1

The nodes have 2 A100 GPU cards, so you can try to use 2 cards:

#SBATCH --gres=gpu:7g.40gb:2
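
As a rough sketch, a job script requesting one full card might look like the one below. Only the --gres line comes from the configuration above. The job name and the GPU check are assumptions added for illustration, and the actual AlphaFold command is left out because it depends on the local installation.

```shell
#!/bin/bash
#SBATCH --job-name=af2_full_gpu      # hypothetical job name
#SBATCH --gres=gpu:7g.40gb:1         # one full A100 card (use :2 to try both cards)

# Record which GPU devices were handed to this job, if any.
visible_gpus="${CUDA_VISIBLE_DEVICES:-none}"
echo "GPUs visible to this job: ${visible_gpus}"

# List the GPU devices when nvidia-smi is available on the node.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || true
fi

# Launch AlphaFold here with the site's usual command (not shown in this thread).
```

Checking the device list at the start of the job is a cheap way to confirm that the full card was allocated before a long prediction starts.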

Have a nice day

gb59 · March 23, 2022, 10:25

Hi,

Thanks! Running on a full GPU works well.

My problem now is that some jobs crash because they hit the 24-hour walltime. Is it possible to extend it to 2 or 3 days?
In addition, a new release of AlphaFold (2.2.0) is now available. It provides new model parameters and can use the GPU for the relaxation step (the last step of each prediction), which saves computation time. Would it be possible to install this new version (2.2.0, with the parameters updated as well), available on the AlphaFold GitHub? This is a major update for multimer predictions.

Cheers

dbenaben · March 30, 2022, 8:37

Hello,

The walltime is already set to 3 days (TimeLimit 3-00:00:00) for the GPU.
For CPU jobs, you can use the partition long.
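
A small sketch of how to check the limits and request the full three days, assuming the standard Slurm client tools are available on the login node (the variable name is only for illustration):

```shell
#!/bin/bash
# Walltime to request, in Slurm's days-hours:minutes:seconds format.
walltime="3-00:00:00"

# Show each partition with its maximum time limit (%P = partition, %l = time limit).
if command -v sinfo >/dev/null 2>&1; then
    sinfo --format="%P %l"
else
    echo "sinfo not found: run this on a cluster login node"
fi

# Directive to add to the job script so the job is not stopped earlier.
time_directive="#SBATCH --time=${walltime}"
echo "${time_directive}"
```

If no --time is given, Slurm applies the partition's default time limit, which may be shorter than the maximum, so it is safer to set it explicitly.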

Maybe @team.alphafold can update AlphaFold, but it will probably take some time.

gb59 · March 31, 2022, 10:08

Hi,

OK, thanks. I hadn't seen that it was extended from one day to three. I am going to try it.
Three days should be enough for monomers. For multimers, it may not be enough for big predictions, but it is worth trying with the latest version of AlphaFold, once the AlphaFold team finds time to install it.

Thanks again!