Hello,
I'm trying to run AlphaFold. At the beginning of the script's run, I get these messages:
I0420 13:57:32.087283 47661089406656 xla_bridge.py:230] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker:
I0420 13:57:33.208311 47661089406656 xla_bridge.py:230] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
The script then aborts with these error messages:
2022-04-20 14:13:44.792693: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2022-04-20 14:13:44.792754: F external/org_tensorflow/tensorflow/compiler/xla/service/gpu/gemm_algorithm_picker.cc:113] Check failed: stream->parent()->GetBlasGemmAlgorithms(&algorithms)
Fatal Python error: Aborted
Current thread 0x00002b58f64dbec0 (most recent call first):
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/interpreters/xla.py", line 474 in backend_compile
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/interpreters/xla.py", line 863 in compile_or_get_cached
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/interpreters/xla.py", line 921 in from_xla_computation
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/interpreters/xla.py", line 892 in compile
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/interpreters/xla.py", line 759 in _xla_callable_uncached
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/linear_util.py", line 263 in memoized_fun
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/interpreters/xla.py", line 687 in _xla_call_impl
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/core.py", line 627 in process_call
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/core.py", line 1635 in process
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/core.py", line 1623 in call_bind
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/core.py", line 1632 in bind
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/_src/api.py", line 416 in cache_miss
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162 in reraise_with_filtered_traceback
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/alphafold/model/model.py", line 167 in predict
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/bin/run_alphafold.py", line 193 in predict_structure
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/bin/run_alphafold.py", line 403 in main
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/absl/app.py", line 258 in _run_main
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/lib/python3.8/site-packages/absl/app.py", line 312 in run
File "/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/bin/run_alphafold.py", line 427 in <module>
/shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/bin/run_alphafold.sh: line 3: 62559 Aborted (core dumped) python /shared/ifbstor1/software/miniconda/envs/alphafold-2.1.1/bin/run_alphafold.py "$@"
srun: error: gpu-node-01: task 0: Exited with exit code 134
The submission script is located at /shared/home/akiselev2/aphanoclust/SSP/AlphaFold/AF_test_042022.sh
The Slurm output is at /shared/home/akiselev2/aphanoclust/SSP/AlphaFold/slurm-22072247.out
Thank you in advance
Andrei