Hello,
I am using the bioinformatics pipeline FROGS to process a ~25 GB input file. I tested the workflow with a smaller dataset and it succeeded, but it has failed multiple times on the clustering step with my full 25 GB dataset. I am wondering whether this is due to a size or memory limit enforced in Galaxy. The error messages are not informative: the first time the job failed, I got an error saying “the job was terminated because it ran longer than the maximum allowed job run time” (it ran for a little over a day), and when I attempted to rerun the job a couple of times it still failed, but the only error message was “unable to finish job”.
For lack of any other ideas about why this workflow succeeds with a smaller dataset but fails with a larger one, I am wondering whether there is a processing or size quota that I am hitting. As far as I can see, I have not exceeded my 100 GB quota with this job. I am hoping someone can give me a hint as to whether this error is related to Galaxy or is more likely an issue with my workflow/dataset.
Thank you,
Heather
Hello, can you share your history with me?
Thomas
Hi Thomas,
I deleted my original history while clearing out some space and trying another run. I have been attempting to run it again today with the same parameters, and it is now failing even earlier in the process than before, with similar errors, even though I have not changed the workflow or the inputs. I sent you a direct message with a link to the history; please let me know if you are able to determine what the error could be.
Thank you!
Hello, I am in a similar situation:
I use FROGS for ITS metabarcoding data. I requested a storage increase, which was granted at the beginning of the month because my datasets were too large, so I don’t think this is a storage problem; I have plenty of space now. When I launch the pre-process step after loading the data, the tool reports an error: Unable to finish job → An error occurred while running the tool toolshed.g2.bx.psu.edu/repos/frogs/frogs/FROGS_preprocess/4.1.0+galaxy3.
I tried running my workflow on a very small subset of the dataset, and apparently that works…
Does anyone have an idea for processing the dataset as a whole? I would rather not have to partition it.
Thanks in advance for the help!
For more information, the error report that was sent is the following:
Error Localization
Detailed Job Information
Job environment and execution information is available at the job info page.
Job Execution and Failure Information
Command Line
preprocess.py 'illumina' --output-dereplicated '/shared/ifbstor1/galaxy/datasets2/jobs/007/189/7189278/outputs/dataset_eae0211c-24a3-436a-9fed-f232793137d3.dat' --output-count '/shared/ifbstor1/galaxy/datasets2/jobs/007/189/7189278/outputs/dataset_a2458ea7-f11c-46e8-ac95-ae8ce3886586.dat' --summary '/shared/ifbstor1/galaxy/datasets2/jobs/007/189/7189278/outputs/dataset_ff5bad1d-06c9-4e7c-b7b3-d0a1bf62a59b.dat' --nb-cpus ${GALAXY_SLOTS:-1} --min-amplicon-size 50 --max-amplicon-size 600 --five-prim-primer 'CTTGGTCATTTAGAGGAAGTAA' --three-prim-primer 'GCATCGATGAAGAACGCAGC' --input-archive '/shared/ifbstor1/galaxy/datasets2/d/6/d/dataset_d6d9a225-a958-44ac-a360-78001171e7a8.dat' --R1-size 350 --R2-size 350 --mismatch-rate 0.1 --merge-software vsearch --keep-unmerged
stderr
stdout
Job Information
Unable to finish job
Job Traceback
Traceback (most recent call last):
File "/shared/ifbstor1/galaxy/server/lib/galaxy/jobs/runners/__init__.py", line 680, in _finish_or_resubmit_job
job_wrapper.finish(
File "/shared/ifbstor1/galaxy/server/lib/galaxy/jobs/__init__.py", line 2131, in finish
import_model_store = store.get_import_model_store_for_directory(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/ifbstor1/galaxy/server/lib/galaxy/model/store/__init__.py", line 1483, in get_import_model_store_for_directory
raise Exception(
Exception: Could not find import model store for directory [/shared/ifbstor1/galaxy/datasets2/jobs/007/189/7189278/metadata/outputs_populated] (full path [/shared/ifbstor1/galaxy/datasets2/jobs/007/189/7189278/metadata/outputs_populated])