Unable to finish job error

Hello,

I am using the bioinformatics pipeline FROGS to process a ~25 GB input file. The workflow succeeds on a smaller test dataset, but it has failed multiple times on the clustering step with my full 25 GB dataset. I am wondering whether this is due to a size or memory limit enforced in Galaxy. The error messages are not informative: the first time the job failed, I got an error that said “the job was terminated because it ran longer than the maximum allowed job run time” (it ran for a little over a day), and when I attempted to rerun the job a couple of times it still failed, but the only error message was “unable to finish job”.

For lack of any other ideas about why this workflow succeeds with a smaller dataset but fails with a larger one, I am wondering whether I am hitting a processing or size quota. As far as I can see, I have not exceeded my 100 GB quota with this job. I am hoping someone can give me a hint as to whether this error is related to Galaxy itself, or is more likely an issue with my workflow or dataset.

Thank you,

Heather
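
One way to check programmatically whether an account is close to its quota is Galaxy's user API, e.g. via BioBlend. A minimal sketch (the server URL is a placeholder and `bioblend` is assumed to be installed; the helper itself is just arithmetic):

```python
import os

def usage_fraction(used_bytes: int, quota_bytes: int) -> float:
    """Return disk usage as a fraction of the quota (may exceed 1.0)."""
    return used_bytes / quota_bytes

# Querying a live Galaxy server (placeholder URL; needs an API key and bioblend).
# Galaxy reports total_disk_usage in bytes on /api/users/current.
if os.environ.get("GALAXY_API_KEY"):
    from bioblend.galaxy import GalaxyInstance
    gi = GalaxyInstance(url="https://usegalaxy.example.org",
                        key=os.environ["GALAXY_API_KEY"])
    user = gi.users.get_current_user()
    frac = usage_fraction(int(user["total_disk_usage"]), 100 * 1024**3)
    print(f"Using {frac:.1%} of a 100 GB quota")
```

Note that a runtime limit (the "maximum allowed job run time" message above) is enforced per job by the cluster scheduler and is independent of the storage quota, so staying under 100 GB does not rule it out.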

Hello, can you share your history with me?

Thomas

Hi Thomas,

I deleted my original history while I was clearing out some space and trying another run. I have been attempting to run it again today with the same parameters, and it is now failing even earlier in the process than before, with similar errors, even though I have not changed the workflow or the inputs. I sent you a direct message with a link to the history; please let me know if you are able to determine what the error could be.

Thank you!

Hello, I am in a similar situation:

I am using FROGS for ITS metabarcoding data. Because my datasets were too large, I requested a storage increase, which was granted at the beginning of the month, so I don't think this is a storage problem; I now have plenty of space. When I launch the pre-process step after loading the data, the tool reports an error: Unable to finish job → An error occurred while running the tool toolshed.g2.bx.psu.edu/repos/frogs/frogs/FROGS_preprocess/4.1.0+galaxy3.

I tried running my workflow on a very small part of the dataset and it apparently works.
Does anyone have an idea how to process the dataset as a whole? I would like to avoid having to partition it.

Thanks in advance for your help!

For more information, here is the error report that was sent:

Error Localization

Dataset 13300888 (4ab2eadf5e74d5446653b4587a417400)
History 1245998 (95c050ccac3393ac)
Failed Job 44: FROGS_1 Pre-process: dereplicated.fasta (4ab2eadf5e74d544ac929c310c29f171)

Detailed Job Information

Job environment and execution information is available at the job info page.

Job ID 7189278 (ed87f7a2291d75a9)
Tool ID toolshed.g2.bx.psu.edu/repos/frogs/frogs/FROGS_preprocess/4.1.0+galaxy3
Tool Version 4.1.0+galaxy3
Job PID or DRM id 63768054
Job Tool Version None

Job Execution and Failure Information

Command Line

preprocess.py 'illumina' --output-dereplicated '/shared/ifbstor1/galaxy/datasets2/jobs/007/189/7189278/outputs/dataset_eae0211c-24a3-436a-9fed-f232793137d3.dat' --output-count '/shared/ifbstor1/galaxy/datasets2/jobs/007/189/7189278/outputs/dataset_a2458ea7-f11c-46e8-ac95-ae8ce3886586.dat' --summary '/shared/ifbstor1/galaxy/datasets2/jobs/007/189/7189278/outputs/dataset_ff5bad1d-06c9-4e7c-b7b3-d0a1bf62a59b.dat' --nb-cpus ${GALAXY_SLOTS:-1} --min-amplicon-size 50 --max-amplicon-size 600 --five-prim-primer 'CTTGGTCATTTAGAGGAAGTAA' --three-prim-primer 'GCATCGATGAAGAACGCAGC'  --input-archive '/shared/ifbstor1/galaxy/datasets2/d/6/d/dataset_d6d9a225-a958-44ac-a360-78001171e7a8.dat' --R1-size 350 --R2-size 350 --mismatch-rate 0.1 --merge-software vsearch --keep-unmerged

stderr


stdout


Job Information

Unable to finish job

Job Traceback


Traceback (most recent call last):
  File "/shared/ifbstor1/galaxy/server/lib/galaxy/jobs/runners/__init__.py", line 680, in _finish_or_resubmit_job
    job_wrapper.finish(
  File "/shared/ifbstor1/galaxy/server/lib/galaxy/jobs/__init__.py", line 2131, in finish
    import_model_store = store.get_import_model_store_for_directory(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/shared/ifbstor1/galaxy/server/lib/galaxy/model/store/__init__.py", line 1483, in get_import_model_store_for_directory
    raise Exception(
Exception: Could not find import model store for directory [/shared/ifbstor1/galaxy/datasets2/jobs/007/189/7189278/metadata/outputs_populated] (full path [/shared/ifbstor1/galaxy/datasets2/jobs/007/189/7189278/metadata/outputs_populated])
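
For reference, the command line in the report passes `--min-amplicon-size 50`, `--max-amplicon-size 600`, and the two ITS primer sequences. A rough sketch of what such a length filter does after primer trimming (a hypothetical illustration of the parameters, not the actual FROGS `preprocess.py` code, which among other things tolerates primer mismatches):

```python
def passes_length_filter(merged_read: str,
                         five_prim_primer: str,
                         three_prim_primer: str,
                         min_size: int = 50,
                         max_size: int = 600) -> bool:
    """Trim both primers off a merged read, then keep the read only if
    the remaining amplicon length falls within [min_size, max_size].
    Simplified: exact primer matches only, no mismatch tolerance."""
    if not (merged_read.startswith(five_prim_primer)
            and merged_read.endswith(three_prim_primer)):
        return False  # primers not found at the expected ends
    trimmed = merged_read[len(five_prim_primer):len(merged_read) - len(three_prim_primer)]
    return min_size <= len(trimmed) <= max_size

# Example with the primers from the report: a 300 nt amplicon is kept.
FWD = "CTTGGTCATTTAGAGGAAGTAA"
REV = "GCATCGATGAAGAACGCAGC"
print(passes_length_filter(FWD + "A" * 300 + REV, FWD, REV))  # → True
```

Since stderr and stdout above are both empty and the traceback comes from Galaxy's own job-finishing code (the missing `metadata/outputs_populated` import store), the failure looks like it occurred on the Galaxy/cluster side rather than inside the tool's filtering logic itself.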