A job resubmitted again and again...?


I submitted a job yesterday in the afternoon. It was supposed to be fast (sent as a dataset collection of two datasets, with an expected execution time of five minutes maximum more or less).
Seeing that it was still running after a while, I was a bit surprised, but since I had noticed unusually long running times on some of my jobs last week, seemingly at random, I just left it for a while.
But this morning I saw that it was still running... I took a closer look at it, and it seems the job is resubmitted again and again and again, which explains why it is still running...
Any idea what might happen here?

Complementary info:

  • I used the same module on another dataset collection in another history, and it ran smoothly
  • It sometimes happens that some of my jobs are killed "by an administrator". When that happens, I rerun them and get the expected green result. I am not sure it has anything to do with this issue, but I mention it in case there is a link one way or another.

"Still running" job metadata:

Job Information

Galaxy Tool ID: toolshed.g2.bx.psu.edu/repos/lecorguille/xcms_xcmsset/abims_xcms_xcmsSet/3.6.1+galaxy0
Galaxy Tool Version: 3.6.1+galaxy0
Tool Version: None
Tool Standard Output: stdout
Tool Standard Error: stderr
Tool Exit Code: None
History Content API ID: a4a98aad5abb77d6
Job API ID: 409cc46b87aae3f5

My bad!
It should be fixed soon with https://gitlab.com/ifb-elixirfr/usegalaxy-fr/infrastructure/-/merge_requests/150

Thanks @melpetera for the report

Done :tada:
Now the resubmit should really resubmit on another destination with more memory.

I also set the memory available per xcmsset job to 8GB.

Note that it may have failed because, when you submitted it, xcmsset was only allowed to use 2MB. It was then resubmitted with 3GB.
But if you relaunch your job, it will be able to use 8GB and then if needed 12GB.
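
For anyone curious about what "resubmit on another destination with more memory" amounts to, here is a rough sketch in Python. It is only an illustration, not the actual Galaxy or usegalaxy-fr configuration: the function name `run_with_resubmit` and the `required_gb` parameter are hypothetical, and the only values taken from this thread are the 8GB and 12GB tiers.

```python
from typing import List, Sequence, Tuple


def run_with_resubmit(
    required_gb: float, tiers_gb: Sequence[int] = (8, 12)
) -> Tuple[str, List[int]]:
    """Simulate a job that is resubmitted on ever larger memory tiers.

    required_gb is the (hypothetical) memory the job really needs.
    tiers_gb lists the memory limit of each destination, in the order tried.
    Returns the final state and the list of limits that were attempted.
    """
    attempts: List[int] = []
    for limit_gb in tiers_gb:
        attempts.append(limit_gb)
        if required_gb <= limit_gb:
            # The job fits in this destination, so it finishes here.
            return "ok", attempts
        # Otherwise the job runs out of memory and moves to the next tier.
    # No tier left: the job fails for good instead of looping forever.
    return "error", attempts


if __name__ == "__main__":
    print(run_with_resubmit(5))   # fits in the first tier
    print(run_with_resubmit(10))  # needs the second tier
    print(run_with_resubmit(20))  # exceeds every tier
```

The point of the sketch is that each resubmission goes to a strictly larger tier and the list of tiers is finite, so a job either succeeds or ends in an error rather than being resubmitted again and again.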

It works!! :rainbow:
Thank you @gildaslecorguille :blush: