Hi,
I'm trying to submit a new job, and it fails both when using sbatch
and when running a Jupyter notebook on demand. When I check the queue, the NODELIST(REASON) column shows: (launch failed requeued held)
Best,
Hi,
I didn’t see any of your jobs stuck. Could you try again?
I removed the previous jobs and ran it again, with the same result. I see other users have the same issue.
Indeed, there was an error with a node (cpu-node-17), and launches on this node failed.
It has now been fixed (the node has been drained), and the jobs have been released.
Thank you for reporting this. Let me know if you’re still experiencing any issues.