Scheduler and jobs that Can Never Run

Dear Wizards,

In certain cases the scheduler may determine that a job can never run: if too many resources are requested, e.g. more ncpus per node than are available on any node in the system, or more nodes than exist in total, etc. The number of nodes in a placement set (pcat, aka node group) may also be the cause of this.

In these cases, the scheduler sets the job comment to include something along the lines of:

comment = Can Never Run: can't fit in the largest placement set, and can't span psets

or

comment = Not Running: Insufficient amount of resource:

but the job itself just hangs around in the queue indefinitely.
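To make this concrete, a hypothetical example (the core count is made up; assume the largest node in the complex has 32 cores):

    qsub -l select=2:ncpus=64 -- /bin/sleep 600
    qstat -f <jobid> | grep comment

qsub accepts the job, and a scheduling cycle later the comment shows one of the messages above, while the job stays queued.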

On the other hand, if I impose a hard max on, say, the number of nodes (server or queue side) with e.g.

qmgr -c "set server resources_max.nodect = 3"

then qsub will fail (upon job submission) and the job will never enter the queue.
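(For a specific queue, say workq, the equivalent would be qmgr -c "set queue workq resources_max.nodect = 3".)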

I assume that this is by design. Obviously, a very optimistic user could hope that somebody will install more nodes, or more cores per node, etc. But in our case I might as well have the job deleted automagically (with some comment and/or log message).

I have not found a setting to do this, i.e. to have the scheduler deem that this job will never run unless more resources are defined. The Admin Guide only states that "the job stays queued" (AG-124 §4.6.2), but does not say whether anything can be done about it.
Does anybody know if that is possible (or whether it for sure is not)?

If it is not possible, then I'll find a way around it: look at the comments of queued jobs and explicitly fail jobs which have such comments. But it might be better if a server/scheduler setting could fix it.

Hi Bjarne,
AFAIK no such feature exists where the server will clean up the queues based on some criteria. But you can always write a hook for the queuejob event and reject, at submission time, any job that requests more resources than are available in the complex.

But writing such a hook could be a very tedious task, since there are many resources that need to be accounted for.
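As a rough illustration only (not a complete solution), a queuejob hook that checks a single resource, ncpus per chunk, against a hard-coded largest node size might look roughly like this; the limit of 32 cores and the decision to ignore all other resources are assumptions:

    import pbs

    MAX_NCPUS_PER_NODE = 32  # assumed size of the largest node in the complex

    e = pbs.event()
    job = e.job

    sel = job.Resource_List["select"]
    if sel is not None:
        # A select spec looks like "2:ncpus=8:mem=4gb+1:ncpus=16"
        for chunk in str(sel).split("+"):
            for part in chunk.split(":"):
                name, sep, val = part.partition("=")
                if sep and name == "ncpus" and int(val) > MAX_NCPUS_PER_NODE:
                    e.reject("Requested ncpus=%s per chunk exceeds the largest "
                             "node (%d cores)" % (val, MAX_NCPUS_PER_NODE))

A real hook would have to repeat this kind of check for nodect, mem, and whatever else users can over-request, which is where the tedium comes in.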

I think your idea of writing a script and periodically removing jobs based on the set comment is best.


Your assumption is spot on in this case. PBS Pro will not delete the job if there is the slightest possibility it could run at some point in the future. As @dilip-krishnan indicated, you could write a hook to address this. The goal should be to reject jobs at submission time if they don’t request a sane amount of resources. That way the user receives immediate feedback as opposed to having their job accepted, only to be deleted minutes later.
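For completeness, installing such a hook would go roughly like this (the hook name and file name are placeholders):

    qmgr -c "create hook reject_oversized"
    qmgr -c "set hook reject_oversized event = queuejob"
    qmgr -c "import hook reject_oversized application/x-python default reject_oversized.py"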


Thank you for the input.

That sounds like a lot of fun. Hopefully, I can write a hook which does not have to take everything into account at once. I'll have to read up on hooks before going down that path.

Could be. However, that would mean implementing a system-type daemon (or at the very least a cron-based script) to do it. This is also not trivial, especially for other/future admins to figure out why the heck jobs are being removed... I'll give it careful thought.

On hooks:

That would be a definite advantage, except that we do not have "traditional lusers" on our cluster. Rather, we run "operational" jobs, and we would much rather have a job fail (to try again later) than be stuck in the queue. Whether the job fails on submission or a few minutes later is not really a big thing, as long as it fails.

Once again, your input is much appreciated. Thanks,

/Bjarne

Update,

After some testing, it appears that the comment may not be very well suited for this. Using the comment, I cannot differentiate between jobs which will run later (when resources are freed by presently running jobs) and jobs which simply ask for more resources than are available on any present node.
In both cases, the comment will look like:

Not Running: Insufficient amount of resource:

If I clearly request too many nodes (more than are available in total), the comment becomes:

Can Never Run: Not enough total nodes available

where it is clearly feasible to delete the job.
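For that clear-cut case, the cleanup could be a small cron script; a sketch (it assumes the "Can Never Run" prefix is stable on our PBS version, which I would still have to verify):

    #!/usr/bin/env python
    # Delete queued jobs whose scheduler comment says they can never run.
    import subprocess

    CAN_NEVER_RUN = "Can Never Run"

    # All currently queued jobs (state Q).
    queued = subprocess.check_output(["qselect", "-s", "Q"]).decode().split()

    for jobid in queued:
        info = subprocess.check_output(["qstat", "-f", jobid]).decode()
        # qstat -f wraps long values onto tab-indented continuation lines.
        flat = info.replace("\n\t", "")
        for line in flat.splitlines():
            line = line.strip()
            if line.startswith("comment =") and CAN_NEVER_RUN in line:
                subprocess.call(["qdel", jobid])
                break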

Thus, I wonder about alternative ways to do this (still without actually writing a hook to run at submission):
As we run a small cluster with a very limited number of jobs in the queue at any time, I could have the scheduler estimate job start times frequently (set server est_start_time_freq, e.g. every scheduling cycle), and then purge jobs which do not have an estimated start time set?
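If that works, the check itself should be simple; a sketch along the same lines as above (the estimated.start_time attribute name and the need for a grace period for freshly submitted jobs are assumptions to verify):

    import subprocess

    # Queued jobs for which the scheduler has published no estimated start
    # time are candidates for deletion. In practice a grace period would be
    # needed, since a just-submitted job has not been through a cycle yet.
    queued = subprocess.check_output(["qselect", "-s", "Q"]).decode().split()
    for jobid in queued:
        info = subprocess.check_output(["qstat", "-f", jobid]).decode()
        flat = info.replace("\n\t", "")
        if "estimated.start_time" not in flat:
            subprocess.call(["qdel", jobid])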

Any ideas?

Thanks,

/Bjarne

PS: Presently, there seems to be an issue with estimating the start times, see the topic "Qmgr -c 'set server est_start_time_freq ...' fails", but hopefully that can be resolved.