PBS Pro seems to treat a job array as a single job when the max_queued and queued_jobs_threshold attributes are applied. Is there any way to treat each subjob of a job array as an individual job, so that max_queued or queued_jobs_threshold can limit the number of subjobs queued? The open-source scheduler Maui has this wonderful function.
The newer versions of PBS do treat subjobs as normal jobs for limits:
Qmgr: set server max_queued="[o:PBS_ALL=5]"
[ravi@pbspro ~]$ qsub -J 1-50 -- /bin/sleep 100
qsub: Maximum number of jobs already in complex
[ravi@pbspro ~]$ qsub -J 1-4 -- /bin/sleep 100
53[].pbspro
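For example, the same limit can be applied per user rather than complex-wide via the generic user form (a sketch; the value 100 is arbitrary):

Qmgr: set server max_queued="[u:PBS_GENERIC=100]"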
For job 365[], 72 subjobs are running and the others are queued in the physics queue.
I would expect 10 subjobs to be queued in the physics queue while the others stay queued in defaultQ.
Also, job 366[] waits in defaultQ until all subjobs of 365[] complete, even though there are free cores; then all of its subjobs enter the physics queue at once.
Could you test this scenario with the version you run, please?
Nowadays, running job arrays is very popular on HPC systems. If one user submits a job array with a large number of subjobs, perhaps thousands, it would cause "queue stuffing". We could use fairshare to limit the jobs running for each user, but that would add overhead.
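for example, what I would like is a per-user cap on queued jobs that counts each subjob individually, something like (a sketch; the value 10 is arbitrary):

Qmgr: set server queued_jobs_threshold="[u:PBS_GENERIC=10]"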
They were both queued in defaultQ:
[ravi@pbspro ~]$ qstat -1n
pbspro:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
60[].pbspro ravi defaultQ STDIN -- 1 1 -- -- Q -- --
61[].pbspro ravi defaultQ STDIN -- 1 1 -- -- Q -- --
So, I guess PBS doesn't allow an array job to be queued into the execution queue unless the whole array fits, so they both stay in the routing queue. If I submit a smaller array from the same user, it does get routed to physics and starts running:
[ravi@pbspro ~]$ qsub -J 1-5 -q defaultQ -- /bin/sleep 1000
62[].pbspro
[ravi@pbspro ~]$ qstat -1n
pbspro:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
60[].pbspro ravi defaultQ STDIN -- 1 1 -- -- Q -- --
61[].pbspro ravi defaultQ STDIN -- 1 1 -- -- Q -- --
62[].pbspro ravi physics STDIN -- 1 1 -- -- B -- --
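For reference, a setup like the one discussed can be reproduced with something like the following (a sketch reconstructed from the thread; the queue names come from the posts above, and the limit value is illustrative):

Qmgr: create queue defaultQ queue_type=route
Qmgr: set queue defaultQ route_destinations=physics
Qmgr: set queue defaultQ enabled=true
Qmgr: set queue defaultQ started=true
Qmgr: create queue physics queue_type=execution
Qmgr: set queue physics max_queued="[o:PBS_ALL=10]"
Qmgr: set queue physics enabled=true
Qmgr: set queue physics started=true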
Will jobs 60[] and 61[] above never run? I would expect the 5 subjobs routed to the physics queue to run. We strongly suggest that a future PBS Pro release include such a function for subjobs of a job array, as more and more users run job arrays on HPC systems.
Apparently this function has many advantages, such as speeding up the scheduling cycle in execution queues and preventing a single user from stuffing execution queues.
The open-source scheduler Maui has had this function for many years, and I am surprised that PBS Pro has not adopted it yet.
Well, they cross the max_queued limit, so yes, they won't ever get queued to run. Would it be unreasonable to ask your users to submit smaller job arrays? As long as users submit job arrays within the limits, your use case will be achieved.
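For example, a submission wrapper could split a big range into arrays that fit under the limit (a sketch; the script path and chunk size are hypothetical):

#!/bin/sh
# submit subjobs 1-1000 as ten arrays of 100 subjobs each
for start in $(seq 1 100 901); do
    qsub -J "${start}-$((start + 99))" -q defaultQ -- /path/to/task.sh
done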
But thanks for mentioning this; we can analyze further whether it makes sense to enhance job arrays in PBS so that part of an array can be queued into the execution queue when max_queued is set.
The fundamental idea is that a subjob of a job array should be treated exactly like a normal job, so that any PBS Pro function that applies to normal jobs also applies to subjobs. Could you please pass my comments under this topic to your manager?
I think the new version of PBS does treat subjobs pretty much the same way as normal jobs. Can you please be more specific about what differences you see between a subjob and a normal job in PBS? The limits thing, as I explained earlier, has been fixed in the latest versions of PBS. Is there any other difference that you are concerned about?
If you are using 14.x, you might face this issue and it might not work the way you would like.
If you download 19.x or the latest from master, as @agrawalravi90 suggested, it will work the way you want.
The behaviour you have seen in 14.x is a bug, and it is fixed in version 19.x and on the master branch.
Hi,
Please check the admin guide (https://www.altair.com/pdfs/pbsworks/PBS19.2.3_BigBook.pdf), Table 4-1: Server Attributes Involved in Scheduling:
max_queued: The maximum number of jobs allowed to be queued or running in the partition(s) managed by a scheduler. Can be specified for users, groups, or all.
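For reference, the per-user, per-group, and overall forms share one format and can be combined in a single setting (a sketch; names and values are illustrative):

Qmgr: set server max_queued="[u:PBS_GENERIC=50], [g:hpcusers=200], [o:PBS_ALL=1000]"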
A job array isn't a collection of jobs; it's one "job" that can spawn many jobs from it. This means a subjob doesn't exist until it starts running. We can't detach N subjobs from the parent to move them into the exec queue: they don't exist yet. The whole job array has to move.
With this limitation, fixing it as you suggest is quite difficult.
I am not fully up to date on job arrays though. This might be easier than I think. Hopefully our job array expert (@Shrini-h) can weigh in.
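For illustration, the single-entity model is visible from qstat (commands only, output omitted; using job 53[] from earlier in the thread):

qstat "53[].pbspro"       # shows only the array parent
qstat -t "53[].pbspro"    # also lists the individual subjobs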
I think @sxy has a valid expectation; unfortunately, the current architecture doesn't allow it.
I also agree with @Bhroam that it's not an easy enhancement, but it will be an interesting one to solve. There could be many ways to architect it: 1. decouple the parent job from the queue (make it sort of global) and let subjobs move freely between queues; 2. let a job array in the routing queue spawn smaller dependent job arrays in the execution queue; etc.
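For what it's worth, something like option 2 can already be approximated from user space with chunked, dependent job arrays (a sketch; it assumes the server accepts an array parent as a dependency target, and task.sh is a placeholder):

first=$(qsub -J 1-100 task.sh)
qsub -J 101-200 -W "depend=afterok:${first}" task.sh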