How can I scatter (i.e. round-robin) multiple jobs over the vnodes?
With our current configuration, when I submit five jobs that each use only a small amount of a shared resource, they are all assigned to a single vnode:
vnode1: job1, job2, job3, job4, job5
I want them to be assigned to different vnodes so as to avoid the slowdown caused by resource contention:
vnode1: job1, job4
vnode2: job2, job5
I know I can scatter multiple chunks within a single job, but I could not find a way to scatter multiple independent jobs.
Any comments or suggestions would greatly be appreciated.
I think node_sort_key can help you. This option can be set in $PBS_HOME/sched_priv/sched_config:
node_sort_key: "<resource> LOW assigned" ALL
Do not forget to kill -HUP the scheduler after saving the file. More info on node_sort_key can be found in the "node_sort_key Syntax" section of the admin guide.
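For example, assuming your jobs mainly contend for CPUs (substitute whatever resource they actually contend for), the sched_config entry could look like this sketch:

```
# $PBS_HOME/sched_priv/sched_config
# Sort vnodes so the one with the fewest *assigned* ncpus comes first,
# which spreads small jobs across vnodes instead of packing them.
node_sort_key: "ncpus LOW assigned" ALL
```

The "assigned" keyword is what makes the sort dynamic: the scheduler re-evaluates it as jobs are placed, so each new job prefers the least-loaded vnode.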
That’s exactly what I wanted!
Thank you for your kind support
I think adding a place line after your select line in the job script:
#PBS -l place=scatter
would also do this, if you want to control specific jobs rather than the global configuration.
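For reference, a minimal job script using this directive (a sketch; the select line and the program name are placeholders):

```
#!/bin/bash
#PBS -l select=2:ncpus=4
#PBS -l place=scatter
# The two chunks above will be placed on different vnodes; note that
# place=scatter controls chunks *within* this one job, not placement
# across independent jobs.
cd "$PBS_O_WORKDIR"
./my_program   # placeholder
```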
Thank you for your reply.
When I tried, e.g., running
$ qsub -lselect=ncpus=1 -lplace=scatter test.sh
three times, those jobs were still placed on a single machine, which was not what I wanted.
Yes, you are right, I made a mistake. place=scatter only affects chunks within a single job.
Thank you for your comment.
":excl" prevents multiple jobs from being assigned to a single node (even if the node has enough resources) when I submit more jobs than there are nodes.
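For example (a sketch; test.sh is a placeholder), exclusive placement can be requested per job at submission time:

```
# Each job gets its node exclusively: no other job shares the node,
# even if resources remain free on it.
qsub -l select=1:ncpus=1 -l place=excl test.sh
```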
Did you get this working? I am moving to OpenHPC with PBS Pro from Rocks Cluster, was working on queues, and have the same question. Currently everything gets piled up on C1 before moving to C2, 3, 4… I would also like to have jobs distributed across nodes evenly.
Please use the below PBS directive for multi-node jobs:
#PBS -l place=scatter
For example, on the command line:
qsub -l select=4:ncpus=4 -l place=scatter -- /bin/application arguments
or in the job script:
#PBS -l select=4:ncpus=4
#PBS -l place=scatter
"qsub: Cannot be used with select or place: nodes"
I think that is a different setting from what I am looking for. I don't want to set it within the script anyway; it should be set globally, controlling how jobs are scheduled across the nodes.
In Maui.cfg, it was set here…
This setting made Maui assign jobs to the nodes with the lowest load and the fewest jobs.
Could you please share the PBS directives used in your script or qsub submission? Also, which version of PBS Pro are you using?
Please check the scheduler configuration file, $PBS_HOME/sched_priv/sched_config (run source /etc/pbs.conf to get $PBS_HOME), and look for node_sort_key.
Please check this document:
https://www.altair.com/pdfs/pbsworks/PBS19.2.3_BigBook.pdf and the below sections of it:
"node_sort_key Syntax"
"Examples of Sorting Vnodes"
After making any updates to the sched_config file, make sure you kill -HUP the scheduler.
Hope this helps.
Does "kill -HUP" do the same thing as "systemctl restart pbs"? Will this kill jobs that are currently running?
Not the same; some configuration changes need systemctl restart pbs.
Please restart when no jobs are running on the system; otherwise, jobs may be killed or requeued.
Please check these sections of this guide: https://www.altair.com/pdfs/pbsworks/PBS19.2.3_BigBook.pdf
Chapter 7 Starting & Stopping PBS
Table 7-2: Commands to Start, Stop, Restart, Status PBS
Table 7-3: MoM Restart Options
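As a cautious sketch (assuming a systemd-managed install, as "systemctl restart pbs" implies), one way to minimize disruption is to quiesce scheduling before a full restart:

```
# Stop the scheduler from starting new jobs; running jobs are unaffected
qmgr -c "set server scheduling = false"

# ...wait for running jobs to drain, or pick a maintenance window...

systemctl restart pbs        # full restart of the PBS daemons

qmgr -c "set server scheduling = true"   # resume scheduling
```

A plain kill -HUP of pbs_sched, by contrast, only makes the scheduler re-read its configuration and does not touch running jobs.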
Thanks for your help, one last question. Can you restart the scheduler without impacting running jobs?
See "Reinitializing the Scheduler on Linux" in the same guide:
ps -ef | grep pbs_sched
kill -HUP <scheduler PID>
PBS Scheduler is stateless. You can kill the scheduler and start it at any point.
You can do this:
qmgr -c "set server scheduling = true" # to start a new scheduling cycle
Still trying to get this to work. I looked at "node_sort_key", but I think that is more along the lines of what you would use to sort nodes if you had a bunch of nodes with varying configurations (ncpus, mem, etc.).
I looked in the PBS scheduler config and saw "smp_cluster_dist", which seems to be exactly what I am trying to do, but it didn't change the outcome.
Is there a required time frame between job submissions? The reason I ask is that the script I am testing with actually runs through 20 or so jobs: it creates the scripts, submits them, and then loops back to run the next job until all are done. So when I submit, it creates 20 jobs in a second or so.
I have made the changes, and still, when I submit jobs, they are being scheduled more like the "pack" method, where one node gets filled up to capacity before jobs spill over to the next node. I changed smp_cluster_dist to "lowest_load", thinking that would help, but still no good.
cat /opt/pbs/etc/pbs_sched_config | grep -v '#' | grep -v -e '^$'
round_robin: False all
by_queue: True prime
by_queue: True non_prime
strict_ordering: false ALL
help_starving_jobs: true ALL
backfill_prime: false ALL
node_sort_key: "ncpus LOW" ALL
sort_queues: true ALL
resources: "ncpus, mem, arch, host, vnode, aoe, eoe"
load_balancing: true ALL
fair_share: true ALL
preemptive_sched: true ALL
preempt_prio: "express_queue, normal_jobs"
Please share your pbsnodes -aSj output and your chunk submission line. What job placement string do you use?
No, there is no required time frame; thousands of jobs are submitted within a minute in some use cases.
smp_cluster_dist is deprecated; please check the PBS Pro Administrator's Guide.
Regarding "one node gets filled up to capacity before jobs spill over to the next node": see section 4.7, "Specifying Job Placement", in the PBS Professional User Guide (https://www.altair.com/pdfs/pbsworks/PBSUserGuide19.2.3.pdf), and use the below node_sort_key:
node_sort_key: "ncpus LOW assigned" ALL
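To see why the "assigned" keyword matters, here is a toy Python simulation (not PBS code; node and job names are made up). A static sort key never changes the node order on identical nodes, so every job lands on the first node; a least-assigned-first key re-sorts after each placement and scatters the jobs:

```python
def place_jobs(num_jobs, nodes, sort_key):
    """Assign each 1-cpu job to the first node after sorting by sort_key."""
    assigned = {n: 0 for n in nodes}  # cpus assigned so far per node
    placement = {}
    for job in range(1, num_jobs + 1):
        # Pick the node that sorts first under the given key (stable sort,
        # so ties keep the original node order, like identical vnodes).
        node = sorted(nodes, key=lambda n: sort_key(n, assigned))[0]
        assigned[node] += 1
        placement[f"job{job}"] = node
    return placement

nodes = ["vnode1", "vnode2", "vnode3"]

# Static key (like sorting identical nodes by total ncpus): packs.
packed = place_jobs(5, nodes, lambda n, a: 0)

# Dynamic key (like "ncpus LOW assigned"): least-loaded node first, scatters.
scattered = place_jobs(5, nodes, lambda n, a: a[n])

print(packed)     # every job on vnode1
print(scattered)  # jobs rotate across vnode1, vnode2, vnode3
```

This is only an illustration of the sorting idea; the real scheduler also weighs resource requests, priorities, and placement directives.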
Still "packs" all jobs onto a node, then moves to the next.
Maybe I have something wrong with my queue settings? I haven't done much on that part yet.
queue_type = Execution
Priority = 50
total_jobs = 15
state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:15 Exiting:0 Begun:0
resources_max.mem = 24000mb
resources_max.ncpus = 16
resources_max.nodes = 4
resources_max.walltime = 96:00:00
resources_default.ncpus = 1
resources_default.walltime = 24:00:00
resources_assigned.mem = 360000mb
resources_assigned.mpiprocs = 120
resources_assigned.ncpus = 120
resources_assigned.nodect = 15
hasnodes = True
enabled = True
started = True
[~]# pbsnodes -a
Mom = compute-00.local.cluster
Port = 15002
pbs_version = 19.1.1
ntype = PBS
state = free
pcpus = 72
jobs = 22669.hmrihpcp02/0, 22669.hmrihpcp02/1, 22669.hmrihpcp02/2, 22669.hmrihpcp02/3
resources_available.arch = linux
resources_available.host = compute-00
resources_available.mem = 394618508kb
resources_available.ncpus = 72
resources_available.vnode = compute-00
resources_assigned.accelerator_memory = 0kb
resources_assigned.hbmem = 0kb
resources_assigned.mem = 0kb
resources_assigned.naccelerators = 0
resources_assigned.ncpus = 56
resources_assigned.vmem = 0kb
queue = default
resv_enable = True
sharing = default_shared
last_state_change_time = Mon May 18 10:29:21 2020
last_used_time = Mon May 18 10:29:21 2020
Thank you for sharing the details.
Your queue settings are correct.
If you have only one compute node in the PBS cluster, then all the jobs have to be packed onto that node.
If you have more than one compute node, then with node_sort_key: "ncpus LOW assigned" ALL set (and after a kill -HUP <pid of pbs_sched>), you should see jobs being scheduled onto the other nodes based on the ncpus resource allocation.
Please share the output of pbsnodes -aSj and qmgr -c 'p s'