Still trying to get this to work. I looked at "node_sort_key", but I think that is more along the lines of what you would use to sort nodes if you had a bunch of nodes with varying configurations (ncpus, mem, etc.).
I looked in the PBS scheduler config and saw "smp_cluster_dist", which seems like exactly what I am trying to do, but it didn't change the outcome.
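For reference, as far as I can tell from the docs, "smp_cluster_dist" takes one of three values (pack, round_robin, or lowest_load) and goes in sched_config like any other option. This is just my reading of the format, e.g.:

```
smp_cluster_dist: round_robin all
```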
Is there a required time-frame between job submissions? I ask because the script I am testing with actually runs through 20 or so jobs: it creates the job scripts, submits them, and then loops back for the next job until all are done. So a single run submits 20 jobs within a second or so.
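To make the timing concrete, here is a stripped-down sketch of the kind of loop I mean (the job script contents and names are placeholders, and qsub is replaced with "echo qsub" so the sketch runs anywhere):

```shell
#!/bin/sh
# Generate N small job scripts and submit them back-to-back,
# with no delay between submissions.
N=20
mkdir -p jobscripts
i=1
while [ "$i" -le "$N" ]; do
    script="jobscripts/job_$i.sh"
    # Write a trivial single-CPU job script.
    cat > "$script" <<'EOF'
#!/bin/sh
#PBS -l select=1:ncpus=1
sleep 60
EOF
    # Placeholder for the real submission: qsub "$script"
    echo qsub "$script"
    i=$((i + 1))
done
```

All 20 qsub calls fire within a second or so, which is the behavior I am asking about.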
I have made the changes, but when I submit jobs they are still being scheduled more like the "pack" method, where one node gets filled to capacity before jobs spill over to the next node. I changed "smp_cluster_dist" to "lowest_load", thinking that would help, but still no good.
cat /opt/pbs/etc/pbs_sched_config | grep -v '#' | grep -v -e '^$'
round_robin: False all
by_queue: True prime
by_queue: True non_prime
strict_ordering: false ALL
help_starving_jobs: true ALL
backfill_prime: false ALL
node_sort_key: "ncpus LOW" ALL
sort_queues: true ALL
resources: "ncpus, mem, arch, host, vnode, aoe, eoe"
load_balancing: true ALL
fair_share: true ALL
preemptive_sched: true ALL
preempt_prio: "express_queue, normal_jobs"