Reverse node sort key in single queue

Hello All,

I currently have a small cluster that has multiple generations of nodes available. Crossing generational boundaries is not an issue for us in many cases, so I have a node sort key set to prefer the newest nodes first in queues that span generations. I am trying to figure out if there is a method to reverse this sort order for a single queue. The documentation says the sort key can only be applied globally. Does anyone know if it’s possible to affect node sorting for a single queue only? If not, is it possible to alter the node selection method with a hook? Or is there any other workaround that produces similar behavior?

Thank you,

Will

Node sorting is indeed global. There is no way to sort nodes differently between queues. What is possible is the use of placement sets. When you use placement sets, you can force the scheduler to run a job on only one generation of nodes. This can be done on a per-queue basis.

The way to do this is to create a new node level resource (let’s call it model) and set it on each of your nodes. You should set it to a string representing the model. You then set ‘node_group_key=model’ and ‘node_group_enable=True’ on your queue.

After this, jobs should only run on one model. The only caveat is that a job that can’t fit on the largest model will run across all nodes.
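As a rough sketch of the steps above, here is a small Python helper that prints the qmgr commands you would run. The node names, model labels, and the queue name workq are made up for illustration. One assumption to check against your version's reference guide: as far as I know node_group_enable is a server-level attribute while node_group_key can be set per queue, so the sketch sets the former on the server.

```python
def placement_set_commands(node_models, queue, resource="model"):
    """Build qmgr command lines that label nodes and enable placement sets.

    node_models: dict mapping node name -> model label (hypothetical values).
    queue: name of the queue that should group nodes by the resource.
    resource: name of the custom node-level string resource.
    """
    # Create the custom host-level string resource.
    cmds = [f'qmgr -c "create resource {resource} type=string,flag=h"']
    # Label every node with its model.
    for node, model in sorted(node_models.items()):
        cmds.append(
            f'qmgr -c "set node {node} resources_available.{resource}={model}"'
        )
    # Assumption: node_group_enable is set at the server level.
    cmds.append('qmgr -c "set server node_group_enable=True"')
    # Group nodes by the resource for this queue only.
    cmds.append(f'qmgr -c "set queue {queue} node_group_key={resource}"')
    return cmds


if __name__ == "__main__":
    nodes = {"n001": "gen1", "n002": "gen1", "n101": "gen2"}
    for line in placement_set_commands(nodes, "workq"):
        print(line)
```

Review the printed commands before running them as a PBS manager; nothing here talks to the server itself.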

Bhroam

To add to what Bhroam is saying about placement sets: they are sorted smallest to largest, so you could add “dummy” offline nodes with the label of your oldest nodes to make that set the largest, then add more to make the next oldest the second largest set, and so on, so that the newest nodes end up in the smallest set. The scheduler would try to fit the job on the newest nodes (smallest set) first, then the next newest (next smallest set), and so on.
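To work out how many dummy nodes each model needs, something like the following could help. It is only a sketch: it assumes set size is compared by node count (the scheduler may weigh other resources), and the model names and counts are invented for the example.

```python
def dummy_nodes_needed(real_counts):
    """Return how many dummy nodes to add per model.

    real_counts: list of (model, real_node_count), ordered newest first.
    The result pads each older model so that its total node count is
    strictly larger than that of the next newer model, which leaves the
    newest model as the smallest placement set.
    """
    padding = {}
    previous_total = -1  # the newest model is never padded
    for model, real in real_counts:
        # Each older set must be at least one node larger than the last.
        total = max(real, previous_total + 1)
        padding[model] = total - real
        previous_total = total
    return padding


if __name__ == "__main__":
    # Hypothetical cluster: newest generation first.
    cluster = [("gen3", 40), ("gen2", 24), ("gen1", 16)]
    for model, extra in dummy_nodes_needed(cluster).items():
        print(f"{model}: add {extra} dummy offline node(s)")
```

Remember to rerun the numbers whenever real nodes are added or retired, since that can change the ordering of the sets.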

Thank you, Bhroam and Sam. I will implement this and test.

Just an observation: this is one of the primary use cases for the long-standing “per-queue scheduling policy” RFE.

A new feature covering this would be great. In the meantime, this method worked for my use case.

Thanks all,

-Will