Reverse node sort key in single queue

wtcolson · October 16, 2019, 3:37pm

Hello All,

I currently have a small cluster that has multiple generations of nodes available. Crossing generational boundaries is not an issue for us in many cases, so I have a node sort key set to prefer the newest nodes first in queues that span generations. I am trying to figure out if there is a method to reverse this sort order for a single queue. The documentation says the sort key can only be applied globally, does anyone know if it’s possible to affect node sorting only for a single queue somehow? If not, is it possible to alter the node selection method with a hook? Or any other workaround to create a similar behavior?

Thank you,

Will

bhroam · October 17, 2019, 12:17am

Node sorting is indeed global. There is no way to sort nodes differently between queues. What is possible is the use of placement sets. When you use placement sets, you can force the scheduler to run a job on only one generation of nodes. This can be done on a per-queue basis.

The way to do this is to create a new node-level resource (let’s call it model) and set it on each of your nodes, using a string value that represents the node’s model. You then set ‘node_group_key=model’ and ‘node_group_enable=True’ on your queue.
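As a rough sketch of the steps above, the following Python helper only builds the qmgr command strings; it does not run them. The node names, model values, and the queue name `workq` are hypothetical. The `type=string,flag=h` resource definition is an assumption about how a host-level string resource is usually created. The two queue settings are copied from the post as written, so check them against the documentation for your PBS version before applying anything.

```python
def qmgr_commands(node_models, queue, resource="model"):
    """Build qmgr command lines that label nodes and group a queue by that label.

    node_models maps a (hypothetical) node name to its model string.
    Nothing is executed here; the caller decides whether to run the commands.
    """
    # Assumption: a host-level string resource is created this way.
    cmds = [f"create resource {resource} type=string,flag=h"]
    # Label every node with its model, in a stable order.
    for node, value in sorted(node_models.items()):
        cmds.append(f"set node {node} resources_available.{resource}={value}")
    # Queue settings as described in the post.
    cmds.append(f"set queue {queue} node_group_key={resource}")
    cmds.append(f"set queue {queue} node_group_enable=True")
    return [f'qmgr -c "{c}"' for c in cmds]


if __name__ == "__main__":
    # Hypothetical two-generation cluster.
    for line in qmgr_commands({"n001": "gen1", "n101": "gen2"}, "workq"):
        print(line)
```

Printing the commands first makes it easy to review them by hand before running them as a PBS manager.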

After this, jobs should only run on one model. The only caveat is if a job is submitted that can’t fit on the largest model, it will run over all nodes.

Bhroam

smgoosen · October 17, 2019, 5:56pm

To add to what Bhroam is saying about placement sets: they are sorted smallest to largest, so you could add “dummy” offline nodes with the label of your oldest nodes to make that set the largest, then add more to make the next-oldest generation the second-largest set, and so on, ending up with the newest generation in the smallest set. The scheduler would try to fit the job on the newest nodes (smallest set) first, then the next newest (next smallest set), and so on.
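The padding idea can be illustrated with a toy model. This is not the scheduler's actual code: it assumes a set's size is its total node count (offline dummies included) and that a job fits when the set has enough online nodes. All node names and model labels are made up.

```python
from collections import defaultdict


def pick_set(nodes, wanted):
    """Return the model of the first placement set, smallest first, that fits.

    nodes is a list of (name, model, online) tuples; wanted is a node count.
    Returns None when no single set fits (the job would then span all nodes).
    """
    sets = defaultdict(list)
    for name, model, online in nodes:
        sets[model].append(online)
    # Assumption: sets are ordered by total size; ties broken by model name.
    for model in sorted(sets, key=lambda m: (len(sets[m]), m)):
        usable = sum(1 for online in sets[model] if online)
        if usable >= wanted:
            return model
    return None


# Two old nodes and four new nodes, all online.
real = [(f"old{i}", "gen1", True) for i in range(2)]
real += [(f"new{i}", "gen2", True) for i in range(4)]
# Three offline dummies labelled as the old generation.
dummies = [(f"dummy{i}", "gen1", False) for i in range(3)]

print(pick_set(real, 2))            # gen1: the old set is the smallest
print(pick_set(real + dummies, 2))  # gen2: padding made gen1 the largest
```

In this model the dummies change only the ordering; they never count toward the nodes a job can actually use.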

wtcolson · October 17, 2019, 11:26pm

Thank you, Bhroam and Sam. I will implement this and test.

sgombosi · October 22, 2019, 11:25pm

Just an observation: this is one of the primary use cases for the long-standing “per-queue scheduling policy” RFE.

wtcolson · October 28, 2019, 9:00pm

A new feature covering this would be great. In the meantime, this method worked for my use case.

Thanks all,

-Will
