Node grouping - config problems

Hello Bjarne,
Let me provide some assistance. I’ll refer to A-D in your original post.

First off, the flag you want to set is do_not_span_psets (qmgr -c 's sched do_not_span_psets=True'). With it set, the scheduler will not let a job span placement sets. A job that cannot fit inside a single placement set gets a 'Can Never Run' comment instead. This enforces A.

I’d advise against using the compute/io resources. Users who want compute nodes can just request the placement set resource (nodetype in your case): qsub -l select=N:nodetype=compute. You could also set default_chunk.nodetype=io, but doing so will break D. The placement set sort should be sufficient on its own, so setting the default_chunk should not be required. This enforces B.

PBS enforces a smallest set first sort on placement sets. There is no way to change this sort. Now with that being said, you can play with the scheduler’s world view a bit and affect the sort. This is what wgy was getting at. You create fake vnodes (with ncpus and memory) and then offline them. They won’t be used, but they will be considered in the calculation for the placement set sort. If at any point your io placement set grows larger than your compute set, you can create fake vnodes and add them to the compute set. This enforces C and D.
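
To make the fake vnode trick a bit more concrete, here is a toy Python model of the sort. It is not PBS code, just a sketch: it assumes set size is compared by total ncpus only (memory is ignored for brevity), and the Vnode class, the function names and the node counts are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Vnode:
    name: str
    nodetype: str          # value of the placement set resource
    ncpus: int
    offline: bool = False  # offline vnodes never run jobs


def pset_order(vnodes):
    """Return nodetype values ordered smallest placement set first."""
    totals = {}
    for v in vnodes:
        # Assumption of this model: offline vnodes still count
        # toward the size of their placement set.
        totals[v.nodetype] = totals.get(v.nodetype, 0) + v.ncpus
    # Sort by total ncpus, breaking ties by name for a stable result.
    return sorted(totals, key=lambda k: (totals[k], k))


def runnable(vnodes):
    """Vnodes that can actually run jobs (everything not offline)."""
    return [v for v in vnodes if not v.offline]


# Hypothetical cluster: the io set has grown larger than the compute set.
real = [Vnode(f"c{i}", "compute", 16) for i in range(4)] + \
       [Vnode(f"io{i}", "io", 16) for i in range(6)]

# Fake, offlined vnodes padded onto the compute set.
fakes = [Vnode(f"fake{i}", "compute", 16, offline=True) for i in range(3)]

print(pset_order(real))          # compute is the smaller set here
print(pset_order(real + fakes))  # padding makes io the smaller set
```

The fake vnodes change the order of the sets but never show up in runnable(), which is the whole point: they only influence the sort.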

Your io/compute booleans were not working because of the way you requested them. qsub -l resource=True (for example qsub -l io=True) is a job-wide request, so the resource is checked at the queue or server level. I suspect you didn’t set a resources_available.io at the queue level, which would default the boolean to false. The correct way to make your request is qsub -l select=N:io=True. I’d also advise against using the old nodes syntax. The select syntax is much more powerful.

I hope this helps!
Bhroam
