@billnitzberg asked me to provide more use cases for this feature, based on what our users are doing now. Here’s a quick bit of background on how our users request resources; then I’ll get to the use cases. Users pick from one of roughly 7 primary node types using resources_available.model. If they need some specialized subset of a primary type, e.g. bigmem, they request that resource as well. In some cases we have different bigmem sizes, so a user might additionally specify resources_available.mem to request the minimum memory size needed. Then we have onesy/twosey special nodes that might have a different resources_available.model, etc.
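To make that concrete, a typical request of that kind might look something like the following (purely illustrative: the model name, memory size, and mpiprocs count are placeholders, and bigmem is shown as a boolean custom resource):
qsub -l select=4:model=ivy:bigmem=true:mem=256gb:mpiprocs=16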
uc3) Multi-chunk job that picks a higher-memory node for the head node and cheaper (lower accounting charge rate) nodes for the rest:
qsub -l select=1:model=bro:mpiprocs=4+100:model=san:mpiprocs=16
uc4) Multi-chunk job that picks all nodes of the same type, but places fewer MPI ranks on the head node:
qsub -l select=1:model=has:mpiprocs=8+199:model=has:mpiprocs=24
uc5) Multi-chunk job that picks a special onesy node for the last node of the job, and a different type for all other nodes:
qsub -l select=299:model=ivy+1:model=fasthas
Some observations:
a) In general, there will be several model types that a user considers “cheaper” as used in uc3, so a node_filter for that particular chunk could express the acceptable set (see the strawman sketch after this list).
b) For a request like uc4, we expect that users are more concerned with overall and per-core memory than with the particular model.
c) We have thought about trying to reformulate select statements to more exactly describe the MPI/mem ratio needed and have PBS parcel out ranks to match. This would potentially recast uc3 to something like:
qsub -l select=1:model=bro:mpiprocs=4+1600:ncpus=1:mpiprocs=1:gigs_per_core=2gb
and PBS could find nodes that match, with possible constraints like pbs.same mentioned earlier in the discussion.
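To tie observation a) back to uc3: as a purely illustrative strawman (this is a sketch of the intent, not proposed syntax), the cheaper tail chunks could carry a per-chunk filter naming the acceptable set instead of a single model:
qsub -l select=1:model=bro:mpiprocs=4+100:node_filter='model in (san,ivy,has)':mpiprocs=16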