How can we guarantee that a group of users (or a project) always has a minimum of X nodes available?

As a working example, suppose we have 100 nodes and 2 groups, each of which must be guaranteed 10 nodes.

How could this be implemented in PBS?

Case 1:
1. Create two queues with ACLs
2. Use a Qlist custom resource to associate 10 nodes with each of the queues
3. Make sure the other 80 nodes are not part of this Qlist

Case 2:

  1. Create two reservation queues, each with 10 nodes assigned to it, and enable ACLs on those queues
  2. The rest of the nodes can be utilised by users in the group(s)

Case 3:

  1. Write hook(s) that keep track of the mapping between nodes, groups of users, and their jobs

Hope these cases help. There might be other custom solutions as well.

Hi Adarsh,
thanks for the reply.
Please consider the following scenario. How could we handle it?

In the example scenario, we have:

  • 2 specific user groups and the remaining mass of other users
  • The cluster size is 100 nodes
  • The 2 groups must have a guaranteed minimum capacity of 10 nodes
  • The job size of each job is 6 nodes
  • All users constantly have jobs waiting to be placed by the scheduler into the cluster
  • With the given job size of 6 nodes we have:
    • The pool for all users (whether in a specific group or not) is 80 nodes
      • 100 [nodes total] – 2 [number of groups] * 10 [nodes for each specific group] = 80 nodes
    • A maximum of 13 jobs (job size 6 nodes) could be placed for users, whether in a specific group or not:
      • 13 * 6 = 78 nodes
    • With the guaranteed 10 nodes, group A and B can always run a minimum of 1 job
    • This results in a remaining capacity for group A or B of 6 nodes, suitable to place 1 further job
      • 2 nodes from the pool for all and 4 nodes from the guaranteed pool, could support a further 6 node job
        • 80 nodes in total for all users; maximum utilization for all users is 78 nodes (see above), so 80 - 78 = 2 nodes
        • 10 nodes guaranteed for each special group; 1 job (6 nodes) fits completely in this limit => 10 - 6 = 4 nodes
      • => 1 additional job, whether from group A or from group B, could run in the cluster
        • In this scenario, 4 nodes would not be utilized, as these are in the pool of one of the special groups.
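The arithmetic above can be double-checked with a small Python sketch. This is only an illustration of the numbers in this scenario; the variable names are made up here and are not PBS settings or attributes. The "strictly partitioned" figure is derived from the same numbers, assuming no job is allowed to span the shared pool and a guaranteed pool.

```python
# Illustrative arithmetic only; none of these names are PBS configuration options.
TOTAL_NODES = 100
GROUPS = 2
GUARANTEED_PER_GROUP = 10
JOB_SIZE = 6

# Pool open to all users: 100 - 2 * 10 = 80 nodes.
shared_pool = TOTAL_NODES - GROUPS * GUARANTEED_PER_GROUP

# Whole jobs that fit in the shared pool, and the nodes left over.
shared_jobs, shared_left = divmod(shared_pool, JOB_SIZE)

# Whole jobs that fit in one guaranteed pool, and the nodes left over.
group_jobs, group_left = divmod(GUARANTEED_PER_GROUP, JOB_SIZE)

# Assumption: pools are strictly partitioned, so no job spans two pools.
partitioned_used = (shared_jobs + GROUPS * group_jobs) * JOB_SIZE

# One further job can combine the shared leftovers with one group's leftovers.
extra_jobs = 1 if shared_left + group_left >= JOB_SIZE else 0
target_used = partitioned_used + extra_jobs * JOB_SIZE

# Best case for 6-node jobs on the whole cluster, ignoring any guarantees.
upper_bound = (TOTAL_NODES // JOB_SIZE) * JOB_SIZE

print("shared pool:", shared_pool, "jobs:", shared_jobs, "left over:", shared_left)
print("per-group jobs:", group_jobs, "left over per group:", group_left)
print("strictly partitioned utilisation:", partitioned_used)
print("utilisation with one spanning job:", target_used)
print("upper bound for 6-node jobs:", upper_bound)
print("idle nodes at target:", TOTAL_NODES - target_used)
```

So the 96 nodes asked about here are also the most that 6-node jobs can occupy on 100 nodes, and reaching that figure requires one job to span the shared pool and one guaranteed pool.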

What kind of configuration could ensure a utilisation of 96 nodes in this scenario?

regards
deepak