Regarding hard binding of resources

Please help me with the issue below. (Configuration attached.)

Point 1 - We have 7 old compute nodes with 24 cores each, running OpenPBS version 14.2. We created 4 queues for two groups, agarwal and indranil. At the PBS server level, the agarwal group is already limited to 96 cores and the indranil group to 72 cores.

Point 2 - We have now added 8 new compute nodes to the cluster, with 32 cores each. These nodes are for the agarwal group only, and we have created 3 new queues for that group.

My questions are:

a) How do I configure PBS so that jobs submitted by agarwal group users to the new queues run on the new nodes only? (Is this possible without binding nodes to queues? If binding is necessary, how do I bind all the new nodes to all the new queues?)

b) How do I configure PBS so that jobs submitted by agarwal group users to the old queues use at most 96 cores? These jobs should run on the old nodes, because the agarwal group limit is 96 cores under the old policy.

That is, how do I set a hard limit across the 4 old queues so that agarwal group users can use at most 96 cores in total?

Output of qmgr -c 'p s' for reference.

Create queues and set their attributes.

Create and define queue mini

create queue mini
set queue mini queue_type = Execution
set queue mini resources_max.ncpus = 48
set queue mini resources_max.walltime = 24:00:00
set queue mini resources_min.ncpus = 1
set queue mini acl_group_enable = True
set queue mini acl_groups = Agarwal
set queue mini acl_groups += Indranil
set queue mini enabled = True
set queue mini started = True

Create and define queue long

create queue long
set queue long queue_type = Execution
set queue long resources_max.ncpus = 24
set queue long resources_max.walltime = 120:00:00
set queue long resources_min.ncpus = 1
set queue long acl_group_enable = True
set queue long acl_groups = Agarwal
set queue long acl_groups += Indranil
set queue long enabled = True
set queue long started = True

Create and define queue extralong

create queue extralong
set queue extralong queue_type = Execution
set queue extralong resources_max.ncpus = 12
set queue extralong resources_max.walltime = 336:00:00
set queue extralong resources_min.ncpus = 1
set queue extralong acl_group_enable = True
set queue extralong acl_groups = Agarwal
set queue extralong acl_groups += Indranil
set queue extralong max_run = [u:PBS_GENERIC=1]
set queue extralong enabled = True
set queue extralong started = True

Create and define queue short

create queue short
set queue short queue_type = Execution
set queue short resources_max.ncpus = 96
set queue short resources_max.walltime = 72:00:00
set queue short resources_min.ncpus = 1
set queue short acl_group_enable = True
set queue short acl_groups = Agarwal
set queue short acl_groups += Indranil
set queue short enabled = True
set queue short started = True

Create and define queue smalln

create queue smalln
set queue smalln queue_type = Execution
set queue smalln resources_max.ncpus = 128
set queue smalln resources_max.walltime = 48:00:00
set queue smalln resources_min.ncpus = 1
set queue smalln acl_group_enable = True
set queue smalln acl_groups = Agarwal
set queue smalln enabled = True
set queue smalln started = True

Create and define queue longn

create queue longn
set queue longn queue_type = Execution
set queue longn resources_max.ncpus = 64
set queue longn resources_max.walltime = 96:00:00
set queue longn resources_min.ncpus = 1
set queue longn acl_group_enable = True
set queue longn acl_groups = Agarwal
set queue longn enabled = True
set queue longn started = True

Create and define queue extralongn

create queue extralongn
set queue extralongn queue_type = Execution
set queue extralongn resources_max.ncpus = 64
set queue extralongn resources_max.walltime = 144:00:00
set queue extralongn resources_min.ncpus = 1
set queue extralongn acl_group_enable = True
set queue extralongn acl_groups = Agarwal
set queue extralongn enabled = True
set queue extralongn started = True

Set server attributes.

set server scheduling = True
set server max_run_res.ncpus = [g:Agarwal=96]
set server max_run_res.ncpus += [g:Indranil=72]
set server max_run_res.ncpus += [u:PBS_GENERIC=96]
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server resources_default.ncpus = 1
set server default_chunk.ncpus = 1
set server scheduler_iteration = 600
set server resv_enable = True
set server node_fail_requeue = 310
set server max_array_size = 10000
set server pbs_license_min = 0
set server pbs_license_max = 2147483647
set server pbs_license_linger_time = 31536000
set server license_count = Avail_Global:1000000 Avail_Local:1000000 Used:0 High_Use:0 Avail_Sockets:1000000 Unused_Sockets:1000000
set server eligible_time_enable = False
set server max_concurrent_provision = 5

* qmgr -c "create resource group_name type=string_array,type=h"
* qmgr -c "set queue QUEUENAME default_chunk.group_name=agarwal"
* Edit $PBS_HOME/sched_priv/sched_config:
     -  find the resources: line ("ncpus, aoe.......,group_name") and append group_name to the end of that line
     -  make the scheduler reread the file: kill -HUP <PID of the pbs_sched process>
* for i in <list of agarwal nodes>; do qmgr -c "set node $i resources_available.group_name=agarwal"; done

For the old queues, do the same as above, but assign a different group_name value to the default_chunk of each old queue, and append that value to the old nodes:
qmgr -c "set node NODENAME resources_available.group_name += agarwalold" # note += in this line to append

Queue limits can be used for this. Please see the section "Examples of Setting Server and Queue Limits" in the PBS Pro admin guide.

Hi Adarsh,
The command you suggested is not working.

[root@gananamaster ~]# qmgr -c "create resource group_name type=string_array,type=h"
qmgr obj=group_name svr=default: Illegal attribute or resource value
qmgr: Error (15014) returned from server

Sorry, correction (flag=h, not type=h):
qmgr -c "create resource group_name type=string_array,flag=h"
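
Putting the steps above together, here is a minimal dry-run sketch. It only prints the qmgr commands so they can be reviewed before anything is applied. The node names (newnode01, newnode02, oldnode01) and the helper functions emit and tag_nodes are made up for illustration, and it assumes the old-queue case uses += as described earlier in the thread.

```shell
#!/bin/sh
# Dry run: prints qmgr commands instead of executing them.

# Print one qmgr invocation.
emit() { printf 'qmgr -c "%s"\n' "$1"; }

# tag_nodes OP GROUP "QUEUES" "NODES"
#   OP is "=" to set a node's group_name or "+=" to append to it.
tag_nodes() {
    op=$1; group=$2; queues=$3; nodes=$4
    for q in $queues; do
        emit "set queue $q default_chunk.group_name=$group"
    done
    for n in $nodes; do
        emit "set node $n resources_available.group_name $op $group"
    done
}

# One-time creation of the host-level resource (flag=h, per the correction).
emit "create resource group_name type=string_array,flag=h"

# New queues -> new nodes (node names are hypothetical).
tag_nodes "=" agarwal "smalln longn extralongn" "newnode01 newnode02"

# Old queues -> old nodes with a different value (node name is hypothetical).
tag_nodes "+=" agarwalold "mini short long extralong" "oldnode01"
```

The sched_config edit and the HUP of pbs_sched are not covered by this sketch and still have to be done by hand. Once the printed commands look right, they can be run one by one on the PBS server host.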

Dear Adarsh,

Thank you for your reply.

I have checked the PBS Pro admin guide, and as far as I can see a limit can only be applied to a single queue at a time.

But I need my 4 old queues (mini, short, extralong, long) to share a combined limit at the group or user level.

Across the 4 old queues (mini, short, extralong, long), the combined limit for the agarwal group should be 96 cores.

Please let me know if there is any way to do this.

You are welcome!

You can set the limit at the server level for users. For example, to limit the number of running jobs per user:

Generic users = 4
pbsuser01 = 2
pbsuser02 = 6
qmgr: set server max_run += "[u:PBS_GENERIC=4], [u:pbsuser01=2], [u:pbsuser02=6]"

To limit the agarwal group to 6 running jobs:
qmgr: set server max_run = "[g:agarwal=6]"

Please check the server and queue limits chapter in the PBS Pro admin guide.
Hope this is helpful.
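
Note that max_run counts running jobs, while the question was about cores. A core-based limit uses max_run_res.ncpus, as in the server settings posted at the top of this thread. The sketch below is a dry run that prints the per-queue form of that limit. The helper names emit and group_core_limit are made up for illustration, and it assumes the queue-level max_run_res attribute behaves as described in the limits chapter of the admin guide.

```shell
#!/bin/sh
# Dry run: prints qmgr commands instead of executing them.

# Print one qmgr invocation.
emit() { printf 'qmgr -c "%s"\n' "$1"; }

# group_core_limit GROUP CORES QUEUE...
#   Emit a per-queue cap on the cores a group's running jobs may use.
group_core_limit() {
    group=$1; cores=$2; shift 2
    for q in "$@"; do
        emit "set queue $q max_run_res.ncpus = [g:$group=$cores]"
    done
}

# The four old queues from the posted configuration.
group_core_limit Agarwal 96 mini short long extralong
```

Each queue limit is enforced on its own, so this does not by itself give one combined 96-core cap across the four queues. The server-level limit is the one that applies across queues, but it covers the new queues as well.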

Dear Adarsh,

Please help me with the issue below.

We have 15 compute nodes.
We have configured 7 queues: small, long, extralong, mini, smalln, longn and extralongn.
The smalln, longn and extralongn queues are bound to 8 nodes. When a job is submitted to smalln, longn or extralongn, it goes to the bound nodes.

My question is

The qstat command does not show the running time of jobs in the smalln, longn and extralongn queues. For the other queues it shows the job running time correctly.

Could you please try these commands?
qstat -answ1
qstat -T
qstat -fx

Dear Adarsh,

Thank you for your reply.

qstat -answ1
qstat -T
qstat -fx

All of these commands work fine; only plain qstat does not show the running time. Do I need to change any configuration so that qstat shows the running time of jobs?

Thank you @narayan

Could you please share the output of the above commands with obfuscation?
Do all jobs request walltime?
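
To compare what the server has recorded for one job in a new queue against one in an old queue, a small filter over the qstat -f output may help. As far as I know, the Time Use column of plain qstat reports CPU time rather than walltime, so it is worth looking at both values. The function name used_times is made up for illustration, and JOBID is a placeholder.

```shell
#!/bin/sh
# Keep only the time-related fields from "qstat -f" output read on stdin.
used_times() {
    grep -E 'resources_used\.(walltime|cput)|Resource_List\.walltime'
}

# Usage on a live system (JOBID is a placeholder):
#   qstat -f JOBID | used_times
```

If resources_used.walltime is advancing but resources_used.cput is missing or stays at zero for jobs in the new queues, that would point at how CPU time is being reported from the new nodes rather than at qstat itself.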