I have created a GPU queue called “qgpu_1day”. The cluster has two GPU nodes, each with 4 V100 cards and 48 CPU cores.
I want to modify the GPU queue so that users can use the GPU nodes only if they specify the number of GPUs in the job script.
If the job script doesn’t declare any GPUs, the job should be rejected at submission. Only GPU jobs should use the GPU queue.
Hi Adarsh,
I’ve created a resource called ngpus and assigned it to the GPU queue.
However, user jobs still run without declaring GPUs in the job script, and they shouldn’t.
qmgr -c "create resource nodetype type=string_array,flag=h"
Add nodetype to the resources: line in $PBS_HOME/sched_priv/sched_config, then signal the scheduler to re-read its configuration:
resources: "ncpus.......,aoe, nodetype"
kill -HUP <PID of the pbs_sched>
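If you would rather script the sched_config edit than do it by hand, here is a minimal sketch. The function name add_sched_resource is hypothetical (not a PBS command), and it assumes the resources: line is a single double-quoted, comma-separated list as shown above. It keeps a .bak copy of the file and does nothing if the resource is already listed.

```shell
# Hypothetical helper (not part of PBS): append a resource name to the
# quoted list on the "resources:" line of a sched_config file.
# Assumes the list sits on one line inside double quotes.
add_sched_resource() {
    file="$1"
    res="$2"
    # Already listed? Then leave the file untouched.
    if grep -Eq "^resources:.*[\" ,]${res}[\" ,]" "$file"; then
        return 0
    fi
    # Insert ", <res>" before the closing quote; the original is saved as <file>.bak
    sed -i.bak -E "s/^(resources:[[:space:]]*\"[^\"]*)\"/\1, ${res}\"/" "$file"
}

# Possible usage on the server host (run as root, check the result before HUP):
#   add_sched_resource "$PBS_HOME/sched_priv/sched_config" nodetype
#   kill -HUP <PID of the pbs_sched>
```

Commented-out lines such as "#resources:" are not matched, so only the active line is changed.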
qmgr -c "create queue gpuq queue_type=e,started=t,enabled=t"
qmgr -c "set queue gpuq default_chunk.nodetype=gpunode"
qmgr -c "create queue cpuonlyq queue_type=e,started=t,enabled=t"
qmgr -c "set queue cpuonlyq default_chunk.nodetype=cpunode"
for i in <list of GPU nodes>; do qmgr -c "set node $i resources_available.nodetype=gpunode"; done
for i in <list of CPU-only nodes>; do qmgr -c "set node $i resources_available.nodetype=cpunode"; done
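The two loops above can be wrapped in a small function so you can preview the qmgr commands before running them. This is only a sketch: tag_nodes and the DRY_RUN variable are hypothetical names, and the node names in the example calls are placeholders, not the real hosts in your cluster.

```shell
# Hypothetical helper: set resources_available.nodetype on a list of nodes.
# With DRY_RUN=1 (the default here) it only prints the qmgr commands;
# set DRY_RUN=0 to actually run them.
tag_nodes() {
    nodetype="$1"
    shift
    for node in "$@"; do
        cmd="set node $node resources_available.nodetype=$nodetype"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            printf 'qmgr -c "%s"\n' "$cmd"
        else
            qmgr -c "$cmd"
        fi
    done
}

# Placeholder node names; replace with your own.
tag_nodes gpunode gpu01 gpu02
tag_nodes cpunode cpu01 cpu02 cpu03
```

Once the printed commands look right, rerun with DRY_RUN=0 on the PBS server host.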
Also, you can have a workq restricted by a user ACL, and jobs in this queue can be allowed to run on any node:
qmgr -c "set queue workq default_chunk.nodetype=all"
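for i in <all the nodes in the cluster>; do qmgr -c "set node $i resources_available.nodetype=all"; done
Note that a plain assignment like this replaces the gpunode/cpunode value set earlier. Because nodetype is a string_array, a node can hold several values, so you would probably want both on each node, e.g. resources_available.nodetype="gpunode,all" on the GPU nodes.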
Hope this helps