GPU jobs to run

Hello All,

I have created a GPU queue called “qgpu_1day”. The cluster has two GPU nodes, each with 4 V100 cards and 48 CPU cores.
I want to modify the GPU queue so that users can use the GPU nodes only if they specify the number of GPUs in the job script.
If the job script doesn’t declare any GPUs, the job should be rejected at submission. Only GPU jobs should use the GPU queue.

There are many ways to achieve this:

  1. Bind multiple queues to a single node on the HPC cluster
  2. A queuejob hook that checks the qsub select statement and assigns the respective queue to the job
  3. Routing queues
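
For option 2, PBS hooks are written in Python, so the check itself can be sketched as plain Python. This is only a minimal sketch under assumptions: total_gpus and gpu_queue_allows are hypothetical helper names, the GPU resource is assumed to be called ngpus, and the queue name qgpu_1day is taken from the question. It deliberately does not import the pbs module, so it runs outside PBS. Inside a real queuejob hook, the select string and queue would typically be read from pbs.event().job, and a failed check would end in pbs.event().reject(...).

```python
def total_gpus(select, resource="ngpus"):
    """Sum the GPUs requested across all chunks of a PBS select statement.

    A select statement looks like "2:ncpus=4:ngpus=1+1:ncpus=8": chunks are
    joined by "+", and each chunk is an optional chunk count followed by
    colon-separated resource=value pairs.
    """
    total = 0
    for chunk in select.split("+"):
        parts = chunk.split(":")
        count = 1
        # A leading bare integer is the number of copies of this chunk.
        if parts and parts[0].isdigit():
            count = int(parts[0])
            parts = parts[1:]
        for part in parts:
            name, _, value = part.partition("=")
            if name.strip() == resource and value.strip().isdigit():
                total += count * int(value)
    return total


def gpu_queue_allows(select, queue, gpu_queue="qgpu_1day"):
    """Return True if a job with this select statement may enter `queue`.

    Only the GPU queue is restricted: jobs sent there must request at
    least one GPU. Jobs for any other queue are not checked here.
    """
    if queue != gpu_queue:
        return True
    return select is not None and total_gpus(select) > 0
```

With this, gpu_queue_allows("1:ncpus=12:ngpus=1", "qgpu_1day") passes, while a job sent to qgpu_1day with "1:ncpus=12" (or with no select statement at all) would be rejected. The parsing is simplified and ignores quoted values that contain ":" or "+".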

Hi Adarsh,
I’ve created a resource called ngpus and assigned it to the GPU queue.
However, user jobs still run in that queue without declaring GPUs in the job script, which shouldn’t happen.

qmgr -c "create resource nodetype type=string_array,flag=h"

Add nodetype to the resources: line in $PBS_HOME/sched_priv/sched_config:
resources: "ncpus.......,aoe, nodetype"

kill -HUP <PID of the pbs_sched>

qmgr -c "create queue gpuq  queue_type=e,started=t,enabled=t"
qmgr -c "set queue gpuq  default_chunk.nodetype=gpunode"

qmgr -c "create queue cpuonlyq  queue_type=e,started=t,enabled=t"
qmgr -c "set queue cpuonlyq  default_chunk.nodetype=cpunode"


# GPU_NODES and CPU_NODES are placeholders: space-separated lists of your node names
for i in $GPU_NODES; do qmgr -c "set node $i resources_available.nodetype=gpunode"; done
for i in $CPU_NODES; do qmgr -c "set node $i resources_available.nodetype=cpunode"; done

You can also have a workq, restricted by a user ACL, whose jobs are allowed to run on any node:

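qmgr -c "set queue workq  default_chunk.nodetype=all"
# ALL_NODES is a placeholder: a space-separated list of every node in the cluster
# += appends "all" to the string_array so each node keeps its gpunode/cpunode value
for i in $ALL_NODES; do qmgr -c "set node $i resources_available.nodetype+=all"; done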

Hope this helps

For a detailed description of how to route jobs, see section 4.9.39 on p. AG-207 in the PBS Professional Administrator’s Guide.