PBS single execution host: run job using CPU and GPU

rajesh70530 · May 3, 2021, 9:16am

Hi
I am running a job on a single execution host with 20 processors, an Nvidia RTX5000 GPU (CUDA), and 32 GB of RAM. When I use PBS -l select=1:ncpus=7:mpiprocs=7:mem=14GB,ngpus=1, the error "unknown resource ngpus" is shown. The job runs on the CPU processors only, but I want to use both CPU and GPU. I need to increase the processing speed by assigning both CPU and GPU to the job. When I check qstat -Bf, the resource ngpus does not appear there. How can I use the GPU for a numerical model simulation?

adarsh · May 3, 2021, 8:07pm

Please create an ngpus host-level resource and configure the node as below:

qmgr -c "create resource ngpus type=long,flag=nh"
Add ngpus to the resources: line of the $PBS_HOME/sched_priv/sched_config file
e.g.: resources: "ncpus, aoe, …,ngpus"
kill -HUP
qmgr -c "set node NODENAME resources_available.ngpus=1"
NODENAME = replace with your compute node hostname
Here I have assigned it to 1, which means one GPU card; if you have more, assign accordingly.
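
The sched_config edit in the steps above can be sketched as a small shell helper. This is only an illustration: add_sched_resource is a hypothetical name, and the helper assumes the resources: line is a single line ending in a double-quoted, comma-separated list, as in the example above. Try it on a copy of the file first.

```shell
#!/bin/sh
# Hypothetical helper (not part of PBS Pro): append a resource name to the
# quoted list on the "resources:" line of a sched_config-style file.
# Assumption: the list sits on one line and ends with a closing double quote.
add_sched_resource() {
    file=$1
    name=$2
    # If the name is already listed on the resources: line, do nothing.
    if grep -Eq "^resources:.*[\" ,]${name}[\" ,]" "$file"; then
        return 0
    fi
    # Otherwise insert ", NAME" just before the closing double quote.
    sed "/^resources:/ s/\"[[:space:]]*\$/, ${name}\"/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Example call (path taken from the post above; run as root on the server):
#   add_sched_resource "$PBS_HOME/sched_priv/sched_config" ngpus
```

The helper only edits the file. The scheduler still has to be told to re-read its configuration, which is what the kill -HUP step is for.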

Please note that defining and requesting resources via qsub in PBS Pro helps schedule jobs onto the compute nodes (requests are matched against the resources available at that time). PBS Pro does not force the underlying application to use 1 ncpu and 1 ngpu. The application that you use must itself be capable of utilizing both CPU(s) and GPU(s). The request statement via qsub tells PBS Pro what kind of resources your job requires; PBS Pro then searches for such resources in your cluster and schedules the job onto them.

Note:

qsub -l select=1:ncpus=1 -- /path/to/myapplication/runprogram -np 1 -ngpus 1 -input inputfile.fem
Here I have not requested the GPU, but in the application command line I have asked the application to use the GPU. PBS Pro would still run this job, and the application can utilise the GPU(s).
In this case PBS Pro does not know that this job has occupied the GPU on that node.

qsub -l select=1:ncpus=1:ngpus=1 -- /path/to/myapplication/runprogram -np 1 -ngpus 1 -input inputfile.fem
Here PBS Pro knows that the job running on the node is using that GPU, so any further request for this GPU will keep the requesting job in the queue until this job has finished.
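
For a batch script rather than a one-line qsub, the second form might look roughly like the sketch below. It reuses the resource amounts from the original question, but attaches ngpus to the chunk with a colon, as in the qsub example above. The job name and the describe_allocation function are made up for illustration. Whether CUDA_VISIBLE_DEVICES is set for the job depends on how the site has configured PBS, so treat that part as an assumption.

```shell
#!/bin/sh
#PBS -N gpu_model_run
#PBS -l select=1:ncpus=7:mpiprocs=7:mem=14gb:ngpus=1

# Hypothetical helper: print what the job appears to have been given.
# PBS_NODEFILE is written by PBS with one line per MPI rank (mpiprocs).
# CUDA_VISIBLE_DEVICES is an assumption: it is only set if the site has
# configured PBS (or the application wrapper) to export it.
describe_allocation() {
    if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
        ranks=$(wc -l < "$PBS_NODEFILE" | tr -d ' ')
    else
        ranks=0
    fi
    echo "mpi_ranks=$ranks cuda_visible=${CUDA_VISIBLE_DEVICES:-unset}"
}

describe_allocation

# The application itself must still be told to use the GPU, for example with
# the command line from the post above:
#   /path/to/myapplication/runprogram -np 1 -ngpus 1 -input inputfile.fem
```

Requesting ngpus only reserves the card for scheduling purposes; the application line at the end is what actually makes use of it.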

rajesh70530 · May 6, 2021, 11:36am

When I tried to create the ngpus host-level resource as you mentioned, error 15007 was shown:
$ qmgr -c 'create resource ngpus type=long,flag=nh'
qmgr obj=ngpus svr=default: Unauthorized Request
qmgr: Error (15007) returned from server

adarsh · May 6, 2021, 7:19pm

Please try the above commands as the root user. Also, please type them in by hand; copy and paste adds special characters in place of the double quote " or single quote '.
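
If retyping is inconvenient, the pasted text can be cleaned up before it is run. The following is a minimal sketch, assuming the only problem is typographic quotes; straighten_quotes is a hypothetical name, not a PBS tool.

```shell
#!/bin/sh
# Hypothetical filter: replace typographic (curly) quotes on stdin with
# plain ASCII quotes, so a pasted command can be inspected and reused.
straighten_quotes() {
    sed -e 's/“/"/g' -e 's/”/"/g' -e "s/‘/'/g" -e "s/’/'/g"
}

# Example: print the cleaned-up command instead of executing it.
#   printf '%s\n' 'qmgr -c “set node NODENAME resources_available.ngpus=1”' | straighten_quotes
```

Printing the result first, rather than piping it straight into a shell, makes it easy to check the command before running it as root.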

rajesh70530 · May 7, 2021, 6:03am

As root:
# ./opt/pbs/bin/qmgr -c 'create resource ngpus type=long,flag=h'
qmgr obj=ngpus svr=default: Duplicate entry in list
qmgr: Error (15055) returned from server
I typed the quotes manually; the error was still shown.
/# ./opt/pbs/bin/qmgr -c 'set node workstation resources_available.ngpus=1'
qmgr obj=workstation svr=default: Unknown node
qmgr: Error (15062) returned from server

adarsh · May 8, 2021, 9:38pm

The ngpus custom resource has already been added to the PBS Server.
qmgr: print resource ngpus # should give you the result

The compute node should be listed in /etc/hosts on the PBS Server and on the node itself.
It should have a static IP address and a resolvable hostname (reverse resolvable as well).
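
A quick way to check the /etc/hosts part is sketched below. host_in_hosts_file is a hypothetical helper; it only looks at the hosts file and says nothing about DNS or about which node names the PBS Server itself has defined.

```shell
#!/bin/sh
# Hypothetical helper: succeed if NAME appears as a hostname or alias in a
# hosts-format file (default /etc/hosts), ignoring comments.
host_in_hosts_file() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }                        # drop trailing comments
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Example, using the node name from the post above:
#   host_in_hosts_file workstation && echo "workstation is in /etc/hosts"
```

The same check would be run on both the PBS Server and the compute node, since the entry is expected in both places.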
