I am following the directions for “Basic GPU Scheduling” in section 5.14.7.1.ii on page AG-284 of the 2021.1 Admin Guide. I ran the following sequence:
2021-10-05 14:05:04 /opt/pbs/bin/qmgr -c "create resource ngpus type=long, flag=nh"
aps-edge-dev-01:~ # /opt/pbs/bin/qmgr -c "print server" | grep ngpus
# Create and define resource ngpus
create resource ngpus
set resource ngpus type = long
set resource ngpus flag = hn
2021-10-05 14:10:25 service pbs stop
2021-10-05 14:13:38 vi /var/spool/pbs/sched_priv/sched_config
aps-edge-dev-01:~ # cat /var/spool/pbs/sched_priv/sched_config | grep ngpus
resources: "ncpus, mem, arch, host, vnode, aoe, eoe, ngpus"
2021-10-05 14:15:01 service pbs start
aps-edge-dev-01:~ # /opt/pbs/bin/qmgr -c "set node aps-edge-dev-01 ngpus=1"
qmgr obj=aps-edge-dev-01 svr=default: Undefined attribute
qmgr: Error (15002) returned from server
2021-10-05 14:20:20 /opt/pbs/bin/qmgr -c "set node aps-edge-dev-01 comment=testing"
aps-edge-dev-01:~ # pbsnodes -av | grep testing
comment = testing
Any ideas? The resource doesn’t need to be created on the node instead, does it? This is the first time I have used a consumable resource. Is there a step I am missing? I know you can do this with cgroups, but we don’t want to use that (yet).
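For comparison, here is a minimal sketch of the sequence above as a dry-run script. It rests on one assumption that the steps shown do not confirm: that the per-node GPU count is set under `resources_available` (as `resources_available.ngpus`) rather than as a bare `ngpus` attribute, which would be consistent with the "Undefined attribute" error. The `run` and `node_resource_cmd` helpers and the `RUN` variable are hypothetical names used only for this sketch; the paths and node name are the ones from the post.

```shell
#!/bin/sh
# Dry-run sketch: prints each command instead of running it.
# Set RUN=1 to actually execute. run(), node_resource_cmd() and RUN
# are hypothetical helpers for this example, not part of PBS.
QMGR=/opt/pbs/bin/qmgr
NODE=aps-edge-dev-01

# Execute the command when RUN=1, otherwise just print it
# (the printed form loses the original quoting).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Build the qmgr directive that sets a resource value on a node.
# Assumption: node-level resources are set under resources_available,
# not as a bare attribute such as "ngpus=1".
node_resource_cmd() {
    printf 'set node %s resources_available.%s=%s' "$1" "$2" "$3"
}

# 1. Define the custom host-level consumable resource on the server.
run "$QMGR" -c "create resource ngpus type=long, flag=nh"

# 2. Manual step, not automated here: add ngpus to the "resources:" line
#    in /var/spool/pbs/sched_priv/sched_config and restart PBS.

# 3. Set the GPU count on the node via resources_available.
run "$QMGR" -c "$(node_resource_cmd "$NODE" ngpus 1)"

# 4. List node attributes to check the result.
run pbsnodes -av
```

Run as-is, the script only prints the commands, so the `set node` line can be compared against the one that failed before anything is changed on the server.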
BTW, the PBS Pro documentation is inconsistent. Maybe this does not matter, but section 5.14.4.2.i does the set node before editing sched_config or restarting the server.