I have a node with 8 GPUs. PBS is using cgroups to pull the list of available GPUs.
I’m trying to find a way to prevent PBS from making one of the GPUs available.
Back in the old days I could create a consumable resource with a list of the GPUs to hand out.
What is the current best practice for limiting access to a specific GPU?
We are looking to use MIG configs on some GPUs and leave the others working normally. We would like to create queues that point to specific GPU configurations, so that we can provide a variety of memory setups for experimentation.
Currently PBS sees all of the GPUs of all types and just hands them out. All GPUs are on a single machine.