I’m testing the PBS Pro Open Source version on CentOS 6, and it works fine.
Altair’s PDF “Scheduling Jobs onto NVIDIA Tesla GPU Computing Processors using PBS Professional” (October 2010) says: “Note also that PBS Professional allocates GPUs, but doesn’t bind jobs to any particular GPU; the application itself (or the CUDA library) is responsible for the actual binding.”
Is this behavior (binding jobs to a particular GPU) the same in the latest PBS Pro Open Source version? With multiple GPUs in one machine, how can we give each job exclusive use of a GPU? Is there a way to have PBS set CUDA_VISIBLE_DEVICES automatically?
Thanks for trying out PBS Professional. In the 13.0 commercial release we provided the ability to set the CUDA_VISIBLE_DEVICES variable in our limited-availability cgroups hook and in the pbs_attach command. I am currently working on contributing the cgroups hook to the open source community; if all goes well, it will be available in the master branch by the end of the month. This should give you the ability to set CUDA_VISIBLE_DEVICES for single-node jobs. However, due to how it was implemented, we decided not to release the updated pbs_attach command to the community until we can rework it to conform to our architectural standards. We hope to address the issues with pbs_attach in the next major release of PBS Professional in Q1 of next year. The ticket for this work is PP-278, if you are interested in watching or working on it.
Note that even without the pbs_attach rework you can still use the CUDA integration, either if you don’t use multi-host jobs or if you use pbs_tmrsh rather than pbs_attach. You only need pbs_attach when you spawn remote processes directly via ssh (and some libraries, e.g. Intel MPI, let you supply your own remote-process launcher with rsh semantics, in which case you can point them at pbs_tmrsh and let MoM spawn the remote processes directly).
Of course this doesn’t change the fact that you need more than just Open Source PBS Pro; you still need some type of hook to manage which resources are bound to which job.
OTOH, it’s possible to use something simpler than the cgroup hook if you roll your own: you can create extra vnodes in v2 configuration files that own specific GPUs, and then parse the job’s exec_vnode to find out which devices to use.
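To make that concrete, here is a minimal sketch of one way to do the parsing. It assumes a hypothetical site convention where each GPU-owning vnode is named <host>-gpuN (declared in your v2 vnode configuration, along the lines of the Altair paper), and it reads the job’s exec_vnode from qstat -f. The helper and the naming convention are my own, not anything shipped with PBS:

```python
#!/usr/bin/env python
# pick_gpus.py -- hypothetical helper: map a job's exec_vnode to GPU indices.
# Assumes a site convention where each GPU-owning vnode is named <host>-gpuN,
# e.g. exec_vnode = "(node1-gpu0:ncpus=4:ngpus=1)+(node1-gpu1:ncpus=4:ngpus=1)"
import os
import re
import subprocess

def job_exec_vnode(jobid):
    """Return the exec_vnode string reported by 'qstat -f <jobid>'."""
    out = subprocess.check_output(["qstat", "-f", jobid])
    # qstat -f wraps long values; join continuation lines before matching.
    text = out.decode().replace("\n\t", "")
    m = re.search(r"exec_vnode = (\S+)", text)
    return m.group(1) if m else ""

def gpu_indices(exec_vnode, host):
    """Extract the GPU indices assigned on this host from exec_vnode."""
    indices = []
    for chunk in exec_vnode.strip("()").split(")+("):
        vnode = chunk.split(":", 1)[0]          # e.g. "node1-gpu0"
        m = re.match(r"%s-gpu(\d+)$" % re.escape(host), vnode)
        if m:
            indices.append(m.group(1))
    return indices

if __name__ == "__main__":
    jobid = os.environ["PBS_JOBID"]
    host = os.uname()[1].split(".")[0]
    print(",".join(gpu_indices(job_exec_vnode(jobid), host)))
```

A job script or /etc/profile.d snippet could then do something like export CUDA_VISIBLE_DEVICES=$(pick_gpus.py) before launching the application.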
But if you have more than one socket and GPUs on both sockets, you’ll also want processes contained on particular cores. If you want all of that to work “automagically”, you’ll indeed need something just as complex as the cgroup hooks, unless you are prepared to also use numactl in /etc/profile.d scripts or in your job scripts to figure out which CPUs and memory go with the GPUs assigned to the job (see the sketch below). That is not that hard in practice; it’s a lot easier to write a set of hooks that works for your particular use case than to write the general cgroup hook with NUMA and devices support.
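As a rough sketch of that “roll your own” direction, the fragment below takes the GPU indices assigned to the job and builds a numactl prefix that binds CPUs and memory to the NUMA nodes hosting those GPUs. The GPU-to-NUMA-node table is a made-up site map (you could generate it once from nvidia-smi topo -m or sysfs), and none of this is part of PBS itself:

```python
# numa_bind.py -- hypothetical sketch: derive a numactl prefix from the GPUs
# assigned to a job, so processes run on the socket closest to their GPU.

# Assumed site-specific map of GPU index -> NUMA node; generate this for your
# own hardware, e.g. from "nvidia-smi topo -m" output.
GPU_NUMA_NODE = {0: 0, 1: 0, 2: 1, 3: 1}

def numactl_prefix(gpu_indices):
    """Build a numactl command prefix binding CPUs and memory to the NUMA
    nodes that host the job's GPUs."""
    nodes = sorted({GPU_NUMA_NODE[int(i)] for i in gpu_indices})
    node_list = ",".join(str(n) for n in nodes)
    return ["numactl", "--cpunodebind=%s" % node_list,
            "--membind=%s" % node_list]

if __name__ == "__main__":
    # Example: job was assigned GPUs 2 and 3 (second socket) -> prints
    # "numactl --cpunodebind=1 --membind=1"
    print(" ".join(numactl_prefix(["2", "3"])))
```

In a job script you would prepend that prefix to the application command after setting CUDA_VISIBLE_DEVICES.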