I’m trying to set up Multi-Instance GPU (MIG) on an HPC cluster in order to increase its throughput. Let’s say I use MIG to create 7 GPU instances (1g.10gb, NVIDIA A100 80GB) and try to run 8 jobs using GPUs. Does OpenPBS automatically manage which job goes to which GPU instance (without the need to specify the UUID) and place the 8th job in the queue?
Also, is any additional configuration needed to be able to use MIG?
Please refer to:
https://openpbs.atlassian.net/wiki/spaces/PD/pages/2313453569/Nvidia+MIG+Support
Yes, OpenPBS would handle the scheduling as described in your question.
The references above should help.
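To make the expected behaviour concrete, here is a toy model of what the question describes: 7 MIG instances, 8 jobs that each need one instance, and the 8th job waiting until an instance is released. This is only an illustration, not OpenPBS code. The instance names (`MIG-0` ... `MIG-6`) are hypothetical placeholders for the real MIG UUIDs, and the first-come-first-served order is an assumption; the actual order depends on your scheduler policy.

```python
from collections import deque


def schedule(jobs, instances):
    """Give each job a free instance in order; queue jobs that do not fit."""
    free = deque(instances)
    running = {}      # job name -> instance it runs on
    queued = deque()  # jobs waiting for an instance
    for job in jobs:
        if free:
            running[job] = free.popleft()
        else:
            queued.append(job)
    return running, queued, free


def finish(job, running, queued, free):
    """Release the instance held by a finished job and start the next queued job."""
    instance = running.pop(job)
    if queued:
        running[queued.popleft()] = instance
    else:
        free.append(instance)


# Hypothetical placeholders for the 7 MIG instance UUIDs (1g.10gb each).
instances = [f"MIG-{i}" for i in range(7)]
jobs = [f"job{i}" for i in range(1, 9)]

running, queued, free = schedule(jobs, instances)
print(len(running), list(queued))  # 7 ['job8']

finish("job1", running, queued, free)
print(running["job8"])  # MIG-0
```

The point is that the job never names an instance itself: it only asks for one GPU, and the assignment is done for it.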
Refer to the guide: https://help.altair.com/2024.1.0/PBS%20Professional/PBS2024.1.pdf and the Nvidia MIG Support page.