I’d like to do a partial oversubscription with MPMD, like this:
Request 40 chunks in total
First group of 32 chunks with 128 cores per chunk, 1 MPI rank per core, 1 OMP thread per MPI rank
Second group of 8 chunks with 128 cores per chunk, 64 MPI ranks per chunk, 1 OMP thread per MPI rank
I tried to launch the MPMD job with these two combinations:
1. mpiexec -n 4096 A : -n 512 --npernode 64 B
2. mpiexec -n 4096 A : -n 512 --map-by ppr:64:node B
Neither worked.
It looks like OpenMPI just launches B with 128 MPI processes per node instead of 64. I wonder whether this is due to the list of nodes in PBS_NODEFILE: with select=40:ncpus=128:mpiprocs=128, the file contains 128 x 40 = 5120 entries, i.e. 128 per node for all 40 nodes.
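To check what the node file actually contains, the entries per host can be tallied. This is only a minimal sketch: count_slots is a helper name made up for illustration, and inside a job it would be called on "$PBS_NODEFILE".

```shell
# Tally how many slots (lines) each host has in a PBS-style node file.
# count_slots is a hypothetical helper, not part of PBS or Open MPI.
count_slots() {
    # sort groups identical hostnames, uniq -c counts each group,
    # awk reorders the columns to "hostname count".
    sort "$1" | uniq -c | awk '{print $2, $1}'
}

# Inside a job script (assumes PBS has set PBS_NODEFILE):
#   count_slots "$PBS_NODEFILE"
```

With select=40:ncpus=128:mpiprocs=128, every one of the 40 hosts should show a count of 128, which would be consistent with B ending up with 128 processes per node.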
Is there an alternative approach, such as placement parameters or PBS select directives that produce the expected PBS_NODEFILE, other than using a pre-launch wrapper script to amend the PBS_NODEFILE?
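To illustrate the kind of select directive I have in mind, here is an untested sketch. It assumes that PBS accepts two chunk groups joined with "+", and that OpenMPI takes the per-node slot counts from the PBS allocation and fills them in order. A and B are the same placeholder executables as above.

```shell
#!/bin/bash
# Untested sketch: two chunk groups with different mpiprocs values.
# 32 chunks x 128 ranks = 4096 slots, then 8 chunks x 64 ranks = 512 slots.
#PBS -l select=32:ncpus=128:mpiprocs=128+8:ncpus=128:mpiprocs=64

# Show the per-host slot counts PBS actually generated.
sort "$PBS_NODEFILE" | uniq -c

# Assumption: with slots filled in node-file order, A lands on the first
# 32 nodes (128 ranks each) and B on the last 8 nodes (64 ranks each).
mpiexec -n 4096 A : -n 512 B
```

If this is the right direction, the node file would have 4096 + 512 = 4608 entries instead of 5120, and no per-app placement flags would be needed.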