I am running the WRF-4.2 weather model with openmpi-3.1.2 under PBS (pbs_version = 19.1.1), and I'd like to test my 10GbE interconnect. Until now I had been relying on backward compatibility with the old syntax; for example, -l nodes=1:ppn=8 worked. Now I can select nodes connected to the 10GbE switch using -l select=1:ncpus=8:switch=10gBE.
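For context, a stripped-down version of my submission script looks like the following (job name, walltime, and binary path are just placeholders; the select line is the part in question):

    #!/bin/bash
    #PBS -N wrf_test
    #PBS -l select=1:ncpus=8:switch=10gBE
    #PBS -l walltime=01:00:00

    cd $PBS_O_WORKDIR
    mpirun ./wrf.exe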
However, I find that I now must supply mpiprocs in my select statement, and I'm not sure whether that is required only when I use all the CPU cores on a node. This worked:
These two caused PBS to start the job, but WRF failed:
-l select=2:ncpus=32:mpiprocs=64:switch=10gBE
-l select=4:ncpus=32:mpiprocs=128:switch=10gBE
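In case it helps with the diagnosis: my working assumption is that mpiprocs controls how many times each chunk's host is listed in $PBS_NODEFILE, which is what mpirun reads to decide where to place ranks. A quick check I can add to the job script to see what PBS actually hands to Open MPI:

    # Show how many MPI rank slots PBS assigned to each host
    echo "Rank slots per node:"
    sort $PBS_NODEFILE | uniq -c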
Both runs aborted with MPI_COMM_WORLD errors in MPI_Cart_create. I searched this forum, the WRF forum, and the PBS Admin Guide (https://www.altair.com/pdfs/pbsworks/PBSAdminGuide19.2.3.pdf), and found mentions of mpiprocs but no definition.
What is it used for, and what should I be setting it to? The total number of CPU cores I want to use? The maximum per node? 1 per job (if it means the “mother” thread)?
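If mpiprocs turns out to mean MPI ranks per chunk rather than per job, then I would guess my failing 64-rank request should have been written as something like

    -l select=2:ncpus=32:mpiprocs=32:switch=10gBE

with mpirun then launching 2 x 32 = 64 ranks, but I would rather have that confirmed before burning more node-hours.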