Submitting job across multiple nodes

I apologize in advance, I am an engineer, not a programmer.

I am trying to run a software program on a system that has 4 nodes with 32 cores each, and I want to use 30 cores on each node. This is a program I am very familiar with, but I have never tried to split a job across nodes. The program generates the .ksh file that I submit as a job, but I don't think it formats the .ksh file appropriately for running on more than one node.

The applicable lines seem to be the two below:
#PBS -l select=120:ncpus=1:mpiprocs=1:mem=83mb
mpirun -np 120 pregmpi 2>>1 >abaqus_ge_pre.log

If I run with these lines, I get the error “There are not enough slots available in the system to satisfy the 120 slots that were requested by the application.”

I tried changing that to select=4:ncpus=30, but I still get the same error.

Any help would be much appreciated. Thank you!

The request below asks for 4 chunks (nodes), each with 30 cores (ncpus) and 30 MPI processes; place=scatter places each chunk on a separate node:

#PBS -l select=4:ncpus=30:mpiprocs=30
#PBS -l place=scatter

cd $PBS_O_WORKDIR
np=$(wc -l < "$PBS_NODEFILE")
# Note: 2>>1 in the original appends stderr to a file literally named "1";
# 2>&1 after the log redirection sends stderr into the log as well.
mpirun -np "$np" --hostfile "$PBS_NODEFILE" pregmpi >abaqus_ge_pre.log 2>&1
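As a sanity check, the rank-counting line can be exercised off-cluster with a fake nodefile (the node names below are made up; on the cluster, PBS provides $PBS_NODEFILE for you):

```shell
# Build a fake nodefile: 4 nodes x 30 MPI ranks = 120 lines,
# mimicking what PBS generates for select=4:ncpus=30:mpiprocs=30.
nodefile=$(mktemp)
for node in node1 node2 node3 node4; do
  for i in $(seq 1 30); do
    echo "$node" >>"$nodefile"
  done
done

# Same counting logic as in the job script.
np=$(wc -l <"$nodefile")
echo "np=$np"        # prints np=120
rm -f "$nodefile"
```

If this prints anything other than 120 inside a real job, the select/mpiprocs request is what needs adjusting, not the mpirun line.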

If you could share the below, it would be helpful:

  • the version of OpenPBS
  • the flavour of MPI used (Intel MPI, OpenMPI, MPICH, etc.)
  • the output of pbsnodes -av
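For the MPI flavour, the first line of `mpirun --version` usually says enough; here is a small classification sketch (the sample string is hypothetical; on the cluster you would use `first=$(mpirun --version 2>&1 | head -n 1)` instead):

```shell
# Typical first lines of `mpirun --version` output per flavour:
#   OpenMPI:   "mpirun (Open MPI) 4.1.5"
#   Intel MPI: "Intel(R) MPI Library for Linux* OS ..."
#   MPICH:     "HYDRA build details:"
first="mpirun (Open MPI) 4.1.5"   # hypothetical sample line
case "$first" in
  *"Open MPI"*)      flavour="OpenMPI" ;;
  *"Intel"*)         flavour="Intel MPI" ;;
  *HYDRA*|*MPICH*)   flavour="MPICH" ;;
  *)                 flavour="unknown" ;;
esac
echo "$flavour"      # prints OpenMPI for the sample line
```

This matters because the hostfile flag differs: OpenMPI uses `--hostfile`, while Intel MPI and MPICH-based launchers use `-machinefile`/`-f`.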