Configure OpenMPI error with OpenPBS

OpenMPI installation error

[root@cn01 ~]# yum install openmpi openmpi-devel
Last metadata expiration check: 18:04:39 ago on Mon 12 Apr 2021 08:24:47 PM CST.
Error:
Problem: cannot install both hwloc-libs-2.2.0-1.el8.x86_64 and hwloc-libs-1.11.9-3.el8.x86_64

  • package openmpi-4.0.5-3.el8.x86_64 requires libhwloc.so.15()(64bit), but none of the providers can be installed
  • package openpbs-execution-20.0.1-0.x86_64 requires libhwloc.so.5()(64bit), but none of the providers can be installed
  • cannot install the best candidate for the job
  • problem with installed package openpbs-execution-20.0.1-0.x86_64
    (try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
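
To see which packages pull in each of the conflicting sonames, dnf repoquery helps (assuming it is available, as it is by default on EL8):

$ dnf repoquery --whatprovides 'libhwloc.so.15()(64bit)'
$ dnf repoquery --whatprovides 'libhwloc.so.5()(64bit)'

libhwloc.so.15 comes from hwloc 2.x and libhwloc.so.5 from hwloc 1.x, so the distro openmpi package and openpbs-execution require hwloc versions that cannot be installed side by side.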

Is there a solution for this?

Hi

I think you need to download MPI and build it from source, linking in the PBS files. I have a script that I used when I compiled MPI for use with PBS version 20 in December 2020.
I do not seem to be able to attach it, so it's copied in below.
Mike

#!/bin/bash

# This script builds OpenMPI from source. 
# Use this directly or copy it to an OpenMPI source directory and edit to suit.
#
# MPI downloaded from https://www.open-mpi.org openmpi-4.0.4.tar.gz on 3 August 2020.
#
# Usage: ./build_openmpi [clean] 
#
# Author: Mike Lake
# Version: 2020.12.01

# Location of our OpenMPI source 
source="/home/XXX/src/openmpi4/openmpi-4.0.4"

# Location of where we wish to install the build. 
target="/shared/opt/openmpi-4.0.4"

# Logfile for the build.
log=openmpi.log

##############
# Start Script
##############

mkdir -p $target
cd $source || exit 1

cat /dev/null > $log

# A distclean takes a while so only do it if we 
# specify "clean" as an arg to this script.
if [[ $# -eq 1 && $1 == 'clean' ]]; then
    echo "Cleaning" >> $log
    make distclean
fi

# We need to be on an execution node. This include file is only provided by the
# PBS execution package and not by the client package. 
if [ ! -e /opt/pbs/include/tm.h ]; then
    echo "Error: /opt/pbs/include/tm.h not found. Are you on an execution node?" | tee -a $log
    echo "Exiting." | tee -a $log
    exit 1
fi
export CFLAGS="-I/opt/pbs/include"

# If we get to here we are probably on an execution node. OK. 

# The directory for the PBS libs is /opt/pbs/lib/ and we wish to link in the library 
# libpbs.so (-l prepends "lib" and searches for libpbs.so or libpbs.a). 
export LD_LIBRARY_PATH=/opt/pbs/lib:$LD_LIBRARY_PATH 
export LDFLAGS="-L/opt/pbs/lib -lpbs -lpthread -lcrypto" 
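
# Optional sanity check (a sketch; it assumes, as above, that the PBS libs
# live under /opt/pbs/lib): confirm libpbs is present before configuring.
if ! ls /opt/pbs/lib/libpbs.* >/dev/null 2>&1; then
    echo "Error: libpbs not found in /opt/pbs/lib." | tee -a $log
    exit 1
fi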

# After configure finishes you can check $log (and configure's own config.log) for errors.
echo "====== CONFIGURE ====== " >> $log

# Use this config to build for your standard MPI.
./configure --without-slurm --with-tm=/opt/pbs --enable-mpi-interface-warning --enable-shared --enable-static --enable-cxx-exceptions --prefix=$target 2>&1 | tee -a $log

echo "====== MAKE ====== " >> $log
make -j8 2>&1 | tee -a $log

echo "====== INSTALL ====== " >> $log
sudo make install 2>&1 | tee -a $log

cd ..

echo "The OpenMPI build has finished." 
echo "Read the $log for any errors."

Thank you for your answer. I got the following error after running your script:
Error: /opt/pbs/include/tm.h not found. Are you on an execution node?
Exiting.

You need to install the openpbs-devel package to get tm.h.
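
For example (the exact package name depends on your PBS build; the community RPMs use openpbs-devel, while some builds use pbspro-devel, so adjust to suit):

[root@cn01 ~]# yum install openpbs-devel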

mkaro is correct:

$ dnf list installed | grep pbs
 pbspro-devel.x86_64      2020.1.2.20210122161823-0.el8
$ rpmquery -ql pbspro-devel.x86_64 | grep tm.h
/opt/pbs/include/tm.h

You can install the devel package (perhaps on just the login node?). The “make install” then installs to a shared NFS drive that all the exec nodes see. Sorry if anything in my script is wrong and misled you.
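
Once your build finishes, a quick way to confirm that the PBS (tm) support was actually compiled in, assuming the new bin directory is first on your PATH:

$ export PATH=/shared/opt/openmpi-4.0.4/bin:$PATH
$ ompi_info | grep tm

If --with-tm took effect, tm should be listed among the plm and ras MCA components.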

Mike

Thank you for your answer. Is there any way I can test whether the installation works?

Hi

#!/bin/bash
# Just tests MPI using python  --version
#PBS -N Test
#PBS -l walltime=00:05:00 
#PBS -l select=3:mpiprocs=2:mem=1GB
#PBS -l place=scatter
    
export PATH="/shared/opt/openmpi-4.0.4/bin:$PATH"
export LD_LIBRARY_PATH="/shared/opt/openmpi-4.0.4/lib"

cd ${PBS_O_WORKDIR}
mpiexec /usr/bin/python3  --version 
cat $PBS_NODEFILE
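
Save it as, say, test_mpi.sh (the filename is just an example) and submit it:

$ qsub test_mpi.sh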

The job's output file will then look like this:

$ cat Test.o138900
Python 3.6.8
Python 3.6.8
Python 3.6.8
Python 3.6.8
Python 3.6.8
Python 3.6.8
hpcnode03
hpcnode03
hpcnode04
hpcnode04
hpcnode06
hpcnode06

It ran two MPI processes on each of 3 nodes: select=3:mpiprocs=2 gives 6 ranks in total, and place=scatter spreads the chunks across distinct nodes.

Thank you all for your help, I have completed the build.