Hi OpenPBS community team, I am trying to install OpenPBS to manage the workload for numerical model simulations with WaveWatch III. I have a single-node machine with 20 processors and a CUDA GPU (RX5000), running Ubuntu 20.04.1 LTS, so I would like to install and run all PBS components on one machine (the single execution system described on page 13 of the PBS Professional 2021.1 Installation & Upgrade Guide). I have made two attempts, and neither succeeded. In the first, I used the OpenPBS source tar.gz from the v20.0.1 release and followed the instructions in the INSTALL file, but some errors (111) appeared and the program did not run properly. In the second, I tried to install openpbs_20.0.1.ubuntu_1804.zip, but when I ran pbs start the terminal closed automatically, and the server .deb files did not install properly, showing this error: dpkg-deb: error: paste subprocess was killed by signal (Broken pipe). Could you kindly help me install the program?
Hi, I did the following on a fresh installation of Ubuntu 20.04.2 LTS and it worked without any issues.
As root user:
- set the static ip address and hostname in /etc/hosts
- create a user account called "pbsdata"
- wget https://github.com/openpbs/openpbs/archive/refs/heads/master.zip
- unzip master.zip
- cd openpbs-master/
- apt install gcc make libtool libhwloc-dev libx11-dev libxt-dev libedit-dev libical-dev libncurses-dev perl postgresql-server-dev-all postgresql-contrib python3-dev tcl-dev tk-dev swig libexpat1-dev libssl-dev libxext-dev libxft-dev autoconf automake build-essential openssh-server net-tools
- apt install expat libedit2 postgresql python3 postgresql-contrib sendmail-bin sudo tcl tk libical3 postgresql-server-dev-all
- ./autogen.sh
- ./configure --prefix=/opt/pbs
- make ; make install
- /opt/pbs/libexec/pbs_postinstall
- edit /etc/pbs.conf and set PBS_START_MOM=1
- chmod 4755 /opt/pbs/sbin/pbs_iff /opt/pbs/sbin/pbs_rcp
- /etc/init.d/pbs start
- . /etc/profile.d/pbs.sh
- ps -ef | grep pbs_
- qstat -Bf
- pbsnodes -av ; pbsnodes -aSjv
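The PBS_START_MOM=1 step above can be scripted instead of edited by hand. A minimal sketch, assuming /etc/pbs.conf holds plain KEY=value lines; the helper name set_pbs_conf is hypothetical and not part of OpenPBS:

```shell
#!/bin/sh
# set_pbs_conf FILE KEY VALUE
# Hypothetical helper (not an OpenPBS tool): set KEY=VALUE in a pbs.conf-style file.
set_pbs_conf() {
    conf_file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$conf_file"; then
        # Key already present: rewrite that line in place (GNU sed, as on Ubuntu).
        sed -i "s|^${key}=.*|${key}=${value}|" "$conf_file"
    else
        # Key missing: append it to the end of the file.
        printf '%s=%s\n' "$key" "$value" >> "$conf_file"
    fi
}

# Example, as root, before starting PBS:
#   set_pbs_conf /etc/pbs.conf PBS_START_MOM 1
```

Taking the file path as an argument makes it easy to try the helper on a copy of pbs.conf first.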
Hope this helps
Thank you, Adarsh. I have installed it successfully; the procedure you described works perfectly. I will start working on my model now. If I run into any problems, I hope the OpenPBS community team can offer suggestions.
Thank you
Bill Nitzberg
OpenPBS Community Manager
@adarsh Hi Adarsh, with the above steps I was able to install the head node.
How do I proceed with the compute nodes?
I am using Ubuntu 20.04. I tried the execution .deb file from another version: https://vcdn.altair.com/rl/OpenPBS/openpbs_22.05.11.ubuntu_20.04.zip
But after I add the nodes, they stay in the unknown state, and I cannot figure out why.
- check with the firewall and SELinux disabled (reboot the system if you disable SELinux)
- check that DNS / the hosts file resolves all hostnames correctly
- check that the pbs_mom service is up and running on the compute node ( ps -ef | grep pbs_ )
- check the mom logs
- check the server logs
- check that ports 15001 to 15009 and 17001 are not blocked (between the pbs server and pbs mom)
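The port check in the list above can be done from the command line. A rough sketch, assuming bash and the coreutils timeout command are installed; port_open and report_ports are hypothetical helper names, and HOSTNAME below is a placeholder:

```shell
#!/bin/sh
# port_open HOST PORT: succeed if a TCP connection opens within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device, so bash must be available.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# report_ports HOST PORT...: print one "open" or "closed" line per port.
report_ports() {
    host=$1
    shift
    for port in "$@"; do
        if port_open "$host" "$port"; then
            echo "$host:$port open"
        else
            echo "$host:$port closed"
        fi
    done
}

# Example, run on the server against a compute node and vice versa:
#   report_ports HOSTNAME 15001 15002 15003 17001
```

Only ports with a daemon listening on the target host will show as open, so a "closed" result can mean either a firewall block or a daemon that is not running there.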
Hi Adarsh,
Just adding the entries to /etc/hosts did not work. I set up a local DNS server on the head node, and now PBS seems to be working fine.
Thanks a lot for your suggestions.
Regards,
Vinay
Hi Adarsh,
I set everything up by following your posts. How can I set up multiple users to run PBS jobs?
I would also like to know how to add nodes to the head node.
As root user on the head node:
To add a node:
qmgr -c "create node NODENAME"
# NODENAME is the hostname of the compute node
To delete a node:
qmgr -c "delete node NODENAME"
# NODENAME is the hostname of the compute node
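With several compute nodes, the create command can be generated from a list of hostnames. A small sketch, assuming a text file with one hostname per line; node_cmds and nodes.txt are hypothetical names, and the function only prints commands so they can be reviewed before running:

```shell
#!/bin/sh
# node_cmds FILE: print one qmgr "create node" command per hostname in FILE.
# Blank lines and lines starting with '#' are skipped. Nothing is executed.
node_cmds() {
    while IFS= read -r name || [ -n "$name" ]; do
        case $name in
            ''|'#'*) continue ;;
        esac
        printf 'qmgr -c "create node %s"\n' "$name"
    done < "$1"
}

# Review the output first, then run it as root on the head node:
#   node_cmds nodes.txt
#   node_cmds nodes.txt | sh
```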
How to run jobs? As a standard Linux user:
qsub -l select=1:ncpus=2:mem=1gb -- /bin/sleep 1000 # to run on one node
qsub -l select=2:ncpus=2:mem=1gb -l place=scatter -- /bin/sleep 1000 # to run on two nodes
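The same resource request can also live in a job script as #PBS directives, which is easier to reuse than a long qsub command line. A minimal sketch; the job name, walltime, and the helper name write_sleep_job are illustrative choices, not values from this thread:

```shell
#!/bin/sh
# write_sleep_job FILE: write a small PBS job script to FILE.
write_sleep_job() {
    cat > "$1" <<'EOF'
#!/bin/sh
#PBS -N sleep_test
#PBS -l select=1:ncpus=2:mem=1gb
#PBS -l walltime=00:20:00
#PBS -j oe
# Jobs start in the home directory by default; move to where qsub was run.
cd "$PBS_O_WORKDIR" || exit 1
/bin/sleep 1000
EOF
}

# Example, as a standard Linux user:
#   write_sleep_job sleep_job.sh
#   qsub sleep_job.sh
```

For the two-node case, the select line would become select=2:ncpus=2:mem=1gb with an extra "#PBS -l place=scatter" directive, matching the second qsub command above.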
Hi Adarsh
I am also following all your steps. It works for the master node, but when I add a compute node, the master shows its state as down. Could you please help? This is my test machine.
Please check
- pbs_mom process is up and running ( ps -ef | grep pbs_ )
- ports , firewall are not blocked
- selinux disabled and system rebooted
- hostname resolution (/etc/hosts)
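The hostname resolution item can be checked directly against the hosts file. A sketch under two assumptions of mine: the file uses the standard "address name [alias...]" layout, and a node name mapped to a loopback address is worth flagging (the thread does not mention this case). hosts_addr and check_host are hypothetical helper names:

```shell
#!/bin/sh
# hosts_addr FILE NAME: print the first address that FILE maps NAME to.
# Returns non-zero if NAME has no entry.
hosts_addr() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # strip comments before matching names
        { for (i = 2; i <= NF; i++)
              if ($i == name) { print $1; found = 1; exit } }
        END { exit !found }
    ' "$1"
}

# check_host FILE NAME: report a missing entry or a loopback mapping.
check_host() {
    if ! addr=$(hosts_addr "$1" "$2"); then
        echo "$2: no entry"
    else
        case $addr in
            127.*|::1) echo "$2: maps to loopback $addr" ;;
            *)         echo "$2: $addr" ;;
        esac
    fi
}

# Example, on both the head node and the compute node:
#   check_host /etc/hosts NODENAME
```

Running it on every machine for every node name shows quickly whether the server and the compute nodes agree on each other's addresses.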
Hi Adarsh,
Everything is working now and I can submit CPU-based jobs. How can I configure and run a GPU-based test job, or Docker, in OpenPBS version 23?
Please guide me on GPU-based configuration. If possible, please share your LinkedIn ID.
Nice one !
For GPU configuration and scheduling, please refer to:
Ref: 5.14.7.3 Basic GPU Scheduling
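As a rough sketch of what basic GPU scheduling usually involves, assuming the conventional custom resource name ngpus (the referenced section is not quoted in this thread, so please check it before applying anything): a host-level consumable resource is created, each node advertises how many GPUs it has, and jobs request GPUs per chunk. The helper names below are hypothetical, and the functions only print commands:

```shell
#!/bin/sh
# gpu_setup_cmds NODE COUNT: print (do not run) qmgr commands that define a
# custom "ngpus" resource and advertise COUNT GPUs on NODE.
gpu_setup_cmds() {
    printf 'qmgr -c "create resource ngpus type=long, flag=nh"\n'
    printf 'qmgr -c "set node %s resources_available.ngpus=%s"\n' "$1" "$2"
}

# gpu_select CHUNKS NCPUS NGPUS: build the matching qsub resource request.
gpu_select() {
    printf 'select=%s:ncpus=%s:ngpus=%s\n' "$1" "$2" "$3"
}

# Example:
#   gpu_setup_cmds NODENAME 1          # review, then run as root on the head node
#   qsub -l "$(gpu_select 1 2 1)" -- /bin/sleep 1000
```

For the scheduler to take the resource into account, ngpus also has to be added to the "resources:" line in the scheduler's sched_config file, followed by a scheduler restart; treat that step as an assumption to verify in the guide as well.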