How to add PBS Pro client on Node - Ubuntu

Hi,

I am new to PBS Pro. Thank you in advance for your help.

I run Ubuntu 18.04 server version (no GUI) and I was able to install PBS Pro on the headnode following these instructions.

I also edited the /etc/hosts file on both machines (headnode and node)
and configured a passwordless ssh connection between the headnode and the node.

All went well, and I am able to run PBS Pro commands such as:
$ qstat -B
Server             Max   Tot   Que   Run   Hld   Wat   Trn   Ext Status
eric9020-optiple     0     0     0     0     0     0     0     0 Idle

Now I am trying to install PBS Pro on a client node (where the jobs will actually run), but I cannot find any information for Ubuntu 18.04. I understand that PBS Pro is mostly focused on CentOS, but my cluster (12 nodes) has already been configured with Ubuntu 18.04 and OpenMPI.

I followed exactly the same steps as for the headnode, except for /etc/pbs.conf.
I edited the pbs.conf file for the node as follows:

Node:

PBS_SERVER=eric9020-optiplex
PBS_START_SERVER=0
PBS_START_SCHED=0
PBS_START_COMM=1
PBS_START_MOM=1
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/usr/bin/scp

For the headnode, I edited the pbs.conf file as follows:

PBS_SERVER=eric9020-optiplex
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=0
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/usr/bin/scp
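
To compare the two files at a glance, a small shell helper can list which daemons a given pbs.conf enables. This is only a sketch: `pbs_roles` is a hypothetical name, not a PBS Pro command, and it assumes the plain `PBS_START_*=0|1` lines shown above.

```shell
# pbs_roles FILE: print the daemons that FILE enables (PBS_START_*=1).
# pbs_roles is a hypothetical helper, not part of PBS Pro.
pbs_roles() {
    # Keep only lines such as PBS_START_MOM=1, strip the prefix and value,
    # lower-case the daemon name, and join the names with spaces.
    sed -n 's/^PBS_START_\([A-Z_]*\)=1[[:space:]]*$/\1/p' "$1" \
        | tr 'A-Z' 'a-z' \
        | paste -sd ' ' -
}

# Example with a throwaway copy of the node settings shown above.
tmp_conf=$(mktemp)
printf '%s\n' 'PBS_SERVER=eric9020-optiplex' 'PBS_START_SERVER=0' \
    'PBS_START_SCHED=0' 'PBS_START_COMM=1' 'PBS_START_MOM=1' > "$tmp_conf"
pbs_roles "$tmp_conf"
rm -f "$tmp_conf"
```

On a real machine the call would be `pbs_roles /etc/pbs.conf`.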

I edited the MoM config file on the PBS Pro client node so that it connects to the headnode (eric9020-optiplex):
/var/spool/pbs/mom_priv/config
$clienthost eric9020-optiplex
$restrict_user_maxsysid 999

I created a "nodes" file on the headnode:
/var/spool/pbs/server_priv# cat nodes
knife11.stanford.edu np=2

When I try to add a node with the following command:
$ qmgr -c "create node knife11"

I get the following error:
qmgr obj=knife11 svr=default: Unauthorized Request
qmgr: Error (15007) returned from server

Thank you for your help

On the compute node, PBS_START_COMM should be 0.

Did you run the above command as the root user?
Is your /etc/hosts correctly populated, and is DNS working?

Please run the commands below as the root user:
qmgr -c "set server managers+=root@eric9020-optiplex"
qmgr -c "create node knife11"

Hope this works

Thank you adarsh.

When I switch to the root user and run qstat -B, I get the following error message:

Command ‘qstat’ not found, but can be installed with:
apt install gridengine-client
apt install slurm-wlm-torque

This means that I need to add the PBS commands to the PATH for the root user. How do I do that?

@legio06
You can add the PBS binaries to your PATH with:
export PATH=/opt/pbs/bin:/opt/pbs/sbin:$PATH

@legio06: You can create the file
/etc/profile.d/pbs.sh (if it already exists, just run: source /etc/profile.d/pbs.sh)

The contents of this file:

source /etc/pbs.conf
export PATH=$PATH:$PBS_EXEC/bin

Log out and log back in as root, then run the commands. Alternatively, source /etc/profile.d/pbs.sh in the current shell and then run them.
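
A slightly more defensive variant of that profile script might look like the sketch below. It skips the setup when the configuration file is missing and avoids appending the same directories twice. The function name `pbs_path_setup` and the `PBS_CONF_FILE` override are assumptions made here so the snippet can be tried against a copy of the file. Adding `sbin` follows the earlier suggestion in this thread.

```shell
# Sketch of an /etc/profile.d/pbs.sh body (pbs_path_setup is a hypothetical name).
pbs_path_setup() {
    # Default location from this thread; PBS_CONF_FILE is used here as an override.
    _pbs_conf="${PBS_CONF_FILE:-/etc/pbs.conf}"

    # Do nothing on machines where PBS is not installed.
    [ -r "$_pbs_conf" ] || return 0

    # pbs.conf consists of NAME=value lines, so it can be sourced to obtain PBS_EXEC.
    . "$_pbs_conf"

    # Append the PBS directories only if they are not already on PATH.
    case ":$PATH:" in
        *":$PBS_EXEC/bin:"*) ;;
        *) PATH="$PATH:$PBS_EXEC/bin:$PBS_EXEC/sbin" ;;
    esac
    export PATH
}
pbs_path_setup
```

After opening a new root shell, `command -v qstat` should print a path under `$PBS_EXEC/bin` if the setup took effect.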

Thank you @kjakkali and @adarsh for the PATH. It works.

I could not run the command
root@eric9020-optiplex:~# qmgr -c “set server managers+=root@eric9020-optiplex”
Unknown Host.
qmgr: cannot connect to server server
Unknown Host.
qmgr: cannot connect to server managers+=root@eric9020-optiplex”
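
The output above is consistent with the typographic quotes (“ ”) having been pasted into the shell. The shell treats them as ordinary characters, not as quoting, so the text after -c is split on spaces and the trailing words appear to be read as server names. A minimal sketch of the difference, using a hypothetical `count_args` helper in place of qmgr:

```shell
# count_args prints how many arguments the shell passed to it
# (a hypothetical helper used only to illustrate word splitting).
count_args() { echo "$#"; }

# Straight quotes: the whole directive stays a single argument.
count_args "set server managers+=root@eric9020-optiplex"

# Typographic quotes are ordinary characters to the shell,
# so the same text is split on spaces into three arguments.
count_args “set server managers+=root@eric9020-optiplex”
```

Retyping the command with straight double quotes, as in qmgr -c "set server managers+=root@eric9020-optiplex", should avoid the split.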

However, I was able to run the following commands and add the node (knife11.stanford.edu):

root@eric9020-optiplex:~# qmgr
Max open servers: 49
Qmgr: create node knife11
Qmgr: quit

root@eric9020-optiplex:~# pbsnodes -a
knife11
     Mom = knife11.stanford.edu
     Port = 15002
     pbs_version = 19.1.3
     ntype = PBS
     state = free
     pcpus = 2
     resources_available.arch = linux
     resources_available.host = knife11
     resources_available.mem = 3907988kb
     resources_available.ncpus = 2
     resources_available.vnode = knife11
     resources_assigned.accelerator_memory = 0kb
     resources_assigned.hbmem = 0kb
     resources_assigned.mem = 0kb
     resources_assigned.naccelerators = 0
     resources_assigned.ncpus = 0
     resources_assigned.vmem = 0kb
     resv_enable = True
     sharing = default_shared
     last_state_change_time = Wed Dec 4 09:32:13 2019

I was not able to submit a simple test job as suggested in the GitHub instructions mentioned above:

$ echo "sleep 60" | qsub

The error message was:
qsub: No default queue specified

Why is that?

Thank you again for your help.

Best,
Eric
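
The message "No default queue specified" suggests that the server has no default_queue attribute set, possibly because no queue exists yet. One way to address this is sketched below, assuming an execution queue named workq (a hypothetical name; any name should work) and that the commands are run as root on the headnode. By default the script only prints the qmgr commands; set QMGR=qmgr to actually apply them.

```shell
# Dry run by default: commands are printed, not executed.
# Set QMGR=qmgr (as root on the headnode) to apply them.
QMGR="${QMGR:-echo qmgr}"
QUEUE="${QUEUE:-workq}"   # hypothetical queue name

setup_default_queue() {
    # Create an execution queue, then enable and start it.
    $QMGR -c "create queue $QUEUE queue_type=execution"
    $QMGR -c "set queue $QUEUE enabled=true"
    $QMGR -c "set queue $QUEUE started=true"
    # Make it the queue used when qsub is called without -q.
    $QMGR -c "set server default_queue=$QUEUE"
}
setup_default_queue
```

In dry-run mode the printed lines lose their quotes, because echo receives the directive as a single argument. When running the commands by hand, keep the straight double quotes around each directive.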