Please check that the firewall is not blocking the MoM ports (15002 and 15003).
Please check that the PBS MoM service is running.
Please add the node using its own hostname: ssh into the compute node, run hostname, and use that exact name in qmgr -c "create node HOSTNAME-OF-THE-NODE".
Check that SELinux is disabled (and that the system was rebooted after disabling it).
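The checks above could be scripted roughly as follows and run from the server node. This is only a sketch under assumptions that are not stated in this thread: the PBS systemd unit is assumed to be named pbs, the config file is assumed to be /etc/pbs.conf, passwordless ssh to the compute node is assumed, and the helper names (conf_value, port_open, check_node) are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; unit name "pbs" and path /etc/pbs.conf are assumptions.

# Print the value of KEY from pbs.conf-style text (KEY=value lines) on stdin.
conf_value() {
    sed -n "s/^$1=//p" | tail -n 1
}

# Return 0 if a TCP connection to HOST:PORT succeeds within 3 seconds.
# Uses bash's /dev/tcp redirection, so it needs bash and coreutils timeout.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Run the four checks against one compute node; call this by hand.
check_node() {
    local node=$1 port
    # 1. Are the MoM ports reachable from here (no firewall in the way)?
    for port in 15002 15003; do
        if port_open "$node" "$port"; then
            echo "port $port: open"
        else
            echo "port $port: closed or blocked"
        fi
    done
    # 2. Is MoM enabled in the node's config, and is the service active?
    echo "PBS_START_MOM=$(ssh "$node" 'cat /etc/pbs.conf' | conf_value PBS_START_MOM)"
    ssh "$node" 'systemctl is-active pbs'
    # 3. The name the node reports for itself; use it in "create node".
    ssh "$node" 'hostname'
    # 4. SELinux mode on the node; "Disabled" is the expected answer.
    ssh "$node" 'getenforce'
}
```

Usage would look like check_node NODE, where NODE is the compute node's name as the server resolves it. A closed port can also simply mean that MoM is not running, so the first two checks are best read together.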
Thanks for the pointers, but I still see the same issue after simplifying my setup to use just the /etc/hosts file. There is no firewall or SELinux, and mom is up:
/etc/pbs.conf on the mom node looks like this:
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_START_SERVER=0
PBS_START_SCHED=0
PBS_MOM_HOME=/var/spool/pbs
PBS_START_MOM=1
PBS_START_COMM=1
PBS_COMM_THREADS=4
PBS_SERVER=clmgmt-01
PBS_SCP=/usr/bin/scp
PBS_CORE_LIMIT=unlimited
And /etc/pbs.conf on the server node looks like this:
PBS_SERVER=clmgmt-01
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=0
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/bin/scp
PBS_LOCALLOG=1
PBS_SYSLOG=2
PBS_SYSLOGSEVR=7
Do you see something I might be missing? Or does PBS have any probing tools that might help identify the issue?