PBS daemons are running but connection refused

Hi,

I installed OpenPBS on CentOS 8 and also did the following:

  • Disabled SELinux and firewall
  • Set up passwordless SSH access to pbshost
  • Added the host's address to /etc/hosts:
    192.168.0.229 pbshost

I can see that all the PBS daemons are running:

pbs_server is pid 29477
pbs_mom is pid 2531
pbs_sched is pid 2545
pbs_comm is 2520

But when I run the qmgr command, it refuses the connection.

Connection refused
qmgr: cannot connect to server

I have no idea how to deal with this problem.
I need your help.

Thank you and have a nice day.

Geon-Hong

  1. After disabling SELinux, did you reboot the system?
  2. Please also run these commands:
    systemctl stop firewalld
    systemctl mask firewalld
  3. Please check that these ports are accessible:
    15001-15009 and 17001
  4. You can install strace on the system and run
    strace qmgr # this prints a lot of information and may show where qmgr gets stuck
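
For the strace step, here is a minimal sketch (not an official PBS tool) that limits the trace to connect() calls and then picks out the ones that were refused, so you can see which address and port qmgr was trying to reach. The function name refused_connects and the file /tmp/qmgr.strace are hypothetical names chosen for this example.

```shell
# Print the connect() calls that failed with ECONNREFUSED in a strace log.
# $1 is the path of a file written by "strace -o".
refused_connects() {
    grep 'connect(' "$1" | grep 'ECONNREFUSED'
}

# Typical use (as root), assuming strace is installed:
#   strace -f -e trace=connect -o /tmp/qmgr.strace qmgr
#   refused_connects /tmp/qmgr.strace
```

Each line it prints includes the sin_port and sin_addr of the refused connection, which you can compare with the PBS_SERVER entry in /etc/pbs.conf.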

Please open two terminals:
Terminal 1, as root: run the qmgr command
Terminal 2, as root: source /etc/pbs.conf ; tail -f $PBS_HOME/server_logs/YYYYMMDD (the file named for today's date)
This lets you see the error as it is logged and work out the cause.
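
The server log files are named by date as YYYYMMDD, so the file name for today can be built with date instead of typed by hand. A small sketch, assuming PBS_HOME comes from /etc/pbs.conf; the helper name pbs_server_log is made up for this example.

```shell
# Print the path of today's server log under the given PBS_HOME.
# $1 is the PBS_HOME directory, e.g. /var/spool/pbs.
pbs_server_log() {
    echo "$1/server_logs/$(date +%Y%m%d)"
}

# Typical use in the second terminal (as root):
#   . /etc/pbs.conf
#   tail -f "$(pbs_server_log "$PBS_HOME")"
```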

  5. Please share the output of
    cat /etc/pbs.conf
    cat /etc/hosts

Hope this helps

Hi adarsh,

I read your previous posts and followed the instructions. It really helped me to resolve this problem.

  1. After disabling SELinux, did you reboot the system?

Yes. But it didn’t work.

  2. Please also run these commands:
    systemctl stop firewalld

Done.

systemctl mask firewalld

Done.

  3. Please check that these ports are accessible:
    15001-15009 and 17001

How can I open those ports without firewalld, given that firewalld is already disabled?
How can I tell whether a port is accessible? (I am checking with nc and netstat; is that the correct way?)

  4. You can install strace on the system and run
    strace qmgr # this prints a lot of information and may show where qmgr gets stuck

I ran strace qmgr, but I could not identify the cause of the problem from its output.

========

I found that something was wrong with my PostgreSQL installation, so I reinstalled postgresql-server. That resolved the error related to pg_ctl, but there is still a problem.

04/09/2021 15:51:41;0d80;Server@pbshost;TPP;Server@pbshost(Main Thread);Initializing TPP transport Layer
04/09/2021 15:51:41;0d80;Server@pbshost;TPP;Server@pbshost(Main Thread);Max files allowed = 16384
04/09/2021 15:51:41;0d80;Server@pbshost;TPP;Server@pbshost(Main Thread);TPP initialization done
04/09/2021 15:51:41;0d80;Server@pbshost;TPP;Server@pbshost(Main Thread);Connecting to pbs_comm pbshost:17001
04/09/2021 15:51:41;0c06;Server@pbshost;TPP;Server@pbshost(Thread 0);Thread ready
04/09/2021 15:51:41;0c06;Server@pbshost;TPP;Server@pbshost(Thread 0);Registering address 192.168.0.229:15001 to pbs_comm pbshost:17001
04/09/2021 15:51:41;0c06;Server@pbshost;TPP;Server@pbshost(Thread 0);Registering address 192.168.122.1:15001 to pbs_comm pbshost:17001
04/09/2021 15:51:41;0c06;Server@pbshost;TPP;Server@pbshost(Thread 0);Connected to pbs_comm pbshost:17001
04/09/2021 15:51:41;0002;Server@pbshost;n/a;setup_env;read environment from /var/spool/pbs/pbs_environment
04/09/2021 15:51:41;0000;Server@pbshost;Svr;Server@pbshost;Supported authentication method: resvport
04/09/2021 15:51:41;0002;Server@pbshost;Svr;Server@pbshost;Stopping PBS dataservice

This is the tail of my server log. I’m not sure why the PBS dataservice was stopped.

Here are my pbs.conf and hosts files.
/etc/pbs.conf

PBS_SERVER=pbshost
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=1
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/bin/scp

/etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.0.229 pbshost
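
To confirm that the two files above agree, a rough sketch like the following reads PBS_SERVER from a pbs.conf-style file and prints the address a hosts-style file maps it to. Both function names are hypothetical helpers written for this post, and the check only looks at the files, not at DNS.

```shell
# Print the value of PBS_SERVER from a pbs.conf-style file ($1).
pbs_server_name() {
    sed -n 's/^PBS_SERVER=//p' "$1"
}

# Print every address that a hosts-style file ($1) maps the name ($2) to.
# Lines starting with '#' are skipped; field 1 is the address.
hosts_lookup() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
    ' "$1"
}

# Typical use:
#   hosts_lookup /etc/hosts "$(pbs_server_name /etc/pbs.conf)"
```

With the files shown here this should print 192.168.0.229. Running getent hosts pbshost shows what the system resolver actually returns, which is the more direct check.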

Thank you.

Once the firewalld service is stopped and disabled, the ports should no longer be blocked.
You can check whether a port is open with the telnet command:
telnet <hostname> <port>

Example of an open port:
root@pbspro:~# telnet pbspro 6200
Trying 204.235.30.17...
Connected to pbspro.
Escape character is '^]'.
<ctrl + ] > to quit

Example of a closed port:
root@pbspro:~# telnet pbspro 6201
Trying 192.168.30.17...
telnet: connect to address 192.168.30.17: Connection refused
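
The same check can be scripted with nc for the whole PBS port list (15001-15009 and 17001). This is only a sketch: the function names are invented for this post, and it assumes the installed nc supports -z (connect without sending data) and -w (timeout in seconds), which is worth confirming with nc -h.

```shell
# Expand a spec such as "15001-15009,17001" into one port per line.
expand_ports() {
    old_ifs=$IFS
    IFS=,
    for part in $1; do
        case $part in
            *-*)
                p=${part%-*}
                last=${part#*-}
                while [ "$p" -le "$last" ]; do
                    echo "$p"
                    p=$((p + 1))
                done
                ;;
            *) echo "$part" ;;
        esac
    done
    IFS=$old_ifs
}

# Try a TCP connection to each port on host $1 and report the result.
# $2 is an optional port spec; the default is the list from this thread.
check_pbs_ports() {
    host=$1
    spec=${2:-15001-15009,17001}
    for port in $(expand_ports "$spec"); do
        if nc -z -w 2 "$host" "$port" >/dev/null 2>&1; then
            echo "$host:$port open"
        else
            echo "$host:$port closed or filtered"
        fi
    done
}

# Typical use:
#   check_pbs_ports pbshost
```

A port only shows as open when a daemon is listening on it, so some ports in the range can report closed even with the firewall off. In the server log above, 15001 is the server and 17001 is pbs_comm, so those two are the ones to look at first.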

The logs you shared do not show any errors. If the problem persists, please try uninstalling PBS Pro and installing it again.