Connection refused qmgr: cannot connect to server

Hello all,

I’m installing OpenPBS on Rocky Linux 9.2, and pbs_server will not stay running.

/etc/init.d/pbs status

pbs_server is not running
pbs_sched is pid 8165
pbs_comm is 8150

This is my /etc/pbs.conf:

PBS_EXEC=/opt/pbs
PBS_SERVER=headnode
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=0
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/bin/scp
PBS_LEAF_NAME=headnode

I get the following server logs when restarting the PBS service.


12/20/2023 16:30:54;0002;Server@headnode;Svr;Log;Log opened
12/20/2023 16:30:54;0002;Server@headnode;Svr;Server@headnode;pbs_version=22.05.11
12/20/2023 16:30:54;0002;Server@headnode;Svr;Server@headnode;pbs_build=mach=N/A:security=N/A:configure_args=N/A
12/20/2023 16:30:54;0002;Server@headnode;Svr;Server@headnode;hostname=headnode;pbs_leaf_name=headnode.foocluster.com;pbs_mom_node_name=N/A
12/20/2023 16:30:54;0002;Server@headnode;Svr;Server@headnode;ipv4 interface lo: localhost4.localdomain4 
12/20/2023 16:30:54;0002;Server@headnode;Svr;Server@headnode;ipv4 interface eno1: headnode.local 
12/20/2023 16:30:54;0002;Server@headnode;Svr;Server@headnode;ipv4 interface eno2: headnode.foocluster.com 
12/20/2023 16:30:54;0006;Server@headnode;Fil;Server@headnode;Version 22.05.11, started, initialization type = 1
12/20/2023 16:30:55;0002;Server@headnode;Svr;Server@headnode;pbs_status_db exit code 1
12/20/2023 16:30:55;0002;Server@headnode;Svr;Server@headnode;Starting PBS dataservice
12/20/2023 16:30:59;0002;Server@headnode;Svr;Server@headnode;connected to PBS dataservice@headnode.foocluster.com
12/20/2023 16:30:59;0d80;Server@headnode;TPP;Server@headnode(Main Thread);TPP authentication method = resvport
12/20/2023 16:30:59;0c06;Server@headnode;TPP;Server@headnode(Main Thread);TPP leaf node names = headnode.foocluster.com:15001
12/20/2023 16:30:59;0d80;Server@headnode;TPP;Server@headnode(Main Thread);Initializing TPP transport Layer
12/20/2023 16:30:59;0d80;Server@headnode;TPP;Server@headnode(Main Thread);Max files allowed = 1024
12/20/2023 16:30:59;0c06;Server@headnode;TPP;Server@headnode(Main Thread);Max files too low - you may want to increase it.
12/20/2023 16:30:59;0d80;Server@headnode;TPP;Server@headnode(Main Thread);TPP initialization done
12/20/2023 16:30:59;0d80;Server@headnode;TPP;Server@headnode(Main Thread);Connecting to pbs_comm headnode:17001
12/20/2023 16:30:59;0c06;Server@headnode;TPP;Server@headnode(Thread 0);Thread ready
12/20/2023 16:30:59;0c06;Server@headnode;TPP;Server@headnode(Thread 0);Registering address 192.168.200.203:15001 to pbs_comm headnode:17001
12/20/2023 16:30:59;0c06;Server@headnode;TPP;Server@headnode(Thread 0);Connected to pbs_comm headnode:17001
12/20/2023 16:30:59;0002;Server@headnode;n/a;setup_env;read environment from /var/spool/pbs/pbs_environment
12/20/2023 16:30:59;0000;Server@headnode;Svr;Server@headnode;Supported authentication method: resvport
12/20/2023 16:30:59;0002;Server@headnode;Svr;Server@headnode;Stopping PBS dataservice
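In the excerpt above, the last entry ("Stopping PBS dataservice") is logged in the same second as the end of startup, so the final message of the server log is a quick first thing to look at. A minimal sketch of pulling it out, assuming the semicolon-separated layout shown above with the message starting at the sixth field; the function name last_log_message is hypothetical and not part of PBS:

```shell
# Print the message part of the final line of a PBS log file.
# Assumes the layout seen in the excerpt:
#   date;event code;daemon;type;object;message
# The message may itself contain ';', so fields 6 and onward are kept.
last_log_message() {
    tail -n 1 "$1" | cut -d';' -f6-
}

# Example (path assumes PBS_HOME=/var/spool/pbs as in the pbs.conf above,
# with the server log named by date):
#   last_log_message /var/spool/pbs/server_logs/20231220
```

If the last message is a shutdown step rather than an error, the lines just before it are the next place to look.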

I have firewalld and SELinux disabled.

When running qmgr I get:

[sms] qmgr
Connection refused
qmgr: cannot connect to server

Let me know if I can provide more information to help identify this issue.

I had something similar on Ubuntu recently. Try all of these again as root.

Same result as root.

Were there any errors at the end of the install? Is the PostgreSQL server installed and running?

Hi barns, thanks for your reply.

The PostgreSQL service was missing the directory /var/lib/pgsql/data.
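A quick way to spot this condition is to look for the PG_VERSION file, which PostgreSQL writes into a data directory when it is initialized. This is only a sketch: the function name pg_datadir_initialized is hypothetical, and the path in the example is the one mentioned above.

```shell
# Return 0 if the given directory looks like an initialized PostgreSQL
# data directory (it exists and contains a PG_VERSION file), 1 otherwise.
pg_datadir_initialized() {
    [ -d "$1" ] && [ -f "$1/PG_VERSION" ]
}

# Example:
#   pg_datadir_initialized /var/lib/pgsql/data \
#       || echo "data directory missing or not initialized"
```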

I solved that by running service postgresql initdb, and after restarting the service I now get:

[root@sms]# /etc/init.d/pbs status
pbs_server is pid 33215
pbs_sched is pid 33041
pbs_comm is 33026
[root@sms]# systemctl status pbs
● pbs.service - Portable Batch System
     Loaded: loaded (/opt/pbs/libexec/pbs_init.d; enabled; preset: disabled)
     Active: active (running) since Tue 2023-12-26 11:04:07 -03; 1s ago
       Docs: man:pbs(8)
    Process: 33661 ExecStart=/opt/pbs/libexec/pbs_init.d start (code=exited, status=0/SUCCESS)
      Tasks: 11
     Memory: 12.5M
        CPU: 580ms
     CGroup: /system.slice/pbs.service
             ├─33708 /opt/pbs/sbin/pbs_comm
             ├─33723 /opt/pbs/sbin/pbs_sched
             ├─33791 /opt/pbs/sbin/pbs_ds_monitor monitor
             └─33895 /opt/pbs/sbin/pbs_server.bin

Dec 26 11:04:07 headnode pbs_init.d[33895]: Connecting to PBS dataservice...connected to PBS dataservice@headnode
Dec 26 11:04:07 headnode pbs_init.d[33735]: Connecting to PBS dataservice...connected to PBS dataservice@headnode
Dec 26 11:04:07 headnode pbs_init.d[33661]: PBS server
Dec 26 11:04:07 headnode systemd[1]: Started Portable Batch System.
Dec 26 11:04:07 headnode su[33898]: (to postgres) root on none
Dec 26 11:04:07 headnode su[33898]: pam_unix(su-l:session): session opened for user postgres(uid=26) by (uid=0)
Dec 26 11:04:07 headnode su[33898]: pam_unix(su-l:session): session closed for user postgres
Dec 26 11:04:07 headnode su[33938]: (to postgres) root on none
Dec 26 11:04:07 headnode su[33938]: pam_unix(su-l:session): session opened for user postgres(uid=26) by (uid=0)
Dec 26 11:04:08 headnode su[33938]: pam_unix(su-l:session): session closed for user postgres

But I still can't run qmgr.

[root@sms]# qmgr 
Connection refused

My guess is that pbs_iff is not setuid root. In any case, the new server logs should be different and hold clues.
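That guess can be checked before changing anything. A minimal sketch, assuming PBS_EXEC=/opt/pbs as in the pbs.conf posted earlier; the function name is_setuid is hypothetical, and it only tests the set-user-ID bit, not the owner:

```shell
# Return 0 if the file exists and has the set-user-ID bit set.
is_setuid() {
    [ -u "$1" ]
}

# Example:
#   is_setuid /opt/pbs/sbin/pbs_iff || echo "pbs_iff is not setuid"
#   ls -l /opt/pbs/sbin/pbs_iff
# For a setuid-root binary with mode 4755, ls -l shows -rwsr-xr-x
# and root as the owner.
```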

As suggested by @alexis.cousein, @bruno_b please try the following:

source /etc/pbs.conf
cd $PBS_EXEC/sbin
chmod 4755 pbs_iff
chmod 4755 pbs_rcp

and then try again.