pbs_status_db exit code 1 - Ubuntu 20.04 fresh install

Hello,

I have installed OpenPBS v22.05.11 on a fresh install of Ubuntu 20.04.

I am currently testing PBS on a single computer that acts as both the server and the execution node.

I can’t submit jobs with qsub or run commands like qstat or pbsnodes; they all return this error:

Connection refused
qstat: cannot connect to server coulomb (errno=15010)

The issue seems to lie with pbs_server, specifically the "pbs_status_db exit code 1" line in /var/spool/pbs/server_logs/20221120:

11/20/2022 14:53:53;0002;Server@coulomb;Svr;Log;Log opened
11/20/2022 14:53:53;0002;Server@coulomb;Svr;Server@coulomb;pbs_version=20.0.0
11/20/2022 14:53:53;0002;Server@coulomb;Svr;Server@coulomb;pbs_build=mach=N/A:security=N/A:configure_args=N/A
11/20/2022 14:53:53;0002;Server@coulomb;Svr;Server@coulomb;hostname=coulomb;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
11/20/2022 14:53:53;0002;Server@coulomb;Svr;Server@coulomb;ipv4 interface lo: localhost
11/20/2022 14:53:53;0002;Server@coulomb;Svr;Server@coulomb;ipv4 interface enp6s0: coulomb
11/20/2022 14:53:53;0002;Server@coulomb;Svr;Server@coulomb;ipv6 interface lo: ip6-loopback
11/20/2022 14:53:53;0002;Server@coulomb;Svr;Server@coulomb;ipv6 interface enp6s0: coulomb
11/20/2022 14:53:53;0006;Server@coulomb;Fil;Server@coulomb;Version 20.0.0, started, initialization type = 1
11/20/2022 14:53:53;0002;Server@coulomb;Svr;Server@coulomb;pbs_status_db exit code 1
11/20/2022 14:53:53;0002;Server@coulomb;Svr;Server@coulomb;Starting PBS dataservice
11/20/2022 14:53:56;0002;Server@coulomb;Svr;Server@coulomb;connected to PBS dataservice@coulomb
11/20/2022 14:53:56;0d80;Server@coulomb;TPP;Server@coulomb(Main Thread);TPP authentication method = resvport
11/20/2022 14:53:56;0c06;Server@coulomb;TPP;Server@coulomb(Main Thread);TPP leaf node names = 10.65.XX.XXX:15001,127.0.0.1:15001,10.65.XX.XXX:15001 # numbers replaced with X
11/20/2022 14:53:56;0d80;Server@coulomb;TPP;Server@coulomb(Main Thread);Initializing TPP transport Layer
11/20/2022 14:53:56;0d80;Server@coulomb;TPP;Server@coulomb(Main Thread);Max files allowed = 16384
11/20/2022 14:53:56;0d80;Server@coulomb;TPP;Server@coulomb(Main Thread);TPP initialization done
11/20/2022 14:53:56;0d80;Server@coulomb;TPP;Server@coulomb(Main Thread);Connecting to pbs_comm coulomb:17001
11/20/2022 14:53:56;0c06;Server@coulomb;TPP;Server@coulomb(Thread 0);Thread ready
11/20/2022 14:53:56;0c06;Server@coulomb;TPP;Server@coulomb(Thread 0);Registering address 10.65.XX.XXX:15001 to pbs_comm coulomb:17001 # numbers replaced with X
11/20/2022 14:53:56;0c06;Server@coulomb;TPP;Server@coulomb(Thread 0);Connected to pbs_comm coulomb:17001
11/20/2022 14:53:56;0002;Server@coulomb;n/a;setup_env;read environment from /var/spool/pbs/pbs_environment
11/20/2022 14:53:56;0000;Server@coulomb;Svr;Server@coulomb;Supported authentication method: resvport
11/20/2022 14:53:56;0002;Server@coulomb;Svr;Server@coulomb;Stopping PBS dataservice
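
The log lines above are semicolon-delimited, and the message text appears to start at the sixth field. As a minimal sketch under that assumption (the function name is made up for illustration), the dataservice and pbs_status_db messages can be pulled out on their own so the start/stop sequence is easier to follow:

```shell
# Print "timestamp | message" for log lines that mention the dataservice
# or pbs_status_db. Reads a PBS log on stdin.
# Assumption: field 1 is the timestamp and the message starts at field 6,
# as in the lines quoted above. The message may itself contain semicolons,
# so fields 6..NF are joined back together.
pbs_db_events() {
    awk -F';' '{
        msg = $6
        for (i = 7; i <= NF; i++) msg = msg ";" $i
        if (msg ~ /pbs_status_db|dataservice/) print $1 " | " msg
    }'
}

# Usage (log path taken from the post):
# pbs_db_events < /var/spool/pbs/server_logs/20221120
```

On the log shown here this would list the exit code line, then the dataservice being started, connected to, and stopped again three seconds later.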

Regarding ports:

I have firewalld stopped and disabled.
I don’t have SELinux installed.
I have also disabled ufw (the Ubuntu firewall).
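
With the firewalls off, "Connection refused" usually points to nothing listening on the port rather than something blocking it. A rough sketch for checking that, assuming `ss` from iproute2 is available and that the server port is 15001 (the port the MoM log in this post reports for the server); the function name is hypothetical:

```shell
# Succeed if the `ss -ltn` output on stdin shows a TCP listener on the given port.
# Assumption: the local address:port is the fourth column, as iproute2 prints it.
has_listener() {
    awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Usage:
# ss -ltn | has_listener 15001 && echo "listening" || echo "nothing on 15001"
```

If nothing is listening, the problem is with pbs_server itself not staying up, not with the network path to it.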

Some relevant outputs:

/etc/pbs.conf

PBS_SERVER=coulomb
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=1
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/usr/bin/scp

/etc/hosts

127.0.0.1       localhost

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

10.65.XX.XXX    coulomb # numbers have been replaced with X

hostname / hostname -A
coulomb

hostname -i
10.65.XX.XXX # numbers have been replaced with X

sudo /etc/init.d/pbs start

Starting PBS
/opt/pbs/sbin/pbs_comm ready (pid=12984), Proxy Name:coulomb:17001, Threads:4
PBS comm
PBS mom
PBS sched
Connecting to PBS dataservice...connected to PBS dataservice@coulomb
PBS server

sudo /etc/init.d/pbs status

pbs_server is not running
pbs_mom is pid 12994
pbs_sched is pid 13005
pbs_comm is 12984

ps -ef | grep pbs_

root       12984       1  0 16:24 ?        00:00:00 /opt/pbs/sbin/pbs_comm
root       12994       1  0 16:24 ?        00:00:00 /opt/pbs/sbin/pbs_mom
root       13005       1  0 16:24 ?        00:00:00 /opt/pbs/sbin/pbs_sched
ali        13233    5207  0 16:27 pts/3    00:00:00 grep --color=auto pbs_

pbs_hostn -v $PBS_SERVER

aliases:            -none-
     address length:  4 bytes
     address:         10.65.XX.XXX   (2200060170 dec)  name:  coulomb # numbers replaced by X

Other OpenPBS logs:

Some numbers in the IP addresses have been replaced with X.

/var/spool/pbs/mom_logs/20221120

11/20/2022 16:02:20;0002;pbs_mom;Svr;Log;Log opened
11/20/2022 16:02:20;0002;pbs_mom;Svr;pbs_mom;pbs_version=20.0.0
11/20/2022 16:02:20;0002;pbs_mom;Svr;pbs_mom;pbs_build=mach=N/A:security=N/A:configure_args=N/A
11/20/2022 16:02:20;0002;pbs_mom;Svr;pbs_mom;hostname=coulomb;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
11/20/2022 16:02:20;0002;pbs_mom;Svr;pbs_mom;ipv4 interface lo: localhost
11/20/2022 16:02:20;0002;pbs_mom;Svr;pbs_mom;ipv4 interface enp6s0: coulomb
11/20/2022 16:02:20;0002;pbs_mom;Svr;pbs_mom;ipv6 interface lo: ip6-loopback
11/20/2022 16:02:20;0002;pbs_mom;Svr;pbs_mom;ipv6 interface enp6s0: coulomb
11/20/2022 16:02:20;0100;pbs_mom;Svr;parse_config;file config
11/20/2022 16:02:20;0002;pbs_mom;Svr;pbs_mom;Adding IP address 10.65.XX.XXX as authorized
11/20/2022 16:02:20;0002;pbs_mom;n/a;set_restrict_user_maxsys;setting 999
11/20/2022 16:02:20;0002;pbs_mom;n/a;read_config;max_check_poll = 120, min_check_poll = 10
11/20/2022 16:02:20;0002;pbs_mom;Svr;pbs_mom;Adding IP address 127.0.0.1 as authorized
11/20/2022 16:02:20;0002;pbs_mom;Svr;set_checkpoint_path;Using default checkpoint path.
11/20/2022 16:02:20;0002;pbs_mom;Svr;set_checkpoint_path;Setting checkpoint path to /var/spool/pbs/checkpoint/
11/20/2022 16:02:20;0002;pbs_mom;n/a;ncpus;hyperthreading enabled
11/20/2022 16:02:20;0002;pbs_mom;n/a;initialize;pcpus=12, OS reports 12 cpu(s)
11/20/2022 16:02:20;0d80;pbs_mom;TPP;pbs_mom(Main Thread);TPP authentication method = resvport
11/20/2022 16:02:20;0c06;pbs_mom;TPP;pbs_mom(Main Thread);TPP leaf node names = 10.65.XX.XXX:15003,127.0.0.1:15003,10.65.XX.XXX:15003
11/20/2022 16:02:20;0d80;pbs_mom;TPP;pbs_mom(Main Thread);Initializing TPP transport Layer
11/20/2022 16:02:20;0d80;pbs_mom;TPP;pbs_mom(Main Thread);Max files allowed = 16384
11/20/2022 16:02:20;0d80;pbs_mom;TPP;pbs_mom(Main Thread);TPP initialization done
11/20/2022 16:02:20;0d80;pbs_mom;TPP;pbs_mom(Main Thread);Connecting to pbs_comm coulomb:17001
11/20/2022 16:02:20;0c06;pbs_mom;TPP;pbs_mom(Thread 0);Thread ready
11/20/2022 16:02:20;0006;pbs_mom;Fil;pbs_mom;Version 20.0.0, started, initialization type = 0
11/20/2022 16:02:20;0002;pbs_mom;Svr;pbs_mom;Mom pid = 11693 ready, using ports Server:15001 MOM:15002 RM:15003
11/20/2022 16:02:20;0c06;pbs_mom;TPP;pbs_mom(Thread 0);Registering address 10.65.XX.XXX:15003 to pbs_comm coulomb:17001
11/20/2022 16:02:20;0c06;pbs_mom;TPP;pbs_mom(Thread 0);Connected to pbs_comm coulomb:17001
11/20/2022 16:02:20;0001;pbs_mom;Svr;net_restore_handler;net restore handler called
11/20/2022 16:02:22;0002;pbs_mom;Svr;pbs_mom;HELLO sent to server at coulomb:15001, stream:0
11/20/2022 16:02:22;0001;pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 10.65.XX.XXX:15001 on stream 0
11/20/2022 16:02:22;0002;pbs_mom;Svr;im_eof;Server closed connection.
11/20/2022 16:02:24;0002;pbs_mom;Svr;pbs_mom;HELLO sent to server at coulomb:15001, stream:1
11/20/2022 16:02:25;0001;pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 10.65.XX.XXX:15001 on stream 1
11/20/2022 16:02:25;0002;pbs_mom;Svr;im_eof;Server closed connection.

/var/spool/pbs/comm_logs/20221120

11/20/2022 16:02:20;0002;Comm@coulomb;Svr;Log;Log opened
11/20/2022 16:02:20;0002;Comm@coulomb;Svr;Comm@coulomb;pbs_version=20.0.0
11/20/2022 16:02:20;0002;Comm@coulomb;Svr;Comm@coulomb;pbs_build=mach=N/A:security=N/A:configure_args=N/A
11/20/2022 16:02:20;0002;Comm@coulomb;Svr;Comm@coulomb;hostname=coulomb;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
11/20/2022 16:02:20;0002;Comm@coulomb;Svr;Comm@coulomb;ipv4 interface lo: localhost
11/20/2022 16:02:20;0002;Comm@coulomb;Svr;Comm@coulomb;ipv4 interface enp6s0: coulomb
11/20/2022 16:02:20;0002;Comm@coulomb;Svr;Comm@coulomb;ipv6 interface lo: ip6-loopback
11/20/2022 16:02:20;0002;Comm@coulomb;Svr;Comm@coulomb;ipv6 interface enp6s0: coulomb
11/20/2022 16:02:20;0002;Comm@coulomb;Svr;Comm@coulomb;/opt/pbs/sbin/pbs_comm ready (pid=11683), Proxy Name:coulomb:17001, Threads:4
11/20/2022 16:02:20;0000;Comm@coulomb;Svr;Comm@coulomb;Supported authentication method: resvport
11/20/2022 16:02:20;0c06;Comm@coulomb;TPP;Comm@coulomb(Thread 0);Thread ready
11/20/2022 16:02:20;0c06;Comm@coulomb;TPP;Comm@coulomb(Thread 1);Thread ready
11/20/2022 16:02:20;0c06;Comm@coulomb;TPP;Comm@coulomb(Thread 3);Thread ready
11/20/2022 16:02:20;0c06;Comm@coulomb;TPP;Comm@coulomb(Thread 2);Thread ready
11/20/2022 16:02:20;0c06;Comm@coulomb;TPP;Comm@coulomb(Thread 1);tfd=14, Leaf registered address 10.65.XX.XXX:15003
11/20/2022 16:02:23;0c06;Comm@coulomb;TPP;Comm@coulomb(Thread 2);tfd=16, Leaf registered address 10.65.XX.XXX:15001
11/20/2022 16:02:25;0c06;Comm@coulomb;TPP;Comm@coulomb(Thread 2);tfd=16, Connection from leaf 10.65.XX.XXX:15001 down

/var/spool/pbs/sched_logs/20221120

11/20/2022 16:02:20;0002;pbs_sched;Svr;Log;Log opened
11/20/2022 16:02:20;0002;pbs_sched;Svr;pbs_sched;pbs_version=20.0.0
11/20/2022 16:02:20;0002;pbs_sched;Svr;pbs_sched;pbs_build=mach=N/A:security=N/A:configure_args=N/A
11/20/2022 16:02:20;0002;pbs_sched;Svr;pbs_sched;hostname=coulomb;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
11/20/2022 16:02:20;0002;pbs_sched;Svr;pbs_sched;ipv4 interface lo: localhost
11/20/2022 16:02:20;0002;pbs_sched;Svr;pbs_sched;ipv4 interface enp6s0: coulomb
11/20/2022 16:02:20;0002;pbs_sched;Svr;pbs_sched;ipv6 interface lo: ip6-loopback
11/20/2022 16:02:20;0002;pbs_sched;Svr;pbs_sched;ipv6 interface enp6s0: coulomb
11/20/2022 16:02:20;0002;pbs_sched;n/a;setup_env;read environment from /var/spool/pbs/pbs_environment
11/20/2022 16:02:20;0006;pbs_sched;Fil;pbs_sched;Version 20.0.0, started, initialization type = 0
11/20/2022 16:02:20;0002;pbs_sched;Svr;sched_main;/opt/pbs/sbin/pbs_sched startup pid 11704
11/20/2022 16:02:20;0040;pbs_sched;Fil;fairshare usage;Creating usage database for fairshare
11/20/2022 16:02:20;0080;pbs_sched;Req;;Launching 6 worker threads

Your help would be greatly appreciated!

The pbs_version reported here is 20.0.0.
Please check this link: Compiled latest version v22.05.11 prints "pbs_version = 20.0.0" - #2 by mkaro
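
To see which version a daemon actually logs, before or after a rebuild, something like the following may help. It is only a sketch: it assumes the log contains a line ending in pbs_version=<value>, as in the server log quoted in the first post, and the function name is made up for illustration.

```shell
# Print the first pbs_version value found in a PBS daemon log read on stdin.
# Prints nothing if no such line exists.
# Assumption: the version is the last thing on the line, after ";pbs_version=".
logged_pbs_version() {
    sed -n '/;pbs_version=/{
        s/.*;pbs_version=//p
        q
    }'
}

# Usage (log path from the first post; 22.05.11 is the version that was installed):
# v=$(logged_pbs_version < /var/spool/pbs/server_logs/20221120)
# [ "$v" = "22.05.11" ] || echo "log reports '$v', expected 22.05.11"
```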

Dear Adarsh,

Sorry for the delayed response. This has fixed the issue.

Thank you,

Ali

Dear Ali,

Hello, I have the same problem as you. How did you fix it?

I followed the information in the link provided by Adarsh:
Compiled latest version v22.05.11 prints “pbs_version = 20.0.0” - #2 by mkaro:

./configure PBS_VERSION=22.05.11 --prefix=/opt/pbs

Thanks, but the problem still exists.