pbs_server not starting

I’ve followed the instructions to make and install OpenPBS. I’m using master.

When I try to start pbs I get the following:

$ sudo /etc/init.d/pbs start
Starting PBS
/opt/pbs/sbin/pbs_comm ready (pid=68973), Proxy Name:localhost:17001, Threads:4
PBS comm
PBS mom
PBS scheduler already running.
Connecting to PBS dataservice...connected to PBS dataservice@localhost
PBS server
$ sudo /etc/init.d/pbs status
pbs_server is pid 72608
pbs_mom is pid 71804
pbs_sched is pid 71395
pbs_comm is 71785
$ sudo /etc/init.d/pbs status
pbs_server is not running
pbs_mom is pid 71804
pbs_sched is pid 71395
pbs_comm is 71785

pbs_server starts and after a few moments it stops again.
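
To make the difference between the two status outputs above explicit, here is a minimal sketch that parses both and reports which daemons disappeared. `parse_status` is a hypothetical helper, not part of PBS, and the regular expression only assumes the line shapes shown above.

```python
import re

# Matches "pbs_server is pid 72608", "pbs_comm is 71785"
# and "pbs_server is not running".
STATUS_RE = re.compile(r"^(pbs_\w+) is (?:pid )?(\d+|not running)$")


def parse_status(text):
    """Map each daemon name to its pid, or to None if it is not running."""
    result = {}
    for line in text.splitlines():
        match = STATUS_RE.match(line.strip())
        if match:
            name, state = match.groups()
            result[name] = None if state == "not running" else int(state)
    return result


FIRST = """\
pbs_server is pid 72608
pbs_mom is pid 71804
pbs_sched is pid 71395
pbs_comm is 71785
"""

SECOND = """\
pbs_server is not running
pbs_mom is pid 71804
pbs_sched is pid 71395
pbs_comm is 71785
"""

before = parse_status(FIRST)
after = parse_status(SECOND)
# Daemons that had a pid in the first snapshot but not in the second.
stopped = [name for name, pid in before.items()
           if pid is not None and after.get(name) is None]
print(stopped)
```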

I’ve done this on a fresh install of Ubuntu.

I’ve made sure I can ssh without a password.
There’s no firewall running (the default on Ubuntu).
I’ve disabled AppArmor (which is apparently Ubuntu’s equivalent of SELinux).

This is a duplicate of issue:

Unfortunately, the poster there only says he “fixed few things about the files and shared directory”, which gives no hint about how to resolve the issue. So I’ve created a new issue, hopefully to trigger a new discussion on how to solve this specific problem.

From that issue, I’ve provided the information he requested:

$ qstat --version
pbs_version = 20.0.0
$ cat /etc/pbs.conf 
PBS_SERVER=paulc-VirtualBox
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=1
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/usr/bin/scp
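
For anyone comparing their own configuration, the settings above can be read with a few lines of Python. This is a minimal sketch, assuming the file holds only simple KEY=VALUE lines as shown; `parse_pbs_conf` and `enabled_daemons` are hypothetical helper names, not PBS tools.

```python
def parse_pbs_conf(text):
    """Parse KEY=VALUE lines (as in /etc/pbs.conf) into a dict."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip()
    return conf


def enabled_daemons(conf):
    """Return the daemons whose PBS_START_* flag is set to 1."""
    prefix = "PBS_START_"
    return sorted(key[len(prefix):].lower() for key, value in conf.items()
                  if key.startswith(prefix) and value == "1")


SAMPLE = """\
PBS_SERVER=paulc-VirtualBox
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=1
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
"""

conf = parse_pbs_conf(SAMPLE)
print(conf["PBS_SERVER"], enabled_daemons(conf))
```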

~$ ps -ef | grep pbs_
root         920       1  0 10:53 ?        00:00:00 /opt/pbs/sbin/pbs_comm
root         944       1  0 10:53 ?        00:00:00 /opt/pbs/sbin/pbs_mom
root         983       1  0 10:53 ?        00:00:00 /opt/pbs/sbin/pbs_sched

$ cat /etc/hosts
127.0.0.1	localhost
#127.0.1.1	paulc-VirtualBox
10.0.2.15       paulc-VirtualBox

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
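
The hosts file above maps paulc-VirtualBox to 10.0.2.15 only, because the 127.0.1.1 line is commented out. A minimal sketch for confirming that from the file text follows; `addresses_for` is a hypothetical helper, and comparing names case-insensitively is an assumption of this sketch.

```python
import ipaddress


def addresses_for(hosts_text, name):
    """Return the addresses that hosts-file text maps to `name`."""
    found = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0]      # drop comments
        fields = line.split()
        if len(fields) < 2:
            continue                      # blank or malformed line
        addr, names = fields[0], fields[1:]
        if name.lower() in (n.lower() for n in names):
            found.append(addr)
    return found


HOSTS = """\
127.0.0.1\tlocalhost
#127.0.1.1\tpaulc-VirtualBox
10.0.2.15       paulc-VirtualBox
::1     ip6-localhost ip6-loopback
"""

addrs = addresses_for(HOSTS, "paulc-VirtualBox")
# Any loopback address in this list would mean the name resolves to 127.x.
loopback = [a for a in addrs if ipaddress.ip_address(a).is_loopback]
print(addrs, loopback)
```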

$ cat /var/spool/pbs/datastore/log/pbs_dataservice_log.Tue
2021-10-12 11:26:54.454 AEST [5439] LOG:  database system was shut down at 2021-10-12 11:22:01 AEST
2021-10-12 11:26:54.466 AEST [5426] LOG:  database system is ready to accept connections
2021-10-12 11:26:57.583 AEST [5426] LOG:  received fast shutdown request
2021-10-12 11:26:57.593 AEST [5426] LOG:  aborting any active transactions
2021-10-12 11:26:57.596 AEST [5426] LOG:  background worker "logical replication launcher" (PID 5445) exited with exit code 1
2021-10-12 11:26:57.596 AEST [5440] LOG:  shutting down
2021-10-12 11:26:57.672 AEST [5426] LOG:  database system is shut down

$ cat server_logs/20211012
10/12/2021 11:26:54;0002;Server@paulc-virtualbox;Svr;Log;Log opened
10/12/2021 11:26:54;0002;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;pbs_version=20.0.0
10/12/2021 11:26:54;0002;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;pbs_build=mach=N/A:security=N/A:configure_args=N/A
10/12/2021 11:26:54;0002;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;hostname=paulc-virtualbox;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
10/12/2021 11:26:54;0002;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;ipv4 interface lo: localhost 
10/12/2021 11:26:54;0002;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;ipv4 interface enp0s3: paulc-VirtualBox 
10/12/2021 11:26:54;0002;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;ipv6 interface lo: ip6-loopback 
10/12/2021 11:26:54;0002;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;ipv6 interface enp0s3: paulc-VirtualBox 
10/12/2021 11:26:54;0006;Server@paulc-virtualbox;Fil;Server@paulc-virtualbox;Version 20.0.0, started, initialization type = 1
10/12/2021 11:26:54;0002;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;pbs_status_db exit code 1
10/12/2021 11:26:54;0002;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;Starting PBS dataservice
10/12/2021 11:26:57;0002;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;connected to PBS dataservice@paulc-virtualbox
10/12/2021 11:26:57;0086;Server@paulc-virtualbox;Svr;pbs_python_ext_quick_start_interpreter;--> Python Interpreter quick started, compiled with version:'3.8.10 (default, Sep 28 2021, 16:10:42) 
[GCC 9.3.0]' <--
10/12/2021 11:26:57;0086;Server@paulc-virtualbox;Svr;pbs_python_ext_quick_start_interpreter;--> Inserted Altair PBS Python modules dir '/opt/pbs/lib/python/altair' '/opt/pbs/lib/python/altair/pbs/v1'<--
10/12/2021 11:26:57;0086;Server@paulc-virtualbox;Svr;pbs_python_ext_quick_shutdown_interpreter;--> Stopping Python interpreter <--
10/12/2021 11:26:57;0d80;Server@paulc-virtualbox;TPP;Server@paulc-virtualbox(Main Thread);TPP authentication method = resvport
10/12/2021 11:26:57;0c06;Server@paulc-virtualbox;TPP;Server@paulc-virtualbox(Main Thread);TPP leaf node names = 10.0.2.15:15001,127.0.0.1:15001,10.0.2.15:15001
10/12/2021 11:26:57;0d80;Server@paulc-virtualbox;TPP;Server@paulc-virtualbox(Main Thread);Initializing TPP transport Layer
10/12/2021 11:26:57;0d80;Server@paulc-virtualbox;TPP;Server@paulc-virtualbox(Main Thread);Max files allowed = 16384
10/12/2021 11:26:57;0d80;Server@paulc-virtualbox;TPP;Server@paulc-virtualbox(Main Thread);TPP initialization done
10/12/2021 11:26:57;0d80;Server@paulc-virtualbox;TPP;Server@paulc-virtualbox(Main Thread);Connecting to pbs_comm paulc-VirtualBox:17001
10/12/2021 11:26:57;0002;Server@paulc-virtualbox;n/a;setup_env;read environment from /var/spool/pbs/pbs_environment
10/12/2021 11:26:57;0000;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;Supported authentication method: resvport
10/12/2021 11:26:57;0c06;Server@paulc-virtualbox;TPP;Server@paulc-virtualbox(Thread 0);Thread ready
10/12/2021 11:26:57;0c06;Server@paulc-virtualbox;TPP;Server@paulc-virtualbox(Thread 0);Registering address 10.0.2.15:15001 to pbs_comm paulc-VirtualBox:17001
10/12/2021 11:26:57;0c06;Server@paulc-virtualbox;TPP;Server@paulc-virtualbox(Thread 0);Connected to pbs_comm paulc-VirtualBox:17001
10/12/2021 11:26:57;0002;Server@paulc-virtualbox;Svr;Server@paulc-virtualbox;Stopping PBS dataservice
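
Each log line above consists of six fields separated by semicolons. The following minimal sketch splits a line into those fields so that entries can be filtered or compared across daemons. The field names are labels chosen for this sketch based on the lines above, and `parse_log_line` is a hypothetical helper.

```python
from collections import namedtuple

LogRecord = namedtuple(
    "LogRecord", "timestamp event_code source obj_type obj_name message")


def parse_log_line(line):
    """Split one log line into its six ';'-separated fields.

    Returns None for lines without six fields, such as the continuation
    line "[GCC 9.3.0]' <--" in the server log.
    """
    parts = line.rstrip("\n").split(";", 5)   # the message may contain ';'
    if len(parts) != 6:
        return None
    return LogRecord(*parts)


lines = [
    "10/12/2021 11:26:57;0c06;Server@paulc-virtualbox;TPP;"
    "Server@paulc-virtualbox(Thread 0);Connected to pbs_comm paulc-VirtualBox:17001",
    "[GCC 9.3.0]' <--",
    "10/12/2021 11:26:57;0002;Server@paulc-virtualbox;Svr;"
    "Server@paulc-virtualbox;Stopping PBS dataservice",
]
records = [r for r in map(parse_log_line, lines) if r is not None]
print(records[-1].message)
```

The last record is the final thing the server logged before it exited, which is usually the first place to look.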

/var/spool/pbs$ cat sched_logs/20211012
10/12/2021 11:26:54;0002;pbs_sched;Svr;Log;Log opened
10/12/2021 11:26:54;0002;pbs_sched;Svr;pbs_sched;pbs_version=20.0.0
10/12/2021 11:26:54;0002;pbs_sched;Svr;pbs_sched;pbs_build=mach=N/A:security=N/A:configure_args=N/A
10/12/2021 11:26:54;0002;pbs_sched;Svr;pbs_sched;hostname=paulc-virtualbox;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
10/12/2021 11:26:54;0002;pbs_sched;Svr;pbs_sched;ipv4 interface lo: localhost 
10/12/2021 11:26:54;0002;pbs_sched;Svr;pbs_sched;ipv4 interface enp0s3: paulc-VirtualBox 
10/12/2021 11:26:54;0002;pbs_sched;Svr;pbs_sched;ipv6 interface lo: ip6-loopback 
10/12/2021 11:26:54;0002;pbs_sched;Svr;pbs_sched;ipv6 interface enp0s3: paulc-VirtualBox 
10/12/2021 11:26:54;0002;pbs_sched;n/a;setup_env;read environment from /var/spool/pbs/pbs_environment
10/12/2021 11:26:54;0006;pbs_sched;Fil;pbs_sched;Version 20.0.0, started, initialization type = 0
10/12/2021 11:26:54;0002;pbs_sched;Svr;sched_main;/opt/pbs/sbin/pbs_sched startup pid 5368
10/12/2021 11:26:54;0040;pbs_sched;Fil;fairshare usage;Creating usage database for fairshare

/var/spool/pbs$ cat comm_logs/20211012 
10/12/2021 11:26:54;0002;Comm@paulc-virtualbox;Svr;Log;Log opened
10/12/2021 11:26:54;0002;Comm@paulc-virtualbox;Svr;Comm@paulc-virtualbox;pbs_version=20.0.0
10/12/2021 11:26:54;0002;Comm@paulc-virtualbox;Svr;Comm@paulc-virtualbox;pbs_build=mach=N/A:security=N/A:configure_args=N/A
10/12/2021 11:26:54;0002;Comm@paulc-virtualbox;Svr;Comm@paulc-virtualbox;hostname=paulc-virtualbox;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
10/12/2021 11:26:54;0002;Comm@paulc-virtualbox;Svr;Comm@paulc-virtualbox;ipv4 interface lo: localhost 
10/12/2021 11:26:54;0002;Comm@paulc-virtualbox;Svr;Comm@paulc-virtualbox;ipv4 interface enp0s3: paulc-VirtualBox 
10/12/2021 11:26:54;0002;Comm@paulc-virtualbox;Svr;Comm@paulc-virtualbox;ipv6 interface lo: ip6-loopback 
10/12/2021 11:26:54;0002;Comm@paulc-virtualbox;Svr;Comm@paulc-virtualbox;ipv6 interface enp0s3: paulc-VirtualBox 
10/12/2021 11:26:54;0002;Comm@paulc-virtualbox;Svr;Comm@paulc-virtualbox;/opt/pbs/sbin/pbs_comm ready (pid=5345), Proxy Name:paulc-virtualbox:17001, Threads:4
10/12/2021 11:26:54;0000;Comm@paulc-virtualbox;Svr;Comm@paulc-virtualbox;Supported authentication method: resvport
10/12/2021 11:26:54;0c06;Comm@paulc-virtualbox;TPP;Comm@paulc-virtualbox(Thread 3);Thread ready
10/12/2021 11:26:54;0c06;Comm@paulc-virtualbox;TPP;Comm@paulc-virtualbox(Thread 2);Thread ready
10/12/2021 11:26:54;0c06;Comm@paulc-virtualbox;TPP;Comm@paulc-virtualbox(Thread 1);Thread ready
10/12/2021 11:26:54;0c06;Comm@paulc-virtualbox;TPP;Comm@paulc-virtualbox(Thread 0);Thread ready
10/12/2021 11:26:54;0c06;Comm@paulc-virtualbox;TPP;Comm@paulc-virtualbox(Thread 1);tfd=14, Leaf registered address 10.0.2.15:15003
10/12/2021 11:26:57;0c06;Comm@paulc-virtualbox;TPP;Comm@paulc-virtualbox(Thread 2);tfd=16, Leaf registered address 10.0.2.15:15001
10/12/2021 11:26:59;0c06;Comm@paulc-virtualbox;TPP;Comm@paulc-virtualbox(Thread 2);tfd=16, Connection from leaf 10.0.2.15:15001 down

/var/spool/pbs$ cat mom_logs/20211012 
10/12/2021 11:26:54;0002;pbs_mom;Svr;Log;Log opened
10/12/2021 11:26:54;0002;pbs_mom;Svr;pbs_mom;pbs_version=20.0.0
10/12/2021 11:26:54;0002;pbs_mom;Svr;pbs_mom;pbs_build=mach=N/A:security=N/A:configure_args=N/A
10/12/2021 11:26:54;0002;pbs_mom;Svr;pbs_mom;hostname=paulc-virtualbox;pbs_leaf_name=N/A;pbs_mom_node_name=N/A
10/12/2021 11:26:54;0002;pbs_mom;Svr;pbs_mom;ipv4 interface lo: localhost 
10/12/2021 11:26:54;0002;pbs_mom;Svr;pbs_mom;ipv4 interface enp0s3: paulc-VirtualBox 
10/12/2021 11:26:54;0002;pbs_mom;Svr;pbs_mom;ipv6 interface lo: ip6-loopback 
10/12/2021 11:26:54;0002;pbs_mom;Svr;pbs_mom;ipv6 interface enp0s3: paulc-VirtualBox 
10/12/2021 11:26:54;0100;pbs_mom;Svr;parse_config;file config
10/12/2021 11:26:54;0002;pbs_mom;n/a;set_restrict_user_maxsys;setting 999
10/12/2021 11:26:54;0002;pbs_mom;n/a;read_config;max_check_poll = 120, min_check_poll = 10
10/12/2021 11:26:54;0002;pbs_mom;Svr;pbs_mom;Adding IP address 127.0.0.1 as authorized
10/12/2021 11:26:54;0002;pbs_mom;Svr;pbs_mom;Adding IP address 10.0.2.15 as authorized
10/12/2021 11:26:54;0002;pbs_mom;Svr;set_checkpoint_path;Using default checkpoint path.
10/12/2021 11:26:54;0002;pbs_mom;Svr;set_checkpoint_path;Setting checkpoint path to /var/spool/pbs/checkpoint/
10/12/2021 11:26:54;0002;pbs_mom;n/a;ncpus;hyperthreading disabled
10/12/2021 11:26:54;0002;pbs_mom;n/a;initialize;pcpus=1, OS reports 1 cpu(s)
10/12/2021 11:26:54;0d80;pbs_mom;TPP;pbs_mom(Main Thread);TPP authentication method = resvport
10/12/2021 11:26:54;0c06;pbs_mom;TPP;pbs_mom(Main Thread);TPP leaf node names = 10.0.2.15:15003,127.0.0.1:15003,10.0.2.15:15003
10/12/2021 11:26:54;0d80;pbs_mom;TPP;pbs_mom(Main Thread);Initializing TPP transport Layer
10/12/2021 11:26:54;0d80;pbs_mom;TPP;pbs_mom(Main Thread);Max files allowed = 16384
10/12/2021 11:26:54;0d80;pbs_mom;TPP;pbs_mom(Main Thread);TPP initialization done
10/12/2021 11:26:54;0d80;pbs_mom;TPP;pbs_mom(Main Thread);Connecting to pbs_comm paulc-VirtualBox:17001
10/12/2021 11:26:54;0006;pbs_mom;Fil;pbs_mom;Version 20.0.0, started, initialization type = 0
10/12/2021 11:26:54;0002;pbs_mom;Svr;pbs_mom;Mom pid = 5355 ready, using ports Server:15001 MOM:15002 RM:15003
10/12/2021 11:26:54;0c06;pbs_mom;TPP;pbs_mom(Thread 0);Thread ready
10/12/2021 11:26:54;0c06;pbs_mom;TPP;pbs_mom(Thread 0);Registering address 10.0.2.15:15003 to pbs_comm paulc-VirtualBox:17001
10/12/2021 11:26:54;0c06;pbs_mom;TPP;pbs_mom(Thread 0);Connected to pbs_comm paulc-VirtualBox:17001
10/12/2021 11:26:54;0001;pbs_mom;Svr;net_restore_handler;net restore handler called
10/12/2021 11:26:56;0002;pbs_mom;Svr;pbs_mom;HELLO sent to server at paulc-VirtualBox:15001, stream:0
10/12/2021 11:26:56;0001;pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 10.0.2.15:15001 on stream 0
10/12/2021 11:26:56;0002;pbs_mom;Svr;im_eof;Server closed connection.
10/12/2021 11:26:58;0002;pbs_mom;Svr;pbs_mom;HELLO sent to server at paulc-VirtualBox:15001, stream:1
10/12/2021 11:26:59;0001;pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 10.0.2.15:15001 on stream 1
10/12/2021 11:26:59;0002;pbs_mom;Svr;im_eof;Server closed connection.
10/12/2021 11:27:01;0002;pbs_mom;Svr;pbs_mom;HELLO sent to server at paulc-VirtualBox:15001, stream:2
10/12/2021 11:27:01;0001;pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 10.0.2.15:15001 on stream 2
10/12/2021 11:27:01;0002;pbs_mom;Svr;im_eof;Server closed connection.
10/12/2021 11:27:05;0002;pbs_mom;Svr;pbs_mom;HELLO sent to server at paulc-VirtualBox:15001, stream:3
10/12/2021 11:27:05;0001;pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 10.0.2.15:15001 on stream 3
10/12/2021 11:27:05;0002;pbs_mom;Svr;im_eof;Server closed connection.
10/12/2021 11:27:09;0002;pbs_mom;Svr;pbs_mom;HELLO sent to server at paulc-VirtualBox:15001, stream:4
10/12/2021 11:27:09;0001;pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 10.0.2.15:15001 on stream 4
10/12/2021 11:27:09;0002;pbs_mom;Svr;im_eof;Server closed connection.
10/12/2021 11:27:13;0002;pbs_mom;Svr;pbs_mom;HELLO sent to server at paulc-VirtualBox:15001, stream:5
10/12/2021 11:27:13;0001;pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 10.0.2.15:15001 on stream 5
10/12/2021 11:27:13;0002;pbs_mom;Svr;im_eof;Server closed connection.
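
The MoM log above repeats one pattern: a HELLO is sent to the server, and the server closes the connection. A minimal sketch for pulling out the retry times from a trimmed excerpt of that log follows; `hello_attempts` is a hypothetical helper, and the timestamp format is assumed to be month/day/year as the dates above suggest.

```python
from datetime import datetime


def hello_attempts(log_lines):
    """Return the timestamps of the 'HELLO sent to server' entries."""
    stamps = []
    for line in log_lines:
        fields = line.split(";", 5)
        if len(fields) == 6 and fields[5].startswith("HELLO sent to server"):
            stamps.append(datetime.strptime(fields[0], "%m/%d/%Y %H:%M:%S"))
    return stamps


LOG = """\
10/12/2021 11:26:56;0002;pbs_mom;Svr;pbs_mom;HELLO sent to server at paulc-VirtualBox:15001, stream:0
10/12/2021 11:26:56;0002;pbs_mom;Svr;im_eof;Server closed connection.
10/12/2021 11:26:58;0002;pbs_mom;Svr;pbs_mom;HELLO sent to server at paulc-VirtualBox:15001, stream:1
10/12/2021 11:26:59;0002;pbs_mom;Svr;im_eof;Server closed connection.
10/12/2021 11:27:01;0002;pbs_mom;Svr;pbs_mom;HELLO sent to server at paulc-VirtualBox:15001, stream:2
""".splitlines()

stamps = hello_attempts(LOG)
# Seconds between consecutive HELLO attempts.
gaps = [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
print(len(stamps), gaps)
```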


Please make sure ports 15001 to 15009 and 17001 are not blocked by the firewall, and that your network connection is stable and not dropping packets.

Here’s the output of nmap:

$ sudo nmap -p 15001-15009 localhost
Starting Nmap 7.80 ( https://nmap.org ) at 2021-10-14 09:56 AEST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.0000090s latency).

PORT      STATE  SERVICE
15001/tcp closed unknown
15002/tcp open   onep-tls
15003/tcp open   unknown
15004/tcp closed unknown
15005/tcp closed unknown
15006/tcp closed unknown
15007/tcp closed unknown
15008/tcp closed unknown
15009/tcp closed unknown

So those ports should be clear.
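
A note on reading that output: nmap reports a port as closed when it is reachable but no process is listening on it, and as filtered when something such as a firewall blocks the probe. So 15001 showing closed while 15002 and 15003 are open is consistent with pbs_mom running and pbs_server not running. The same check can be sketched in Python with a plain TCP connect; `port_is_listening` is a hypothetical helper, and the port list is taken from the request above.

```python
import socket


def port_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in list(range(15001, 15010)) + [17001]:
        state = "open" if port_is_listening("127.0.0.1", port) else "not listening"
        print(port, state)
```

Unlike nmap, this sketch cannot tell a closed port from a filtered one; it only says whether something accepted the connection.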

This is a fresh install of Ubuntu.
I’ve ensured the firewall is disabled with sudo ufw disable.
It’s all set up on localhost, so I don’t believe it’s dropping packets.

As a simple test, I first ran sudo /etc/init.d/pbs stop to stop the PBS services. Then I opened two terminals: in one I ran nc -l 15001, and in the other I ran telnet paulc-VirtualBox 15001. I was able to send simple text successfully, so this port is definitely open.
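
For anyone without nc or telnet installed, the same listen-and-connect test can be sketched in Python. This is only an illustration: `roundtrip` is a hypothetical helper, it defaults to 127.0.0.1 and an OS-chosen port, and the host and port must be passed explicitly (with PBS stopped) to mirror the test on 15001.

```python
import socket
import threading


def roundtrip(message, host="127.0.0.1", port=0):
    """Listen on host:port, connect to it, and return what the listener read.

    port=0 lets the OS pick a free port.
    """
    received = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        bound_port = srv.getsockname()[1]

        def accept_once():
            # Accept a single client and read one chunk from it.
            conn, _ = srv.accept()
            with conn:
                received.append(conn.recv(1024))

        worker = threading.Thread(target=accept_once)
        worker.start()
        with socket.create_connection((host, bound_port), timeout=2) as client:
            client.sendall(message)
        worker.join(timeout=2)
    return received[0] if received else None


if __name__ == "__main__":
    print(roundtrip(b"hello\n"))
```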

As an aside, according to IANA, port 15002 is reserved for Open Network Environment TLS.

Why localhost and not 10.0.2.15 (paulc-VirtualBox)?

Simply habit. I’m running everything on paulc-VirtualBox, so localhost is effectively the same host. I get the same result if I use 10.0.2.15 or paulc-VirtualBox with nmap.