Installation issue

Hi, I just tried installing OpenPBS, but the services don’t start. I am getting the following errors:

qmgr: cannot connect to server
Connection refused
qmgr: cannot connect to server
Connection refused
qmgr: cannot connect to server
Connection refused
qmgr: cannot connect to server
Connection refused
qmgr: cannot connect to server
Connection refused
qmgr: cannot connect to server
Connection refused
qterm: could not connect to server (15010)
cp: cannot stat ‘/usr/pgsql-14.8/lib/': No such file or directory
cp: cannot stat '/usr/pgsql-14.8/lib/
’: No such file or directory
cp: cannot stat ‘/usr/pgsql-14.8/share/timezonesets/': No such file or directory
cp: cannot stat '/usr/pgsql-14.8/share/timezonesets/
’: No such file or directory
cp: cannot stat ‘/usr/lib/postgresql/14/bin/pg_resetxlog’: No such file or directory
*** End of /opt/pbs/libexec/pbs_habitat
Home directory /var/spool/pbs updated.

Thinking something was wrong with my PostgreSQL installation, I tried version 15. Now I am getting this error:

sudo /etc/init.d/pbs start

qterm: could not connect to server (15010)
cp: cannot stat ‘/usr/pgsql-15.3/lib/': No such file or directory
cp: cannot stat '/usr/pgsql-15.3/lib/
’: No such file or directory
cp: cannot stat ‘/usr/pgsql-15.3/share/timezonesets/': No such file or directory
cp: cannot stat '/usr/pgsql-15.3/share/timezonesets/
’: No such file or directory
cp: cannot stat ‘/usr/lib/postgresql/15/bin/pg_resetxlog’: No such file or directory
*** End of /opt/pbs/libexec/pbs_habitat
Home directory /var/spool/pbs updated.
/opt/pbs/sbin/pbs_comm ready (pid=422450), Proxy Name: xxx.url 17001, Threads:4
PBS comm
PBS mom
PBS sched

Another issue is that only pbs_sched runs:

sudo /etc/init.d/pbs status

pbs_server is not running
pbs_mom is not running
pbs_sched is pid 422460
pbs_comm is not running

My system is Ubuntu 22.04.2 LTS with kernel 5.19.0-41-generic. It is a single computer (no separate nodes).

How can I get OpenPBS working?

After making some changes to /etc/hosts, I am still getting the following error:

$ qstat -Bf

Connection refused
qstat: cannot connect to server xxx.yyy.zzz (errno=15010)

pbs_mom, pbs_sched and pbs_comm are now running, but pbs_server is not:

$ sudo /etc/init.d/pbs status

pbs_server is not running
pbs_mom is pid 9423
pbs_sched is pid 9435
pbs_comm is 9413

How do I get pbs_server up and running now?

Ping the server to see if it's visible.
Check your firewall rules.
Perhaps turn off any firewall while you try to get it running.
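
A rough sketch of those checks from the shell, in case it helps. It assumes pbs_server is on its default port 15001 and that `ss` (from iproute2) is available; `port_is_listening` is just a hypothetical helper name, and xxx.yyy.zzz is the placeholder hostname from the qstat error.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the `ss -ltn` output on stdin shows
# a listener on the given TCP port. Reading stdin keeps it easy to try out.
port_is_listening() {
    port="$1"
    # Column 4 of `ss -ltn` is "Local Address:Port"; match a trailing ":PORT".
    # NR > 1 skips the header line.
    awk -v p=":$port" 'NR > 1 && $4 ~ (p "$") { found = 1 } END { exit !found }'
}

# Usage (assumes the default pbs_server port, 15001):
#   ss -ltn | port_is_listening 15001 && echo "listening" || echo "nothing on 15001"
#   ping -c 3 xxx.yyy.zzz    # placeholder hostname
#   sudo ufw status          # only relevant if ufw is the firewall in use
```

If nothing is listening on the port, the firewall is probably not the issue and pbs_server itself never came up.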

When I’ve seen this, it has often been because the server could not start the postgres database. What does the log file in /var/spool/pbs/server_logs say? When pbs_server startup fails, I get something like:

04/22/2023 17:32:11.131488;0002;Server@server2;Svr;Server@server2;Starting PBS dataservice
04/22/2023 17:32:25.025908;0002;Server@server2;Svr;Server@server2;Starting PBS dataservice
04/22/2023 17:32:37.969856;0002;Server@server2;Svr;Server@server2;pbs_status_db exit code 1
04/22/2023 17:32:37.969909;0002;Server@server2;Svr;Server@server2;Starting PBS dataservice
04/22/2023 17:32:52.891439;0002;Server@server2;Svr;Server@server2;Starting PBS dataservice
04/22/2023 17:33:05.716181;0002;Server@server2;Svr;Server@server2;pbs_status_db exit code 1
04/22/2023 17:33:05.716197;0002;Server@server2;Svr;Server@server2;Starting PBS dataservice ...
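
To check for the same pattern in your own log, something like the following might do. It assumes PBS_HOME is /var/spool/pbs (as your output suggests) and that the server log files are named by date (YYYYMMDD); `count_db_failures` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count lines on stdin that report a non-zero
# exit code from pbs_status_db, as in the log excerpt above.
count_db_failures() {
    # grep -c prints 0 but returns non-zero when nothing matches,
    # so "|| true" keeps the helper usable under `set -e`.
    grep -c 'pbs_status_db exit code [1-9]' || true
}

# Usage (assumes today's log exists under the default PBS_HOME):
#   sudo cat "/var/spool/pbs/server_logs/$(date +%Y%m%d)" | count_db_failures
#   sudo tail -n 50 "/var/spool/pbs/server_logs/$(date +%Y%m%d)"
```

A non-zero count would point at the dataservice rather than at networking.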

Sometimes, there is a problem with the permissions on a postgres directory:

sudo chmod 1777 /var/run/postgresql/

[This is for CentOS. Ubuntu might be different]
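
Before changing anything, it may be worth looking at what the mode currently is. A small sketch, assuming GNU `stat` (the `-c` option); `check_mode` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print "ok" if DIR has the expected octal mode,
# otherwise print the mode it actually has.
check_mode() {
    dir="$1"
    want="$2"
    # GNU stat: %a prints the permission bits in octal (e.g. 1777).
    have=$(stat -c '%a' "$dir") || return 1
    if [ "$have" = "$want" ]; then
        echo "ok"
    else
        echo "mode is $have, expected $want"
    fi
}

# Usage:
#   check_mode /var/run/postgresql 1777
```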

Other times, the trouble is more complicated, and I debug it by manually running the following to see where it fails:

sudo /opt/pbs/sbin/pbs_dataservice start
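
The cp errors in the original post name several PostgreSQL paths, so it may also help to see which of them actually exist on the machine. This is only a sketch: `report_paths` is a hypothetical helper name, and the pg_resetwal path in the usage line is my own addition (as far as I know, PostgreSQL renamed pg_resetxlog to pg_resetwal in version 10), not something from your output.

```shell
#!/bin/sh
# Hypothetical helper: report for each argument whether the path exists.
report_paths() {
    for p in "$@"; do
        if [ -e "$p" ]; then
            echo "found   $p"
        else
            echo "missing $p"
        fi
    done
}

# Usage (first three paths copied from the cp errors; the last is an assumption):
#   report_paths /usr/pgsql-15.3/lib \
#                /usr/pgsql-15.3/share/timezonesets \
#                /usr/lib/postgresql/15/bin/pg_resetxlog \
#                /usr/lib/postgresql/15/bin/pg_resetwal
```

If every path under /usr/pgsql-15.3 is missing, the startup script may be looking somewhere other than where Ubuntu's packages install PostgreSQL.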