Hi,
I’m trying PBS Pro for the first time, aiming to replace Torque, which is no longer open source.
I followed the INSTALL file instructions and can build the code on a CentOS 6.10 test machine from scratch.
However, when I try to start the service, I get the following error:
[root@c6 libexec]# service pbs restart
Restarting PBS
Stopping PBS
PBS sched - was pid: 5019
PBS comm - was pid: 5004
Waiting for shutdown to complete
/sbin/chkconfig
Starting PBS
/cluster/pbspro/sbin/pbs_comm ready (pid=6007), Proxy Name:c6:17001, Threads:4
PBS comm
Creating usage database for fairshare.
PBS sched
Connecting to PBS dataservice…connected to PBS dataservice@c6
Server@c6: Server@c6, Failed to initialize PBS dataservice:[Prepare of statement insert_job failed: ERROR: relation "pbs.job" does not exist
LINE 1: insert into pbs.job (ji_jobid,ji_state,ji_substate,ji_svrfla…
^]
pbs_server startup failed, exit 255 aborting.
Did I miss something in the process?
Mike Chen
Research Assistant
Dept. of Atmospheric Science
National Taiwan University
The problem does not occur in v14.1.2.
I’m not sure about the cause; maybe a PostgreSQL version compatibility issue?
Since the target system is CentOS 6, I’ll test with v14.1.2 first.
Input welcome.
Hi,
I have not seen this problem myself, but since the error is related to the db schema, you are right that a PostgreSQL incompatibility might be causing it.
Some database-specific changes were made in version 19, and they require the postgresql-contrib package to be installed on your system. Can you please install that package and try rebuilding and installing PBS again?
Hi,
Thanks for the reply!
I tried with v18.1.3 and it works.
So, as you expected, the problem is on v19 only.
I tried v19.1.1 again, but found that the postgresql-contrib package was already installed when the problem occurred.
Both postgresql and postgresql-contrib are at version 8.4.20-8.el6.9.
Yes, you are right: the reason you are hitting this issue at startup is the database version, 8.4.20.
In the latest version, PBS Pro v19.1, we started using the “hstore” module, a new dependency that was not required in PBS Pro v14.1 or v18.1.
I tried the same platform, CentOS 6, with the same Postgres version 8.4.20 plus the contrib package, and hit the same issue while starting the services.
The current database schema and scripts are compatible only with Postgres versions that support the “create extension” feature (say, from 9.4 onwards).
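A quick pre-flight check along these lines can tell you whether the psql on a machine meets that threshold. This is only a sketch: the function name version_at_least is made up for illustration, the 9.4 cut-off is simply the figure quoted above, and it assumes that psql --version prints the version as its third field (for example "psql (PostgreSQL) 9.4.21").

```shell
#!/bin/sh
# Hypothetical helper: succeed if dotted version $1 >= dotted version $2.
# Only the major and minor components are compared.
version_at_least() {
    have_major=${1%%.*}
    want_major=${2%%.*}
    # Strip the major part; default the minor to 0 when there is no dot.
    case $1 in *.*) have_rest=${1#*.} ;; *) have_rest=0 ;; esac
    case $2 in *.*) want_rest=${2#*.} ;; *) want_rest=0 ;; esac
    have_minor=${have_rest%%.*}
    want_minor=${want_rest%%.*}
    if [ "$have_major" -ne "$want_major" ]; then
        [ "$have_major" -gt "$want_major" ]
        return $?
    fi
    [ "$have_minor" -ge "$want_minor" ]
}

# Only query psql if it is actually installed on this machine.
if command -v psql >/dev/null 2>&1; then
    installed=$(psql --version | awk '{print $3}')
    if version_at_least "$installed" 9.4; then
        echo "psql $installed: meets the 9.4 threshold quoted above"
    else
        echo "psql $installed: older than the 9.4 threshold quoted above"
    fi
fi
```

The threshold is passed as an argument so it can be adjusted if the real minimum turns out to be different.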
Hi,
Great to have the cause confirmed!
For curiosity’s sake, I tried to build PostgreSQL 9.4.21 from source on CentOS 6, and installed it in /opt/pgsql.
But PBS Pro’s configure.sh can’t find the database headers:
checking for PBS database directory… configure: error: Database headers not found.
I tried adding CPPFLAGS in configure.sh, but that did not work.
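For anyone hitting the same configure error: as far as I can tell, the check boils down to looking for libpq-fe.h under the database directory, and the PBS Pro configure script takes that directory from a --with-database-dir option rather than from CPPFLAGS. Treat both points as assumptions and confirm them with ./configure --help. The helper name find_libpq_header and the three include sub-directories it probes are my own guesses for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the directory under prefix $1 that holds
# libpq-fe.h, or fail if none of the probed locations has it.
find_libpq_header() {
    for sub in include include/pgsql include/postgresql; do
        if [ -r "$1/$sub/libpq-fe.h" ]; then
            echo "$1/$sub"
            return 0
        fi
    done
    return 1
}

# /opt/pgsql is the install prefix mentioned in the post.
if header_dir=$(find_libpq_header /opt/pgsql); then
    echo "libpq headers found in $header_dir"
    # --with-database-dir is an assumption; check ./configure --help.
    echo "try: ./configure --prefix=/opt/pbs --with-database-dir=/opt/pgsql"
else
    echo "no libpq-fe.h under /opt/pgsql; was 'make install' run there?"
fi
```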
Hi!
Thanks for the info!
I can now compile pbspro with source-built postgreSQL 9.4.21 on CentOS 6.
I did some searching, and it looks like PostgreSQL can’t be run as root.
So I created a system account postgres with:
useradd postgres -d /var/lib/pgsql -m -r
After running pbs_postinstall, I got this error when starting the pbs service:
PBS Home directory /var/spool/pbs needs updating.
Running /opt/pbs/libexec/pbs_habitat to update it.
*** psql command is not in PATH
But psql actually is in PATH:
[root@c6 pbspro]# which psql
/opt/pgsql/bin/psql
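One possible explanation, which nobody in the thread confirmed: on CentOS 6 the service wrapper starts init scripts with a clean environment and a fixed PATH (/sbin:/usr/sbin:/bin:/usr/bin, as far as I know), so a psql in /opt/pgsql/bin is visible to an interactive root shell but not to the script. Below is a minimal sketch for testing that idea; cmd_in_path is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if command $1 can be found when PATH is $2.
# The lookup runs in a subshell so the caller's PATH is left untouched.
cmd_in_path() {
    ( PATH=$2; command -v "$1" >/dev/null 2>&1 )
}

# PATH that /sbin/service is assumed to hand to init scripts on CentOS 6.
service_path=/sbin:/usr/sbin:/bin:/usr/bin

if cmd_in_path psql "$PATH"; then
    echo "psql is visible in this shell"
fi
if ! cmd_in_path psql "$service_path"; then
    echo "psql is NOT visible with the service PATH"
fi
```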
Anyway, following the message, I ran pbs_habitat.
It does not complete but gets stuck at this step:
Connecting to PBS dataservice…
From the “ps ax|grep post” output, I can see that the PostgreSQL server has started and the database is initialized:
[root@c6 pbspro]# ps ax|grep post
1260 ? Ss 0:00 /usr/libexec/postfix/master
27225 ? S 0:00 /opt/pgsql/bin/postgres -D /var/spool/pbs/datastore -p 15007
27228 ? Ss 0:00 postgres: logger process
27230 ? Ss 0:00 postgres: checkpointer process
27231 ? Ss 0:00 postgres: writer process
27232 ? Ss 0:00 postgres: wal writer process
27233 ? Ss 0:00 postgres: autovacuum launcher process
27234 ? Ss 0:00 postgres: stats collector process
So I have no idea why pbs_habitat gets stuck.
Looking for more help on this; I think I’m quite close.
This might be a system issue: since, as you mentioned, the “psql” command is in the path, it should run without any complaint.
My guess is that the following steps should help.
Stop the running postgres.
Remove the PBS HOME dir (rm -rf /var/spool/pbs).
If you installed the source-built pgsql into its own directory (one containing the bin, include, lib and share sub-directories), copy that whole directory to /opt/pbs/pgsql and chown it to the database user.
steps:
a) mkdir /opt/pbs/pgsql
b) cp -R <path/to/source/built>/. /opt/pbs/pgsql/
c) chown -R postgres:postgres /opt/pbs/pgsql
Run pbs_habitat (/opt/pbs/libexec/pbs_habitat).
Run /opt/pbs/libexec/pbs_postinstall.
Please share your feedback after following these steps.
Also, I see from your earlier logs that you configured the prefix as “/cluster/pbspro/”, so please replace “/opt/pbs/” with your configured prefix in the steps above. I gave the steps with the default path where PBS Pro installs.
The idea here is to provide another copy of the built binaries in the <PBS_EXEC>/pgsql directory (/opt/pbs/pgsql by default), so that the pbs_habitat script can pick up the path from either location.
Ouch!
I tried your steps but the problem remains.
Postgres is running, so I tried connecting to the DB directly, but look:
[root@c6 postgresql-9.4.21]# psql -h c6 -p 15007
psql: could not connect to server: Network is unreachable
Is the server running on host “c6” (192.168.20.6) and accepting
TCP/IP connections on port 15007?
The firewall is off; then I found that the problem is in name resolution.
The IP in /etc/hosts is incorrect.
pbs_habitat can connect to the DB now that the IP is fixed.
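For anyone debugging the same hang, a small check like this shows what /etc/hosts says about the server name so it can be compared with the machine’s real address. It is only a sketch: hosts_lookup is a made-up helper, it reads the hosts file directly rather than going through the full resolver, and c6 is just the host name from the output above.

```shell
#!/bin/sh
# Hypothetical helper: print the first address that hosts file $2 maps
# to name $1. Comments and blank lines are skipped.
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                 # drop trailing comments
        NF < 2 { next }                    # skip blank or address-only lines
        { for (i = 2; i <= NF; i++)
              if ($i == name) { print $1; exit } }
    ' "$2"
}

# c6 is the server name from the thread.
addr=$(hosts_lookup c6 /etc/hosts 2>/dev/null) || addr=""
if [ -n "$addr" ]; then
    echo "c6 -> $addr according to /etc/hosts"
    echo "compare with the addresses listed by: ip -4 addr show"
else
    echo "c6 has no entry in /etc/hosts"
fi
```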
However… now I have that “Failed to initialize PBS dataservice” problem again.
The postgres version is confirmed to be 9.4.21:
Connecting to PBS dataservice…connected to PBS dataservice@c6
Server@c6: Server@c6, Failed to initialize PBS dataservice:[Prepare of statement insert_job failed: ERROR: relation "pbs.job" does not exist
LINE 1: insert into pbs.job (ji_jobid,ji_state,ji_substate,ji_svrfla…
^]
*** Error starting pbs server
I would like to see your Postgres compilation steps.
I have just built Postgres 9.4.21 on CentOS 6 and tried to create the hstore extension. It worked fine; next I will build PBS Pro against this compiled pgsql.
To validate your pgsql compilation, please connect to the DB and try these steps.
postgres=# select version();
version
PostgreSQL 9.4.21 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23), 64-bit
(1 row)
postgres=# create extension hstore;
postgres=# \dx
                           List of installed extensions
  Name   | Version |   Schema   |                   Description
---------+---------+------------+--------------------------------------------------
 hstore  | 1.3     | public     | data type for storing sets of (key, value) pairs
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)
Please follow the instructions in the Postgres INSTALL file, and also compile and install the extensions.
You need to run a) ./configure, b) make world, and c) make install-world.
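Spelled out as a script, those steps plus a check that hstore really got installed might look like the sketch below. build_postgres_world and hstore_installed are made-up names, and the check assumes the usual layout where extension control files live in the extension sub-directory of the path printed by pg_config --sharedir.

```shell
#!/bin/sh
# Hypothetical wrapper around the three steps; run it from the unpacked
# PostgreSQL source tree with the install prefix as $1. It is defined
# here but not called.
build_postgres_world() {
    ./configure --prefix="$1" &&
    make world &&            # server plus contrib modules (hstore) and docs
    make install-world
}

# Hypothetical check: succeed if share directory $1 has the hstore
# control file that CREATE EXTENSION hstore needs.
hstore_installed() {
    [ -f "$1/extension/hstore.control" ]
}

# Only run the check when a pg_config is reachable.
if command -v pg_config >/dev/null 2>&1; then
    sharedir=$(pg_config --sharedir)
    if hstore_installed "$sharedir"; then
        echo "hstore is installed under $sharedir"
    else
        echo "hstore.control missing: contrib was probably not installed"
    fi
fi
```

Make sure the pg_config being run is the one from the source-built install, not one from a distribution package.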
Sorry, I didn’t see your reply before my update!
To clear things up: once I understood the process better (I think), I recreated the test environment from scratch, starting from a fresh install of CentOS.
I have used --prefix=/opt/pbs since then.
So the recent test results are not from the environment I began this thread with.
It’s weird that the problem remains even when I install Postgres directly into /opt/pbs/pgsql (and then chown it).
The symptom is the same with PostgreSQL 8.4 from the repository, 9.4.21 from source, and 9.6.12 from source.
My guess is that the DB creation by Postgres is okay and PBS Pro can connect to the DB, but the problem happens when it tries to create the DB structures.
We already know that Postgres 8.4 lacks the features PBS Pro 19.x requires, but that can’t explain why it fails with the same message on source-built 9.4.21 and 9.6.12.
What did I miss? Do I have to install some extra Postgres features or modules?
*** Postinstall script called as follows:
*** /opt/pbs/libexec/pbs_postinstall ''
*** No configuration file found.
*** Creating new configuration file: /etc/pbs.conf
*** Replacing /etc/pbs.conf with /etc/pbs.conf.19.0.0
*** /etc/pbs.conf has been created.
*** Registering PBS Pro as a service.
*** Systemctl binary is not available; Failed to register PBS Pro as a service
*** PBS_HOME is /var/spool/pbs
*** Setting TZ from /etc/sysconfig/clock
*** Creating new file /var/spool/pbs/pbs_environment
*** The PBS Pro server has been installed in /opt/pbs/sbin.
*** The PBS Pro scheduler has been installed in /opt/pbs/sbin.
*** The PBS Pro communication agent has been installed in /opt/pbs/sbin.
*** The PBS Pro MOM has been installed in /opt/pbs/sbin.
*** The PBS commands have been installed in /opt/pbs/bin.
*** End of /opt/pbs/libexec/pbs_postinstall
[root@selinux-training pbspro]# vi /etc/pbs.conf
[root@selinux-training pbspro]# /etc/init.d/pbs start
/sbin/chkconfig
Starting PBS
PBS Home directory /var/spool/pbs needs updating.
Running /opt/pbs/libexec/pbs_habitat to update it.
*** Setting default queue and resource limits.
Connecting to PBS dataservice…connected to PBS dataservice@selinux-training
*** End of /opt/pbs/libexec/pbs_habitat
Home directory /var/spool/pbs updated.
/opt/pbs/sbin/pbs_comm ready (pid=45156), Proxy Name:selinux-training:17001, Threads:4
PBS comm
PBS mom
Creating usage database for fairshare.
PBS sched
Connecting to PBS dataservice…connected to PBS dataservice@selinux-training
Licenses valid for 10000000 Floating hosts
PBS server
[root@selinux-training pbspro]# cat /etc/*release*
CentOS release 6.8 (Final)
LSB_VERSION=base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
cat: /etc/lsb-release.d: Is a directory
CentOS release 6.8 (Final)
CentOS release 6.8 (Final)
cpe:/o:centos:linux:6:GA