How to go from PBS Pro to PBS Community

I have a version of PBSPro installed, but want to replace it with the community version of PBS. I am running RHEL 6 and would like to use the RPMs:

[root@cvmaster2 pbspro-14.1.2-0]# rpm -ivh pbspro-server-14.1.2-0.x86_64.rpm
error: Failed dependencies:
libc.so.6(GLIBC_2.14)(64bit) is needed by pbspro-server-14.1.2-0.x86_64
libhwloc.so.5()(64bit) is needed by pbspro-server-14.1.2-0.x86_64
libical.so.1()(64bit) is needed by pbspro-server-14.1.2-0.x86_64
libpython2.7.so.1.0()(64bit) is needed by pbspro-server-14.1.2-0.x86_64
postgresql-server is needed by pbspro-server-14.1.2-0.x86_64
pbs conflicts with pbspro-server-14.1.2-0.x86_64
[root@cvmaster2 pbspro-14.1.2-0]#

Is there a location to download older Community versions of PBS? I only see the latest source. I would like to maintain the current setup with the database, etc.

I am not sure whether there is a repository of old compiled RPMs for CentOS 6.
You would need to build the RPMs from source.

Could you please let us know which information/configuration you would like to retain from the current setup before migrating to PBS Pro OSS?

Please check this link:

It would be nice to retain the job history and queues already set up. We are running a rather old version of PBSPro, PBSPro_13.0.1.

Are the sources around for that? And do I need to reinstall the nodes, or just the server? Thanks.

Please note: if you would like to move to PBS Pro OSS, you have to update the entire complex (server and compute nodes) to PBS Pro OSS. You cannot mix and match PBS Pro commercial and PBS Pro OSS, and you cannot mix and match PBS Pro versions.

There are a lot of database updates (structure and schema) that have gone into PBS Pro 19.x since PBS Pro 13.0.1.

Please take a pbs_diag of your setup on PBS Pro 13.0.1, which saves all the configuration.
Using this data you can re-create your queues, server configuration and scheduler configuration within a minute on the new PBS Pro server (it would not contain job history and accounting data).

  1. You can test this on a test VM running CentOS 6: download the source, compile the RPMs and deploy PBS Pro OSS
  2. Take a database dump of PBS Pro 13.0.1 (your existing setup)
  3. Import this dump on the test VM installation and check whether it works

If you are looking for the sources, they can be found here (if I did not understand your query, please let me know)

Yes, both server and compute nodes

FYI: with PBS Pro commercial, you can do an overlay upgrade to reach the latest version, 19.2.2, without losing anything; the installer takes care of everything.

I believe I tried going to the latest version of PBSPro on RHEL 6 and ran into problems with the correct version of the database. We are licensed for some of our machines, but plan to use the community version on future clusters.

So are you saying I can download the latest paid version and install it, and it will take care of the upgrade for me? And then install the Community version? Thanks.

Please note: if you do not need the job history and job id sequence, your migration is quite straightforward with pbs_diag ( source /etc/pbs.conf ; $PBS_EXEC/unsupported/pbs_diag -f ). Its output is a tar.gz file saved in the /root folder.

[ You can back up the old $PBS_HOME for your reference ]

  • uninstall Pro version
  • install OSS version
  • migrate the configuration from pbs_diag output
  • job id’s will start from 0
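
For the configuration part of that list, a minimal sketch of saving and replaying the qmgr settings is below. It assumes qmgr is on the PATH of a running server. The function names and the /root/pbs_backup directory are made up for illustration, and pbs_diag captures much more than these two files, so treat this as a supplement rather than a replacement.

```shell
# Sketch only: save the qmgr configuration on the old server and replay it
# on the new one. Function names and paths are illustrative, not part of PBS.

save_pbs_config() {
    out="$1"
    mkdir -p "$out" || return 1
    # Server attributes and queue definitions (long form of: qmgr -c "p s")
    qmgr -c "print server" > "$out/server.qmgr" || return 1
    # Definitions of all nodes known to the server
    qmgr -c "print node @default" > "$out/nodes.qmgr" || return 1
}

restore_pbs_config() {
    in="$1"
    # qmgr reads directives from standard input. Queues are created first
    # because node definitions can refer to them.
    qmgr < "$in/server.qmgr" && qmgr < "$in/nodes.qmgr"
}

# Old server (before uninstalling):  save_pbs_config /root/pbs_backup
# New server (after installing):     restore_pbs_config /root/pbs_backup
```

It is worth reading both files before replaying them, since attributes written by a 13.x server may not all be accepted by a newer one.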

You can do this, but to maintain your job history, job sequence and configuration, you would still need to:

  1. take a backup of $PBS_HOME and a pbs_diag after upgrading to 19.x
  2. take a pg_dump of its datastore
  3. uninstall the commercial version
  4. install OSS
  5. import the data dump
  6. take it from there
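
The datastore dump step could look roughly like the sketch below. Everything PBS-specific in it is an assumption to verify on your own system: that the data service listens on port 15007, that the database is called pbs_datastore, and that the database user is pbsdata (the user is normally recorded under $PBS_HOME/server_priv). PBS_DB_NAME and PBS_DB_USER are made-up override variables, not PBS settings, and the function name is illustrative.

```shell
# Sketch only: dump the PBS datastore with pg_dump.
# ASSUMPTIONS: port 15007, database "pbs_datastore", user "pbsdata".
# PBS_DB_NAME and PBS_DB_USER are hypothetical overrides for this sketch.

dump_pbs_datastore() {
    out="$1"
    port="${PBS_DATA_SERVICE_PORT:-15007}"   # assumed default port
    db="${PBS_DB_NAME:-pbs_datastore}"       # assumed database name
    user="${PBS_DB_USER:-pbsdata}"           # assumed database user
    # Plain SQL dump written to the file given as the first argument
    pg_dump -h localhost -p "$port" -U "$user" -f "$out" "$db"
}

# Example (data service running, pg_dump on the PATH):
#   . /etc/pbs.conf
#   dump_pbs_datastore /root/pbs_backup/pbs_datastore.sql
```

The data service has to be up while the dump runs, and pg_dump may prompt for the database user's password.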

OK, will the open source CentOS 7 build run on RHEL 6? I notice the commercial version of PBS has both a RHEL 6 binary and a RHEL 7 binary.

No, it would not run, because of changes in the operating system libraries.

That is correct. For PBS Pro OSS to support a given flavour of operating system, it has to be compiled from source on that system.
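
The failed-dependency output at the top of the thread shows this concretely: the RPM asks for GLIBC_2.14, while RHEL 6 ships glibc 2.12. A small sketch for checking a host against that requirement follows. The version_ge helper is made up for illustration and relies on sort -V being available.

```shell
# Sketch only: compare the glibc a package asks for with what this host has.

version_ge() {
    # True when $1 >= $2 under version ordering (needs sort -V)
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="2.14"   # taken from the "GLIBC_2.14" line in the rpm error
installed="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"

if [ -z "$installed" ]; then
    echo "could not determine the glibc version on this host"
elif version_ge "$installed" "$required"; then
    echo "glibc $installed satisfies GLIBC_$required"
else
    echo "glibc $installed is older than GLIBC_$required: build from source on this host"
fi
```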

OK, hopefully this is my last question. If I want to remove the existing commercial version, does that mean simply removing the RPM on both the server and the nodes? I am usually the only one interested in job history, though there are four jobs currently queued up that won’t run. The users could probably just resubmit those jobs. I ran pbs_diag -f, and I also ran qmgr -c "p s".

Yes, that’s correct

Please follow the steps below on the SERVER:

  1. Take a pbs_diag output while PBS Pro 13.x is running
  2. Stop the PBS services, then take a backup of $PBS_HOME
  3. rpm -qa | grep pbs | xargs rpm -e # to uninstall/remove the PBS RPMs
    [updated]: Manually remove the files and folders below
  • $PBS_HOME
  • $PBS_EXEC
  • /etc/pbs.conf
  • /etc/init.d/pbs
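
Since the last step deletes $PBS_HOME for good, the backup in step 2 deserves a check. A minimal sketch of a backup helper is below; the backup_dir name and the /root/pbs_backup target are made up for illustration.

```shell
# Sketch only: archive a directory into a timestamped tarball and print the
# archive path. The function name and target directory are illustrative.

backup_dir() {
    src="$1"; dest="$2"
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    archive="$dest/$(basename "$src")-$(date +%Y%m%d%H%M%S).tar.gz"
    # -C keeps the paths inside the archive relative to the parent directory
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" &&
        echo "$archive"
}

# Example, run as root on the server:
#   . /etc/pbs.conf
#   /etc/init.d/pbs stop
#   backup_dir "$PBS_HOME" /root/pbs_backup
#   cp -p /etc/pbs.conf /root/pbs_backup/
```

Listing the archive with tar -tzf before removing anything confirms that server_priv and the accounting logs made it in.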

Please follow the steps below on the compute nodes:

  1. Stop the PBS services
  2. Take a backup of $PBS_HOME
  3. Take a backup of /etc/pbs.conf
  4. rpm -qa | grep pbs | xargs rpm -e # to uninstall/remove the PBS RPMs

You can get the details about the jobs (in the history) from the accounting logs ($PBS_HOME/server_priv/accounting). The old accounting logs will still be available if you back up $PBS_HOME.
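
As a rough illustration of reading those logs: each accounting record is a line of the form "date time;type;id;key=value ...", and type E marks the end of a job. The sketch below lists finished jobs from one or more log files; the finished_jobs name is made up for illustration.

```shell
# Sketch only: list finished jobs from PBS accounting logs.
# Records look like  date time;type;id;key=value ...  and "E" marks job end.

finished_jobs() {
    # Print the timestamp and job id of every job-end record in the files
    awk -F';' '$2 == "E" { print $1, $3 }' "$@"
}

# Example against a backed-up $PBS_HOME (the path is illustrative):
#   finished_jobs /root/pbs_backup/PBS/server_priv/accounting/*
```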

OK, I built from source, but I did not get what I expected. When I specified a directory other than /opt/pbs, I didn’t get a bin directory.

[root@cvmaster2 pbspro-master]# /etc/init.d/pbs start
/sbin/chkconfig
Starting PBS
PBS Home directory /var/spool/PBS needs updating.
Running /opt/pbs/libexec/pbs_habitat to update it.


Data service directory from previous PBS installation not found,
Datastore upgrade cannot continue
Failed to upgrade PBS Datastore
[root@cvmaster2 pbspro-master]#

I also installed PostgreSQL:
[root@cvmaster2 pbspro-master]# initdb -V
initdb (PostgreSQL) 9.3.23
[root@cvmaster2 pbspro-master]#

Please let me know whether you followed the installation steps for your flavour of operating system from this link: https://github.com/PBSPro/pbspro/blob/master/INSTALL

I went back and reviewed everything. Our repos were not working, so I had to download the required RPMs. I missed one, but it looks like Postgres is not starting. This is on a Rocks system that runs MySQL.

[root@cvmaster2 pbspro-master]# /etc/init.d/pbs start
/sbin/chkconfig
Starting PBS
PBS Home directory /var/spool/PBS needs updating.
Running /opt/pbs/libexec/pbs_habitat to update it.


Data service directory from previous PBS installation not found,
Datastore upgrade cannot continue
Failed to upgrade PBS Datastore
[root@cvmaster2 pbspro-master]#

I have preserved everything from the previous install, so at this point, I would just like to get pbs running again.

I removed everything from /var/spool that had to do with PBS and now I get:

Connecting to PBS dataservice…connected to PBS dataservice@cvmaster2.nrlmry.navy.mil
Server@cvmaster2: Server@cvmaster2, Failed to initialize PBS dataservice:[Prepare of statement insert_job failed: ERROR: relation “pbs.job” does not exist
LINE 1: insert into pbs.job (ji_jobid,ji_state,ji_substate,ji_svrfla…
^]
*** Error starting pbs server

Some fresh eyes looked at the problem. The version of PostgreSQL in the repo doesn’t work. We needed to download a newer version of Postgres, so that needs to be noted. Now my question is: what needs to be installed on the submission nodes and execution nodes?

Please follow the steps below. Thank you for pointing out that the INSTALL notes should be updated.
Members of the community team will take care of it.

Deployment 1:

  1. Copy the following from the PBS server node to the same locations on the compute nodes, preserving permissions:
  • $PBS_HOME
  • $PBS_EXEC
  • /etc/pbs.conf
  • /etc/init.d/pbs

  2. Edit /etc/pbs.conf
    PBS_SERVER=server-name-found-in-the-pbs.conf-of-server
    PBS_START_SERVER=0
    PBS_START_SCHED=0
    PBS_START_COMM=0
    PBS_START_MOM=1
    PBS_EXEC=/opt/pbs
    PBS_HOME=/var/spool/pbs
    PBS_CORE_LIMIT=unlimited
    PBS_SCP=/usr/bin/scp
    PBS_RCP=/bin/false

  3. Start/restart the PBS services
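
Because Deployment 1 copies the server's own /etc/pbs.conf, the PBS_START_* values have to be flipped on every compute node. A minimal sketch of doing that with a script instead of by hand is below; set_conf and make_mom_only are made-up helper names, not PBS tools.

```shell
# Sketch only: turn a pbs.conf copied from the server into a compute node
# (MoM only) configuration. Helper names are illustrative, not part of PBS.

set_conf() {
    key="$1"; value="$2"; file="$3"
    if grep -q "^${key}=" "$file"; then
        # Rewrite the existing line via a temp file, then copy the content
        # back so the original file keeps its owner and permissions.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" && rm -f "$file.tmp"
    else
        # Key not present yet: append it
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

make_mom_only() {
    conf="$1"
    set_conf PBS_START_SERVER 0 "$conf" &&
    set_conf PBS_START_SCHED 0 "$conf" &&
    set_conf PBS_START_COMM 0 "$conf" &&
    set_conf PBS_START_MOM 1 "$conf"
}

# Example, run as root on a compute node after copying the files:
#   make_mom_only /etc/pbs.conf
#   /etc/init.d/pbs restart
```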

Deployment 2:

  1. follow the same procedure as you did on the server

  2. update the /etc/pbs.conf as below
    PBS_SERVER=server-name-found-in-the-pbs.conf-of-server
    PBS_START_SERVER=0
    PBS_START_SCHED=0
    PBS_START_COMM=0
    PBS_START_MOM=1
    PBS_EXEC=/opt/pbs
    PBS_HOME=/var/spool/pbs
    PBS_CORE_LIMIT=unlimited
    PBS_SCP=/usr/bin/scp
    PBS_RCP=/bin/false

  3. Edit $PBS_HOME/mom_priv/config and update or add the line below so that it points to the server hostname
    $clienthost <server hostname>

  4. start / restart the services
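
For step 3 of Deployment 2, a small sketch of adding the $clienthost line only when it is missing is below. The ensure_clienthost name is made up for illustration, and cvmaster2 in the usage line is simply the server name from this thread.

```shell
# Sketch only: make sure the MoM config names the server as a client host.
# The function name is illustrative, not part of PBS.

ensure_clienthost() {
    server="$1"; config="$2"
    line="\$clienthost $server"
    # Append the line only if an identical one is not already in the file
    grep -qxF "$line" "$config" 2>/dev/null ||
        printf '%s\n' "$line" >> "$config"
}

# Example, run as root on a compute node:
#   . /etc/pbs.conf
#   ensure_clienthost cvmaster2 "$PBS_HOME/mom_priv/config"
```

The helper never edits an existing $clienthost line that names a different host, so a stale entry would still need to be removed by hand.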

The jobs are sort of running. We have a machine called cvmaster2; it is our head node and our PBS server. When we finally got PBS working again, it set MOM=1. I don’t want jobs running on that node.

This magically appeared. I had to manually add all the other compute nodes using

qmgr -c "create node <compute-node>"

I set MOM=0 on cvmaster2, but jobs are still being sent there. I would like to make sure jobs don’t get sent to cvmaster2. I did restart the services on it after making the change.
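
One possible approach, sketched under the assumption that the server still has a node entry for cvmaster2 (the one that appeared on its own): mark that node offline and then delete its entry, using pbsnodes and qmgr on the server. The function name is made up for illustration, and the delete may be refused while jobs are still running on the node.

```shell
# Sketch only: keep jobs off the head node. The function name is illustrative.

exclude_from_scheduling() {
    node="$1"
    # Mark the node offline right away so no new jobs are placed on it ...
    pbsnodes -o "$node" &&
    # ... then remove its node entry from the server.
    qmgr -c "delete node $node"
}

# Example, run on the server:
#   exclude_from_scheduling cvmaster2
# Afterwards "pbsnodes -a" should no longer list the node. Also check that
# /etc/pbs.conf on cvmaster2 has PBS_START_MOM=0 and that no pbs_mom process
# is still running there.
```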