I have a version of PBSPro installed, but want to replace it with the community version of PBS. I am running RHEL 6 and would like to use the RPMs:
[root@cvmaster2 pbspro-14.1.2-0]# rpm -ivh pbspro-server-14.1.2-0.x86_64.rpm
error: Failed dependencies:
libc.so.6(GLIBC_2.14)(64bit) is needed by pbspro-server-14.1.2-0.x86_64
libhwloc.so.5()(64bit) is needed by pbspro-server-14.1.2-0.x86_64
libical.so.1()(64bit) is needed by pbspro-server-14.1.2-0.x86_64
libpython2.7.so.1.0()(64bit) is needed by pbspro-server-14.1.2-0.x86_64
postgresql-server is needed by pbspro-server-14.1.2-0.x86_64
pbs conflicts with pbspro-server-14.1.2-0.x86_64
[root@cvmaster2 pbspro-14.1.2-0]#
Is there a location to download the old Community versions of PBS? I only see the latest source. I would like to maintain the current setup with the database, etc.
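The failed-dependency list above can be turned into a checklist. RHEL 6 ships glibc 2.12 and Python 2.6, so the GLIBC_2.14 and libpython2.7 requirements suggest this RPM was built for a newer distribution. The sketch below is illustrative only: `missing_caps` and `check_caps` are hypothetical helper names, not part of rpm or PBS. It extracts the capability names from rpm's error text and asks the local RPM database which installed package, if any, provides each one.

```shell
#!/bin/sh
# Hypothetical helper: print the capability named on each
# "X is needed by Y" line of rpm's error output (read from stdin).
missing_caps() {
    awk '/ is needed by /{ print $1 }'
}

# Hypothetical helper: for each capability on stdin, report which
# installed package provides it, or flag it as unmet.
check_caps() {
    while IFS= read -r cap; do
        if provider=$(rpm -q --whatprovides "$cap" 2>/dev/null); then
            printf '%s: provided by %s\n' "$cap" "$provider"
        else
            printf '%s: NOT provided by any installed package\n' "$cap"
        fi
    done
}

# Example (run on the target host, where rpm is available):
#   rpm -ivh --test pbspro-server-14.1.2-0.x86_64.rpm 2>&1 | missing_caps | check_caps
```

The `--test` flag makes rpm check the transaction without installing anything, so the example is safe to repeat while resolving dependencies.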
Please note: if you would like to move to PBS Pro OSS, you have to update the entire complex (server and compute nodes) to PBS Pro OSS. You cannot mix and match PBS Pro commercial and PBS Pro OSS, and you cannot mix and match PBS Pro versions.
A lot of database updates (structure and schema) have gone into PBS Pro 19.x since PBS Pro 13.0.1.
Please take a pbs_diag of your setup on PBS Pro 13.0.1, which saves all of the configuration.
Using this data you can re-create your queues, server configuration, and scheduler configuration within a minute on the new PBS Pro server (it would not contain job history or accounting data).
You can test this on a test VM running CentOS 6: download the source, build the RPMs, and deploy PBS Pro OSS.
Take a database dump of PBS Pro 13.0.1 (your existing setup).
Import this dump into the test VM installation and check whether it works.
If you are looking for the sources, they can be found here (if I did not understand your query, please let me know).
Yes, both server and compute nodes
FYI: with PBS Pro commercial, you can do an overlay upgrade to reach the latest version, 19.2.2, without losing anything; the installer takes care of everything.
I believe I tried going to the latest version of PBSPro on RHEL 6 and ran into problems with the correct version of the database. We are licensed for some of our machines, but plan for future clusters to use the community version.
So are you saying I can download the latest paid version and install it, and it will take care of the upgrade for me? And then install the Community version? Thanks.
Please note: if you do not need the job history and job ID sequence, your migration is quite straightforward using the pbs_diag output ( source /etc/pbs.conf ; $PBS_EXEC/unsupported/pbs_diag -f ), which is a tar.gz file saved in the /root folder.
[ You can back up the old $PBS_HOME for your reference ]
Uninstall the Pro version
Install the OSS version
Migrate the configuration from the pbs_diag output
Job IDs will start from 0
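The configuration part of these steps can be sketched with qmgr alone, as a complement to pbs_diag. This is a minimal sketch under stated assumptions, not the procedure from this thread: `OUT_DIR`, `save`, and `export_config` are hypothetical names, and the script defaults to a dry run that only prints an approximation of each command (quoting is not reproduced in the printed form).

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}
OUT_DIR=${OUT_DIR:-/root/pbs_migration}   # assumed location, adjust as needed

# save FILE CMD...: run CMD and write its stdout to FILE
save() {
    file=$1; shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $* > $file"
    else
        "$@" > "$file"
    fi
}

# On the old (commercial) server, while PBS is still running:
export_config() {
    [ "$DRY_RUN" = 1 ] || mkdir -p "$OUT_DIR"
    save "$OUT_DIR/server.qmgr" qmgr -c "print server"
    save "$OUT_DIR/nodes.qmgr"  qmgr -c "print node @default"
}

# On the new (OSS) server, after it starts cleanly, the saved files
# can be reviewed, edited if needed, and replayed:
#   qmgr < "$OUT_DIR/server.qmgr"
#   qmgr < "$OUT_DIR/nodes.qmgr"
```

Calling `export_config` with `DRY_RUN=0` writes the two files. Reviewing them before replaying is worthwhile, since attributes from the older commercial release may not all apply to the OSS version.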
You can do this, but to maintain your job history, job sequence, and configuration you would still need to take a backup of $PBS_HOME and a pbs_diag after upgrading to 19.x.
OK, hopefully this is my last question. If I want to remove the existing commercial version, does that mean simply removing the RPM on both the server and the nodes? I am usually the only one interested in job history, though there are four jobs currently queued up that won't run. I ran pbs_diag -f. The users could probably just resubmit those jobs. I also ran qmgr -c "p s".
Take a pbs_diag output when the PBS Pro 13.x is running
Take a backup of $PBS_HOME, after stopping the PBS services
rpm -qa | grep pbs | xargs rpm -e # to uninstall/remove the PBS rpms
[updated]: Manually remove the files and folders below:
$PBS_HOME
$PBS_EXEC
/etc/pbs.conf
/etc/init.d/pbs
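The server-side removal steps above can be collected into one script. This is a sketch under stated assumptions: `run`, `remove_pbs`, and the `BACKUP` path are hypothetical, and the script defaults to a dry run because the final steps are destructive. It expects `PBS_HOME` and `PBS_EXEC` to be set already, for example by sourcing /etc/pbs.conf.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints each step instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

remove_pbs() {
    # PBS_HOME and PBS_EXEC normally come from: . /etc/pbs.conf
    # Refuse to continue if either is empty, so rm -rf never sees "".
    : "${PBS_HOME:?PBS_HOME is not set}" "${PBS_EXEC:?PBS_EXEC is not set}"
    backup=${BACKUP:-/root/pbs_home_backup.tar.gz}   # assumed backup path

    run "$PBS_EXEC/unsupported/pbs_diag" -f      # while PBS is still running
    run /etc/init.d/pbs stop
    run tar -czf "$backup" "$PBS_HOME"           # backup after services stop
    run sh -c 'rpm -qa | grep pbs | xargs rpm -e'
    run rm -rf "$PBS_HOME" "$PBS_EXEC"
    run rm -f /etc/pbs.conf /etc/init.d/pbs
}
```

Running `remove_pbs` as-is prints the plan. Setting `DRY_RUN=0` executes it, so it is worth confirming the backup path and the list of matched RPMs first.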
Please follow the steps below on one of the compute nodes:
Take a backup of $PBS_HOME after stopping the PBS services
Take a backup of /etc/pbs.conf
Stop the PBS Services
rpm -qa | grep pbs | xargs rpm -e # to uninstall/remove the PBS rpms
You can get the details about the jobs (in the history) from the accounting logs ($PBS_HOME/server_priv/accounting). If you need the old accounting logs, they will be present if you back up $PBS_HOME.
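As an illustration of reading those logs, the sketch below lists the IDs of jobs that have an end record. It assumes the usual PBS accounting layout of semicolon-separated fields (timestamp, record type, job ID, attributes), with `E` marking a job end; `ended_jobs` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list the IDs of jobs that have an end ("E") record
# in the accounting files under DIR.
# Assumes the record layout: timestamp;record-type;job-id;attributes
ended_jobs() {
    dir=$1
    cat "$dir"/* | awk -F';' '$2 == "E" { print $3 }' | sort -u
}

# Example: ended_jobs "$PBS_HOME/server_priv/accounting"
```

The same function works on a restored backup copy of the accounting directory, which is useful once the original $PBS_HOME has been removed.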
[root@cvmaster2 pbspro-master]# /etc/init.d/pbs start
/sbin/chkconfig
Starting PBS
PBS Home directory /var/spool/PBS needs updating.
Running /opt/pbs/libexec/pbs_habitat to update it.
Data service directory from previous PBS installation not found,
Datastore upgrade cannot continue
Failed to upgrade PBS Datastore
[root@cvmaster2 pbspro-master]#
I went back and reviewed everything. Our repos were not working, so I had to download the required RPMs. I missed one, but it looks like Postgres is not starting. This is on a Rocks system that runs MySQL.
I removed everything from /var/spool that had to do with PBS and now I get:
Connecting to PBS dataservice…connected to PBS dataservice@cvmaster2.nrlmry.navy.mil
Server@cvmaster2: Server@cvmaster2, Failed to initialize PBS dataservice:[Prepare of statement insert_job failed: ERROR: relation “pbs.job” does not exist
LINE 1: insert into pbs.job (ji_jobid,ji_state,ji_substate,ji_svrfla…
^]
*** Error starting pbs server
Some fresh eyes looked at the problem. The version of PostgreSQL in the repo doesn't work. We needed to download a newer version of Postgres, so that needs to be noted. Now my question is, what needs to be installed on the submission nodes and execution nodes?
The jobs are sort of running. We have a machine called cvmaster2; it is our head node and our server. When we finally got PBS working again, it set MOM=1. I don't want jobs running on that node.
This magically appeared. I had to manually add all other compute nodes using
qmgr -c "create node <compute-node>"
I set MOM=0 on cvmaster2, but jobs are still being sent there. I would like to make sure jobs don't get sent to cvmaster2. I did restart services on it after making the change.
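One thing worth checking, offered as a sketch and not as a confirmed fix from this thread: in /etc/pbs.conf the setting that controls whether the init script starts a MoM is `PBS_START_MOM`, and the server keeps its own node list independently of that file. A node that was created earlier can therefore stay schedulable until it is marked offline or deleted. `start_mom` below is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the value of PBS_START_MOM from a pbs.conf file.
start_mom() {
    conf=${1:-/etc/pbs.conf}
    sed -n 's/^PBS_START_MOM=//p' "$conf"
}

# If start_mom prints 0 but jobs still land on the head node, the node
# is probably still defined on the server. Either of these takes it
# out of scheduling (run as root on the server):
#   pbsnodes -o cvmaster2                 # mark the node offline
#   qmgr -c "delete node cvmaster2"       # remove the node definition
```

Marking the node offline is the more reversible of the two options, since the node definition is kept.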