PP-702: Tests for installation and upgrades on Cray X-series CLE 5.2 systems

This message is to inform the community that the tests covering RPM-based installations and upgrades on Cray CLE 5.2 systems are now available. You may review the document here:

https://pbspro.atlassian.net/wiki/display/PD/PP-702%3A+Tests+for+installation+and+upgrades+on+Cray+X-series+CLE+5.2+systems

Please provide comments in response to this post. Thank you!

Here are my comments @vccardenas:

In section 1.1.2 you use specific user and group IDs when creating the pbsdata user. You should add a note that these values are examples and that the tester should use values appropriate to their system. Otherwise, the tester could reference a nonexistent group ID or specify a conflicting user ID.
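
As a rough sketch of the kind of pre-check a tester could run before creating pbsdata (this is not part of the test document), the Python below uses the standard pwd and grp modules to see whether a candidate user ID is already taken and whether a candidate group ID exists. The function names are hypothetical and chosen only for illustration.

```python
import grp
import pwd


def uid_in_use(uid: int) -> bool:
    """Return True if some account already owns this user ID."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def gid_exists(gid: int) -> bool:
    """Return True if a group with this group ID is defined."""
    try:
        grp.getgrgid(gid)
        return True
    except KeyError:
        return False


def check_ids(uid: int, gid: int) -> list:
    """Collect the problems that would make these IDs unsuitable for a new user."""
    problems = []
    if uid_in_use(uid):
        problems.append(f"user ID {uid} is already in use")
    if not gid_exists(gid):
        problems.append(f"group ID {gid} does not exist")
    return problems
```

An empty list from check_ids means the pair is usable; anything else names the conflict the tester needs to resolve first.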

In section 1.2 you instruct the tester to add the $usecp line. The directory is incorrect; it should be PBS_HOME/mom_priv/config. Is this step optional? If so, should you mention that it is optional?

I don’t see the point of 1.6. What is it testing that hasn’t already been tested?

The “usecp” line in 1.2 differs from the one in 2.2.1.
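
A mismatch like this is easy to catch mechanically. The sketch below, under the assumption that both sections are meant to configure the same directive, pulls the $usecp lines out of two MoM config fragments and compares them after normalizing whitespace. The fragments and the usecp_lines helper are hypothetical stand-ins, not text from the test document.

```python
def usecp_lines(config_text: str) -> list:
    """Return the $usecp directives in a MoM config, whitespace-normalized."""
    lines = []
    for raw in config_text.splitlines():
        fields = raw.split()
        if fields and fields[0] == "$usecp":
            lines.append(" ".join(fields))
    return lines


# Hypothetical config fragments standing in for sections 1.2 and 2.2.1.
section_1_2 = "$clienthost server1\n$usecp *:/home/ /home/\n"
section_2_2_1 = "$usecp   *:/home/   /home/\n"

# The two fragments differ only in spacing, so they compare equal.
same = usecp_lines(section_1_2) == usecp_lines(section_2_2_1)
```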

In section 4.2 change the prompt from “$” and “crayadm@login~/>” to “login$”.

In section 4.4 the user should not have to be on another terminal. They can run apstat and qstat from the interactive job if they want.

In section 5.1.6 the command should be: xtopview -e "rpm -e pbspro-server"
This was recently updated in the Cray install instructions.

@mkaro, thanks for the comments.

section 1.1.2: I have added a note that the pbsdata user should be created with values appropriate to the system, to avoid a nonexistent or conflicting group ID or a conflicting user ID.

section 1.2: Corrected “PBS_HOME/mom/config” to “PBS_HOME/mom_priv/config” and in other sections too.
I have added “Optional” before those steps and marked 3.8 accordingly.

section 1.6: I wanted to test specifying PBS_DATA_SERVICE_USER when first starting PBS.

usecp in 1.2 differs from the one in 2.2.1: Fixed.

I have changed the prompt in sections 4.2, 4.3, 4.4 to “login$”.

section 4.4: I have added a note that apstat and qstat may also be run inside the interactive job.

section 5.1.6: corrected command to: xtopview -e "rpm -e pbspro-server"

Looks fine. You have my sign off.

Hi, I have a few comments:
In section 1.1.3.10 you should add the typical disclaimer that nid00030 is an example, and the tester should use the appropriate login node to host the mom.

In section 2.1 (and other places in section 2) please change 17.2.x to “PBS 17.x”. The same also applies to 17.2.y.

In sections 1.1.3, 2.1.2, and 2.2.2, change “it would be an error” to “it is an error”.

In section 2.1.2.3 - please explain how to drain the system of running jobs.

Section 2.1.3.2 seems to be a duplicate of section 3.1. Since section 2.1.3.1 already tells us to perform the checks in POST UPGRADE (section 3), we don’t need 2.1.3.2.

I don’t think the test needs most of the verbiage in section 2.2.2.

Since 2.2.3.1 already tells us to perform the check in POST UPGRADE we don’t need 2.2.3.2 (it’s covered in POST UPGRADE).

@lisa-altair,

section 1.1.3.10: I have added the disclaimer.

section 2.1 (and other places in section 2): I have changed “17.2.x” to “PBS Pro 17.x” and “17.2.y” to “PBS Pro 17.y”.
I also changed “13.0.40x” to “PBS Pro 13.0.40x”.

sections 1.1.3, 2.1.2, and 2.2.2: Changed “it would be an error” to “it is an error”.

section 2.1.2.3: Added steps on how to drain the system of running jobs.
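
The steps added to the document are not reproduced here. Purely as an illustration of what "draining" amounts to, the sketch below polls a job counter until it reaches zero or a timeout expires. It assumes scheduling has already been stopped so that no new jobs start. The count_running callable is hypothetical: in practice it might count the job IDs printed by something like `qselect -s R`, but that choice is an assumption, not something the test document specifies.

```python
import time


def wait_until_drained(count_running, poll_seconds=30.0, timeout_seconds=3600.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until count_running() returns 0; return False if the timeout expires."""
    deadline = clock() + timeout_seconds
    while True:
        if count_running() == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.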

Sections 2.1.3.2 and 2.2.3.2: we still need these because the POST UPGRADE section does not check for PBS_HOME. In 17.x, PBS_HOME is “/var/spool/pbs”. In 13.0.40x, PBS_HOME is “/var/spool/PBS”.
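
Because the two paths differ only in case, the comparison has to be case-sensitive. A minimal sketch of such a check follows, assuming PBS_HOME is read from KEY=value text in the style of pbs.conf. The two expected paths come from this post; the helper names and the sample input are hypothetical.

```python
# Expected PBS_HOME per release, as stated in this thread.
EXPECTED_PBS_HOME = {
    "17.x": "/var/spool/pbs",
    "13.0.40x": "/var/spool/PBS",
}


def read_pbs_home(conf_text):
    """Return the PBS_HOME value from KEY=value text, or None if it is absent."""
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "PBS_HOME":
            return value.strip()
    return None


def pbs_home_matches(conf_text, release):
    """Case-sensitive check that PBS_HOME is the path expected for the release."""
    return read_pbs_home(conf_text) == EXPECTED_PBS_HOME[release]
```

For example, a configuration still pointing at /var/spool/PBS fails the check for 17.x, which is the situation the extra post-upgrade step is meant to catch.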

Removed some verbiage in section 2.2.2.

Section 5.3: I added verbiage on the xtunspec commands.

Thank you @vccardenas. It looks good.

I sign off @vccardenas.

I have updated the tests with links to the steps in

https://pbspro.atlassian.net/wiki/display/PD/PP-702%3A+Installation+and+upgrades+on+Cray+X-series+CLE+5.2+systems?preview=/50658327/50725795/cray_install.txt

instead of listing the steps in the test document. I have also added a check for the status of the server and nodes in the POST UPGRADE section.