The ‘requirements()’ decorator in PTL specifies the PBS cluster configuration necessary to run a PTL test; during a test run, this data is evaluated against the cluster data passed in the custom parameters of pbs_benchpress. The current format of the PBS cluster specification in the decorator gives only the PBS daemon counts and a certain set of flags. This leaves the actual number of nodes needed by the test, and the node names, ambiguous. Hence, this design reworks the cluster data specification so that the PBS cluster needed to run a test is described unambiguously.
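For context, a hedged sketch of the current count-and-flag style of specification (the keyword names here are illustrative, not a definitive listing of the decorator's arguments):

    # Illustrative only: a count/flag based requirements specification.
    # It asks for 2 MoMs and no MoM on the server host, but it does not say
    # which hosts those MoMs land on, so the real node count stays ambiguous.
    @requirements(num_moms=2, no_mom_on_server=True)
    def test_multi_node_job(self):
        ...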
I just have some comments on the idea of requirements, and what I would like to see in some iteration of PTL.
A test’s requirements should be the exact setup the test needs to run. It could be 1 server, 1 mom, 1 sched; maybe 3 moms; 1 server and 1 remote mom; etc.
We should be able to give benchpress a list of hosts where servers, moms, and scheds can reside, and have PTL determine, for each test, whether the test can fit on the given setup and, if it can, provide the exact setup for the test.
This solves the problem where running smoke tests with 2 moms given to benchpress fails most of the tests, as they require only one mom.
For example, I give benchpress 1 server, 3 moms, and 1 scheduler:
I run testA, which requires 1 server, 1 mom, 1 scheduler.
PTL should be able to recognize that testA needs 1 server, 1 mom, 1 scheduler, and that the given cluster has enough to satisfy these requirements. It should then create a setup for the test that is 1 server, 1 mom, 1 scheduler.
I run testB, which requires 1 server, 3 moms, 1 scheduler.
PTL sees the requirements, the cluster can satisfy the requirements, and creates a setup exactly as the requirements asked.
I run testC, which requires 1 server, 5 moms, 1 scheduler.
PTL sees that the cluster can’t satisfy the requirements (5 moms > 3 moms) and skips the test.
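A minimal sketch of this matching behavior (hypothetical names, not actual PTL code; it simply compares required daemon counts against what the cluster offers):

    # Hypothetical sketch of the matching logic described above; not actual PTL code.
    def pick_setup(required, available):
        """Return the exact per-role setup for a test, or None if the cluster is too small.

        'required' and 'available' are dicts like {'server': 1, 'mom': 3, 'sched': 1}.
        """
        for role, count in required.items():
            if count > available.get(role, 0):
                return None               # e.g. testC: 5 moms needed > 3 moms given -> skip
        return dict(required)             # use exactly what the test asked for

    cluster = {'server': 1, 'mom': 3, 'sched': 1}
    print(pick_setup({'server': 1, 'mom': 1, 'sched': 1}, cluster))  # testA: runs on a 1/1/1 subset
    print(pick_setup({'server': 1, 'mom': 5, 'sched': 1}, cluster))  # testC: None -> skipped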
@saritakh can we have the same interface as pbs_benchpress for the @requirements decorator as well? I feel that will simplify the list.
instead of “host1=, host2=”
use “server=(host1),sched=(host1),mom=(host1,host2),comm=(host1)”
where server and mom are mandatory, and sched and comm default to the server host if not specified. Since our default config has no mom on the server, we need to make sure they are not given the same host name, except when using --run-on-single-node.
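One way this grouping could look in a test, as a hedged sketch (the keyword-argument form and the hostnames are my assumption, mirroring the string above):

    # Illustrative only: the pbs_benchpress-style grouping applied to the decorator;
    # sched and comm would default to the server host when not specified.
    @requirements(server=('host1',), sched=('host1',),
                  mom=('host1', 'host2'), comm=('host1',))
    def test_two_mom_job(self):
        ...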
If we follow this, it would also be helpful to let the test runner report the minimum configuration required to run all the attempted test cases. For instance, if the user is trying to run SmokeTest, what is the minimum number of each daemon required to run all the tests without skipping any of them? Is this part of Interface 4?
It would also help to let the test runner report which tests will be skipped for a given configuration, without actually executing all the tests.
I like the idea of running on a remote mom. I don’t think --run-on-single-node is necessary though. To run on a remote mom, you have to provide it via -p moms. How about this: if -p moms isn’t given, run on the local node like it does now; if -p moms=mom1 is given, run on the remote mom.
Do we really need -p servers, moms, comms, scheds, nomom anymore with the new syntax of the requirements decorator? Why don’t we remove those and add one simple argument called --hosts, which is the list of hosts to be used in tests, and let PTL configure the required PBS components (i.e. mom, comm, or both, etc.) on a given host during setUpClass/setUp of the test?
Syntax of --hosts can be host[:port][@<path to pbs conf>][,host[:port][@<path to pbs conf>]]...
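For illustration, an invocation using that syntax might look like this (hostnames, port, and conf path are made up):

    pbs_benchpress -t SmokeTest --hosts=hostA,hostB:15001@/etc/pbs.conf.hostB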
Following up on @hirenvadalia’s comment,
I agree with this simple implementation. The PTL framework will be able to figure out which type of installation is present on each host, and the end user simply needs to know what kind of node cluster to provide as input.
In the current implementation of the requirements decorator, a lot of test cases get skipped during execution. As an end user I want all my test cases to run, so I propose an option in pbs_benchpress (something similar to --gen-ts-tree) like “--get-required-nodes”. This option would return a “one size fits all” cluster of nodes capable of running most or all of the test cases given as input.
For example, pbs_benchpress -t TestCgroupsHook,SmokeTest --get-required-nodes
would return something like "h1=server,h2=server,h3=mom,h4=client"
The user can then simply set up the cluster in the same manner and pass the info as hosts: pbs_benchpress -t TestCgroupsHook,SmokeTest --hosts=h1,h2,h3,h4
In my opinion this eliminates the issue of skipping tests and is relatively simple.
If the user does not pass the recommended cluster, it would run all the test cases it can and skip the others. We can always compare the current cluster with the recommended cluster and make adjustments accordingly.
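A minimal sketch (hypothetical, not actual PTL code) of how such a "one size fits all" recommendation could be computed, by taking the per-role maximum across the selected tests:

    # Hypothetical sketch: aggregate per-test requirements into one recommended cluster
    # by taking the maximum count needed for each role across all selected tests.
    def recommended_cluster(per_test_requirements):
        combined = {}
        for req in per_test_requirements:
            for role, count in req.items():
                combined[role] = max(combined.get(role, 0), count)
        return combined

    tests = [
        {'server': 1, 'mom': 1, 'client': 1},   # e.g. a SmokeTest-style case
        {'server': 1, 'mom': 3},                # e.g. a multi-mom cgroups case
    ]
    print(recommended_cluster(tests))           # {'server': 1, 'mom': 3, 'client': 1}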
@sandisamp I would name the option --get-required-hosts (or --get-required-machines), since "node" is more of a PBS object, indicating a MoM.
(Although this might be another RFE/project,) I would extend --get-required-hosts so that if -t or --tags is not given, it prints a JSON dict of all tests and their required cluster host info, which can be very useful to automated tools.
The best example of such an automated tool is a CI tool: currently it is really hard to determine the required number of hosts and the type of setup needed on each host in the cluster. With --hosts and --get-required-hosts it becomes really simple.
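For illustration, the JSON printed when neither -t nor --tags is given might look something like this (the shape, field names, and test names are placeholders of mine, not a finalized format):

    {
        "SmokeTest.test_submit_job": {"server": 1, "mom": 1, "client": 1},
        "TestCgroupsHook.test_cgroup_cpuset": {"server": 1, "mom": 2}
    }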
Yes @vstumpf, that is exactly the plan of this design. Tests requiring no more hosts than the number and type of hosts given to pbs_benchpress will run; all the rest will be skipped.
@anamika
The pbs_benchpress input is focused on the list of hostnames and their installation types; it indicates the number of nodes available to run the tests.
The test requirements information is focused on the daemons necessary on each of the nodes the test needs in order to exercise the feature; it indicates the number of nodes necessary to run the test.
Having the same interface for both would cause the same node name to be mentioned multiple times across the daemon lists in the -p input of pbs_benchpress. Instead, I feel it is better to mention hosts and their installation types, or to detect the installation types of the hostnames passed. PTL is internally aware of which daemons are available with each installation type and can therefore switch daemons on or off based on the test’s requirements.
@nithinj
Yes, Interface 4 is for exactly that; as suggested by Hiren and Sanidhya, I have named it --get-required-hosts.
Regarding the list of tests that will be skipped due to insufficient hosts: I will have to look into the internal design for this.
@bhroam
I removed the option --run-on-single-node and set this behavior for the two cases below:
When no hostname is specified in --hosts in pbs_benchpress
When a single hostname is specified in --hosts in pbs_benchpress
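For example (illustrative invocations):

    pbs_benchpress -t SmokeTest                      # no --hosts: run on the local node
    pbs_benchpress -t SmokeTest --hosts=hostA        # single host: run everything on hostA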
@visheshh
I initially thought of having rpm-type options in the pbs_benchpress input (server rpm, execution rpm and client rpm), but had added comms for ease of specification. Yes, to answer your question, we would have to mention a host multiple times in the previous format, i.e. if we bifurcate by daemons instead of installation rpm types.
I have now changed the interface to the common --hosts option, with PTL detecting the type of installation.
@hirenvadalia
I did think about --hosts, which was part of your initial suggestion on this RFE, but got diverted towards daemon specification because the challenge with a single list of hostnames is that the sequence of node types might vary in the user input. For example, the requirements may expect a comm on host3, but what if the third hostname specified in pbs_benchpress is a node with only an execution rpm installed?
This proposal works when all nodes have the server rpm installed, or when PTL detects the rpm type installed on each node and matches it to the requirements accordingly. I have updated the design as per the latter case.
Also, regarding the suggestion of making the required hosts available as JSON output: it can be looked into during the implementation of --get-required-hosts.
Thanks @saritakh for updating the design. A few minor comments:
Not sure, but should we also allow ranges in the @requirements() arguments? Something like @requirements(host1=SERVER, host_2_5=MOM, host6=COMM), where host_2_5 means host[2-5] (see the sketch after this list).
In Interface 2, we should explicitly mention that when a range is used and a port and/or config path is given, it applies to all hosts in the given range.
In Interface 3, we should also mention --tags along with -t.
In Interface 4, we should not hard-code usernames but use the users from pbs_testusers.py.
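A hedged sketch of what the range form suggested in the first point could look like (the naming and the daemon-type constants are illustrative, not a finalized interface):

    # Illustrative only: host_2_5 would expand to host2..host5, all running MoMs;
    # SERVER/MOM/COMM are placeholder constants for the daemon types.
    @requirements(host1=SERVER, host_2_5=MOM, host6=COMM)
    def test_wide_cluster(self):
        ...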
Thank you @hirenvadalia,
I completely agree on point 1, without which this design would have been incomplete. I have addressed all your comments. Please review and let me know if you have further comments.