In order to track server management operations, such as adding a new node, importing a hook script, etc., we are proposing to add a new management hook event to the PBS server. We have posted a preliminary design document on the wiki and a prototype implementation on Github. Please see https://pbspro.atlassian.net/wiki/spaces/PD/pages/1404239875/Server+Management+Hook+Event and https://github.com/ericpershey/pbspro/tree/management_hook_event for details.
Thank you for your contribution and for adhering to our documentation and development policies!
This is a sizeable review task, and will need to be assigned to development personnel. Please be patient as someone is assigned.
I looked at the design doc. Here are my suggestions:
-
Instead of pbs.event().management, how about just pbs.event.manage paramter which is a new pbs.manage type that can be instantiated as pbs.manage(). This is consistent with pbs.server(), pbs.queue(), pbs.job(), pbs.resv(), and so on.
-
What you then need is a new hook event type, say pbs.HOOK_EVENT_MANAGE that is called when server management (i.e. qmgr request) request is received and acted upon. Then pbs.event() will have the parameter pbs.event().manage that is a pbs.manage type.
-
Rather than having PBS.REVERSE_* constants, I recommend just create one Python function whose argument is an integer value that would translate and return the appropriate text string. For example, pbs.describe(pbs.manage().cmd) returns “set” or pbs.describe(pbs.manager(objtype) returns “node”.
-
The rq_reply_brp* prefix is too specific to the internal code. Just simply to a “reply” prefix as in “reply_text”, “reply_code”, “reply_auxcode”, “reply_code”.
-
Also, just simplify the rq_attrs attribute name to say attr or attrib or even attribute.
-
rq_time could just be “request_time” or maybe etime (time executed).
-
Can also remove the “al_” prefix to just say “name”, “value”, “resc”, “flags”, and “oper”.
-
The “al_sisters” could just be “sisters” and I’ll be curious what pbs python type would this be. Could it be a Python list of pbs.vnode type?
-
“TODO/FIXME: Keeping with hook design, if one management hook is rejected, the other management hooks with a higher order value will not run???” Answer: Yes
-
Might want to explicitly describe what happens when pbs.event().reject() is called. Besides terminating the hook script execution, any data stored with the server that could be affected?
Instead of pbs.event().management, how about just pbs.event.manage paramter which is a new pbs.manage type that can be instantiated as pbs.manage(). This is consistent with pbs.server(), pbs.queue(), pbs.job(), pbs.resv(), and so on.
What you then need is a new hook event type, say pbs.HOOK_EVENT_MANAGE that is called when server management (i.e. qmgr request) request is received and acted upon. Then pbs.event() will have the parameter pbs.event().manage that is a pbs.manage type.
In the C code, a rq_manage struct already existed so we named ours rq_management. To be consistent, Eric and I named the python object and class accordingly.
Rather than having PBS.REVERSE_* constants, I recommend just create one Python function whose argument is an integer value that would translate and return the appropriate text string. For example, pbs.describe(pbs.manage().cmd) returns “set” or pbs.describe(pbs.manager(objtype) returns “node”.
With a dictionary, both the keys and values are known, allowing one to iterate over either. This is convenient and not possible with a function.
The rq_reply_brp* prefix is too specific to the internal code. Just simply to a “reply” prefix as in “reply_text”, “reply_code”, “reply_auxcode”, “reply_code”.
Also, just simplify the rq_attrs attribute name to say attr or attrib or even attribute.
rq_time could just be “request_time” or maybe etime (time executed).
Can also remove the “al_” prefix to just say “name”, “value”, “resc”, “flags”, and “oper”.
Accepted.
The “al_sisters” could just be “sisters” and I’ll be curious what pbs python type would this be. Could it be a Python list of pbs.vnode type?
Accepted name change. sisters is currently a python list of pbs.server_attribute objects. Since al_sister in struct svrattrl is a pointer to a struct svrattrl, we used the corresponding pbs.server_attribute class to match. If pbs.vnode makes more sense, we need to know which fields to extract from al_sister and and assign to the pbs.vnode object.
“TODO/FIXME: Keeping with hook design, if one management hook is rejected, the other management hooks with a higher order value will not run???” Answer: Yes
Accepted.
Might want to explicitly describe what happens when pbs.event().reject() is called. Besides terminating the hook script execution, any data stored with the server that could be affected?
We will add tests for rejection. What do you mean by stored data? Our hook only emits data to the hook script and does not modify anything.