Two new hook events: EXECJOB_POSTSUSPEND and EXECJOB_PRERESUME

This is a proposal to add two new hook events, execjob_postsuspend and execjob_preresume: https://pbspro.atlassian.net/wiki/spaces/PD/pages/1067843593/Suspend+Resume+hooks

Thanks for writing up a design on this. I have a few questions about the design.

I have the same questions for both events:

  • Will the primary mom finish this event BEFORE the sister moms will start the same event?

  • Can you explain in more detail what this means: “The order of which mom calls the event is unspecified.”

The answer to the second question should also answer the first; maybe it’s not worded clearly. It’s not necessary for the mother superior to finish her events first. That can be left up to the implementation.

Hi Vincent,
I have a question:
Under the “execjob_preresume - Additional details” section, shouldn’t the text be “All moms must accept the event before the job can be resumed” instead of “suspended”?

You’re right, I’ve changed it.

I’ve updated the sentence to say: “The MS does not need to complete the event before the sisters can start. This will be left up to implementation.”
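To make the semantics discussed so far concrete, here is a minimal sketch in plain Python. It is only a toy model, not PBS code: the function name, the mom names, and the idea of a hook as a callable returning True (accept) or False (reject) are all hypothetical. It models three points from the thread: every mom runs the event, the order in which moms run it is unspecified, and the job only proceeds if all moms accept, with an exception counted as a rejection.

```python
import random
from typing import Callable, List, Optional

# Hypothetical model: a hook is a callable that takes a mom name and
# returns True to accept the event or False to reject it.
Hook = Callable[[str], bool]


def all_moms_accept(moms: List[str], hook: Hook, seed: Optional[int] = None) -> bool:
    """Run the hook on every mom in an unspecified order.

    Returns True only if every mom accepts the event.
    """
    order = list(moms)
    # The order is unspecified, so the mother superior is not
    # guaranteed to run (or finish) before the sisters.
    random.Random(seed).shuffle(order)

    accepted = True
    for mom in order:
        try:
            ok = hook(mom)
        except Exception:
            # Assumption taken from the thread: an exception in the
            # hook is treated the same as a rejection.
            ok = False
        accepted = accepted and ok
    return accepted
```

For example, `all_moms_accept(["ms", "sister1", "sister2"], lambda mom: mom != "sister2")` returns False, because a single rejecting sister is enough to block the action regardless of the order in which the moms ran.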

I’ve updated the design with some new details.

Hey @vstumpf
Here are some minor comments:

  • In post_suspend, you point out that the job is not resumed if the hook is rejected, gets an exception, or fails. I think this is overkill. In the bullet where you talk about the hook getting an exception, you already point at it being rejected. That is enough.
  • What does it mean if the hook fails? Won’t that mean it gets an exception? You already covered that, this bullet can be removed.
  • You have a bullet for mother superior and then another for the sister moms. Since all moms run the hook, make this one bullet point saying all moms run the hook. Talk about the order of them running here, not further down. (both hooks)
  • Drop visibility and change control. They are no longer needed.

You might consider using the new design guidelines. They have different sections, including one for a more detailed internal design.

Bhroam

I’ve updated the design with the new format. I’ve also addressed @bhroam’s comments.

Having these events will allow the cgroups hook to recognize when a job has been suspended and resumed. It would be a good time to explore the freezer subsystem in cgroups. I suggest you add a note to this effect in the design. Also, please try to conform to the new design doc guidelines here: https://pbspro.atlassian.net/wiki/spaces/DG/pages/293076995/PBS+Pro+Design+Document+Guidelines

Could you expand on how this proposed change will be valuable to things like the cgroups hook, and possibly to other hooks that might use these generic events?

Will your changes include updates to the cgroup hook? If not, is there an open ticket describing the remaining work?

I’ve expanded the explanation of how cgroups can consume this new interface. However, the specifics of what it will do with the interface are out of scope for this document.

I feel like I have conformed to the design document guidelines. I have a title, overview, and technical details. I haven’t introduced or used any new complicated terms, so a glossary is unnecessary. In the technical details I list the interfaces and describe with (what I believe is) sufficient detail the changes I’m proposing. I don’t think I should be adding examples or instructions on how to use this interface, as using hooks is well-documented.

When this design gets implemented, it will probably not contain any changes to the cgroup hook. There is a ticket filed for suspend/resume interoperability.

Thanks for the design @vstumpf. I want to state here that while the design work is being done so that we are prepared to implement these new hook events if necessary, their actual implementation is not currently prioritized highly. We think we may need the events for a high priority project, but we are not sure, and we want to be ready in case we do (https://pbspro.atlassian.net/wiki/spaces/PD/pages/1086029883/PBS+Design+Changes+for+Shasta).

We have in the past discussed some options to support multi-level suspend and resume while using the cgroups hook (with cpuset active) by utilizing an EXECJOB_PRERESUME hook event. However, our current direction on that front is to enable creation of job cpuset cgroups that share cores at the socket boundaries, rather than boxing off individual cores, and to use the cpu cgroup to limit jobs on the same sockets. No new hook event is required for that.

The possibility of using the hook events discussed in this thread to change HOW we suspend/resume jobs that use cgroups (freezer cgroup vs. SIGSTOP) is interesting, but it is not a current motivating factor for implementing (or not) these particular events. That is, of course, still a valid possible use case of them.
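As a rough illustration of the freezer idea, here is a minimal Python sketch. It relies only on the cgroup v1 freezer interface, where writing FROZEN or THAWED to a cgroup's freezer.state file freezes or thaws the tasks in that cgroup. The function name is hypothetical, and how a hook would locate the job's freezer cgroup directory is an assumption left to the caller; this is not what the cgroups hook currently does.

```python
from pathlib import Path


def set_freezer_state(cgroup_dir: Path, frozen: bool) -> str:
    """Write FROZEN or THAWED to the freezer.state file in cgroup_dir.

    cgroup_dir is assumed to be the job's cgroup v1 freezer directory;
    finding that directory is outside the scope of this sketch.
    """
    state = "FROZEN" if frozen else "THAWED"
    (cgroup_dir / "freezer.state").write_text(state)
    return state
```

In this picture, a post-suspend hook could call `set_freezer_state(job_dir, True)` and a pre-resume hook could call `set_freezer_state(job_dir, False)`, where `job_dir` is a placeholder for the job's freezer cgroup path.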

The design looks good to me.

@vstumpf, The updated design document looks good to me. I sign off.