Proposed enhancements to mem/swap accounting in the cgroup hook

Some proposed changes to the cgroup hook, to address two issues:

- It’s possible for the scheduler to place jobs on a node where swap (i.e. resources_available.vmem - resources_available.mem) then gets depleted, which kills jobs. To avoid that, swap needs to be accounted for separately as a schedulable resource.

- Some sites want jobs without explicit memory limits to be allowed to use the entire host’s memory.
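To illustrate the first issue, here is a minimal sketch of how a node can pass both the mem and the vmem check while still being overcommitted on swap. The `Request` class, the `fits` function and the example sizes are hypothetical and only model the arithmetic; this is not the scheduler's or the hook's code.

```python
from dataclasses import dataclass

GB = 1024 ** 3


@dataclass
class Request:
    """Hypothetical model of a mem/vmem pair (node capacity or job request)."""
    mem: int   # physical memory, in bytes
    vmem: int  # physical memory plus swap, in bytes

    @property
    def swap(self) -> int:
        # Swap is the difference between vmem and mem, as described above.
        return self.vmem - self.mem


def fits(node: Request, jobs: list[Request], account_swap: bool) -> bool:
    """Return True if the jobs fit on the node under the chosen accounting."""
    ok = (sum(j.mem for j in jobs) <= node.mem
          and sum(j.vmem for j in jobs) <= node.vmem)
    if account_swap:
        # Extra check: treat swap as its own schedulable resource.
        ok = ok and sum(j.swap for j in jobs) <= node.swap
    return ok


# Assumed example: a node with 64 GB of memory and 8 GB of swap,
# and two jobs that may each use up to 8 GB of swap.
node = Request(mem=64 * GB, vmem=72 * GB)
jobs = [Request(mem=10 * GB, vmem=18 * GB),
        Request(mem=10 * GB, vmem=18 * GB)]

print(fits(node, jobs, account_swap=False))  # mem and vmem checks only
print(fits(node, jobs, account_swap=True))   # with separate swap accounting
```

With only mem and vmem checked, both jobs are accepted even though together they may use 16 GB of swap on a node that has 8 GB. Checking swap separately rejects the placement.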

Design document at:

https://openpbs.atlassian.net/wiki/spaces/PD/pages/2576613377/Improve+memory+swap+management+in+the+cgroup+hook

In the section “To enable cgswap management, a site has to:”, it would be good to add “enable the cgroups hook” to the top of the list.

  • What is exclhost_ignore_default set to by default?

Looks good otherwise. Thanks for proposing it.

What is the behavior if both enforce_default and exclhost_ignore_default are set to True?

"What is the behavior if both enforce_default and exclhost_ignore_default are set to True?"

That’s the combination you’d expect to see when you want the latter to take effect. Defaults are enforced for jobs that do not use exclhost placement, but not for jobs that use exclhost placement.

Setting enforce_default to False and exclhost_ignore_default to True is redundant: if defaults are not enforced for any jobs, there is no need to ignore them for a subset of jobs.
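The combinations described above can be written out as a small truth table. This is only a sketch of the described behaviour; the function name `defaults_enforced` and the `job_is_exclhost` argument are hypothetical and are not taken from the hook's source.

```python
from itertools import product


def defaults_enforced(enforce_default: bool,
                      exclhost_ignore_default: bool,
                      job_is_exclhost: bool) -> bool:
    """Return True if default limits would apply to the job (as described above)."""
    if not enforce_default:
        # Defaults are not enforced for any job, so the second flag is moot.
        return False
    if exclhost_ignore_default and job_is_exclhost:
        # Jobs with exclhost placement are exempt from the defaults.
        return False
    return True


# Print every combination of the two settings and the job's placement.
for enforce, ignore, exclhost in product([True, False], repeat=3):
    print(f"enforce_default={enforce!s:5} "
          f"exclhost_ignore_default={ignore!s:5} "
          f"exclhost job={exclhost!s:5} "
          f"-> defaults enforced: {defaults_enforced(enforce, ignore, exclhost)}")
```

The rows with enforce_default=False are identical regardless of exclhost_ignore_default, which is why that combination adds nothing.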

"What is exclhost_ignore_default set to by default?"

False, which means behaviour is unchanged by default.

Thank you for making the updates. Looks good.