Support for the cpu cgroup controller and zero-CPU jobs

Some sites have needed to use the “cpu” cgroup controller, either in addition to or instead of the cpuset controller.

That includes some customers who have historically placed extra “zero-CPU” jobs (mainly I/O- and memory-bound rather than CPU-intensive) onto nodes, and Cray customers whose cray_login nodes support many more jobs than there are CPU threads. The latter cannot use the cpuset controller, but without any CPU control some “rogue” jobs may start number-crunching on a login node without being penalized for it, which puts stress on the login nodes.

I have written a short design document describing how the cgroup hook was extended at those sites. The intent is to merge those changes into the master cgroup hook, so that these sites no longer have to maintain the changes themselves and re-apply them to newer versions of the hook, and so that other sites can use the features as well.

The document is here: