Enabling the “memsw” functionality in the cgroup hook changes the semantics of “vmem”.
Without it, “vmem” is managed by MoM and denotes the sum of the address spaces used by all processes of the job on the mother superior node. MoM also derives a per-process RLIMIT_AS limit from it (since clearly a single process cannot use more address space than the sum for all processes).
But when the cgroup hook “memsw” functionality is enabled, “vmem” requests instead specify the sum of physical memory plus swap usage for the job, which is often much smaller than the address space the job's processes reserve.
On most current versions of PBSPro, however, the MoM does not know about this change in semantics for vmem, and it still sets an RLIMIT_AS limit for all processes in the job. Since many applications reserve far more address space than the memory they actually use, this can make applications fail even though they stay well within the memory+swap usage corresponding to their vmem requests.
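To make the failure mode concrete, here is a minimal Python sketch (Linux only, and not taken from PBS itself). It starts a child process with a capped RLIMIT_AS and has it map anonymous memory that it never touches. The names `can_reserve`, `cap_address_space` and `CHILD_SOURCE` are hypothetical, invented for this illustration.

```python
import resource
import subprocess
import sys

# Child program: reserve address space without ever touching the pages,
# so memory+swap usage stays tiny while the address space grows.
CHILD_SOURCE = """
import mmap, sys
try:
    mmap.mmap(-1, int(sys.argv[1]), flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
except (OSError, MemoryError, ValueError, OverflowError):
    sys.exit(1)
"""


def can_reserve(size_bytes, as_limit_bytes):
    """Return True if a child capped at as_limit_bytes can map size_bytes."""

    def cap_address_space():
        # Runs in the child before exec. Never try to exceed an inherited
        # hard limit, since that would make setrlimit fail.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            cap = as_limit_bytes
        else:
            cap = min(as_limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, cap))

    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE, str(size_bytes)],
        preexec_fn=cap_address_space,
    )
    return result.returncode == 0
```

On a typical Linux system I would expect `can_reserve(2 << 30, 1 << 30)` to return False even though the child never writes to a single page. That is the situation a job ends up in when its vmem request is sized for memory+swap but is also applied as RLIMIT_AS.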
I propose a flag for the cgroup hook that lets it mitigate what has in effect become a MoM bug. (It would not be a bug if it were possible to disable setting RLIMIT_AS from vmem, but there is as yet no ‘$enforce vmem’ functionality in the MoM config file; it is always ‘enforced’.)
Design document:
https://openpbs.atlassian.net/wiki/spaces/PD/pages/2668199965/Fix+incorrect+MoM+RLIMIT+AS+when+vmem+is+used+for+the+group+hook+s+memsw+functionality