PBS - memory resource (pbs_cgroups)

Dear all PBS users,

I searched the forum but did not find a topic covering the issue I am facing.
We are running PBS version 20.0.0 on machines running CentOS 7.9.2009 (Core).

Within qmgr I checked that the pbs_cgroups hook is turned on:

Qmgr: list hook
Hook pbs_cgroups
    type = site
    enabled = true
    event = execjob_begin,execjob_epilogue,execjob_end,execjob_launch,
        execjob_attach,
        execjob_resize,
        execjob_abort,
        exechost_periodic,
        exechost_startup
    user = pbsadmin
    alarm = 120
    freq = 10
    order = 100
    debug = false
    fail_action = offline_vnodes

and I also checked the associated configuration file (the default one):

Qmgr: export hook pbs_cgroups application/x-config default
{
    "cgroup_prefix" : "pbs_jobs",
    "exclude_hosts" : [],
    "exclude_vntypes" : ["no_cgroups"],
    "run_only_on_hosts" : [],
    "periodic_resc_update" : true,
    "vnode_per_numa_node" : false,
    "online_offlined_nodes" : true,
    "use_hyperthreads" : false,
    "ncpus_are_cores" : false,
    "discover_gpus" : true,
    "manage_rlimit_as" : true,
    "cgroup" : {
        "cpuacct" : {
            "enabled" : true,
            "exclude_hosts" : [],
            "exclude_vntypes" : []
        },
        "cpuset" : {
            "enabled" : true,
            "exclude_cpus" : [],
            "exclude_hosts" : [],
            "exclude_vntypes" : [],
            "mem_fences" : false,
            "mem_hardwall" : false,
            "memory_spread_page" : false
        },
        "devices" : {
            "enabled" : false,
            "exclude_hosts" : [],
            "exclude_vntypes" : [],
            "allow" : [
                "b *:* rwm",
                "c *:* rwm"
            ]
        },
        "memory" : {
            "enabled" : true,
            "exclude_hosts" : [],
            "exclude_vntypes" : [],
            "soft_limit" : false,
            "enforce_default" : true,
            "exclhost_ignore_default" : false,
            "default" : "256MB",
            "reserve_percent" : 0,
            "reserve_amount" : "1GB"
        },
        "memsw" : {
            "enabled" : false,
            "exclude_hosts" : [],
            "exclude_vntypes" : [],
            "enforce_default" : true,
            "exclhost_ignore_default" : false,
            "default" : "0B",
            "reserve_percent" : 0,
            "reserve_amount" : "64MB",
            "manage_cgswap" : false
        },
        "hugetlb" : {
            "enabled" : false,
            "exclude_hosts" : [],
            "exclude_vntypes" : [],
            "enforce_default" : true,
            "exclhost_ignore_default" : false,
            "default" : "0B",
            "reserve_percent" : 0,
            "reserve_amount" : "0B"
        }
    }
}
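For anyone who wants to check these settings with a script rather than by eye, here is a minimal Python sketch. It assumes the exported configuration has been saved as valid JSON; the CONFIG_TEXT string is only a trimmed stand-in for that file, and the helper name summarize_limits is made up for illustration.

```python
import json

# Trimmed stand-in for the exported pbs_cgroups configuration shown above.
CONFIG_TEXT = """
{
    "cgroup": {
        "memory": {"enabled": true, "soft_limit": false,
                   "enforce_default": true, "default": "256MB"},
        "memsw": {"enabled": false, "enforce_default": true, "default": "0B"}
    }
}
"""

def summarize_limits(config_text):
    """Return {section: (enabled, default)} for the memory-related cgroup sections."""
    cgroup = json.loads(config_text)["cgroup"]
    return {
        name: (section["enabled"], section["default"])
        for name, section in cgroup.items()
        if name in ("memory", "memsw")
    }

summary = summarize_limits(CONFIG_TEXT)
for name, (enabled, default) in summary.items():
    print(f"{name}: enabled={enabled}, default={default}")
```

With the values from this thread, the memory section is enabled with a 256MB default, while memsw is disabled.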

It shows that the default memory limit is set to "256MB", with soft_limit set to false.

I then submit a job with the following header:

#!/bin/bash -l
#PBS -l walltime=96:00:00
#PBS -l nodes=node1:ppn=1
#PBS -l mem=25gb
#PBS -q Q1
#PBS -o job.out
#PBS -e job.err

Using qstat -f, I see that the job reports resources_used.mem = 262144kb and resources_used.vmem = 5218548kb, while the requested resources have been correctly understood by the system: Resource_List.select = 1:ncpus=1:mem=26214400KB:host=node1.
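To make the numbers in the qstat output easier to compare, here is a small Python sketch that splits the select chunk and converts the values. The function name parse_chunk is hypothetical, and the code assumes a single chunk whose mem value ends in a two-letter KB suffix, as in the output above.

```python
def parse_chunk(chunk):
    """Split one select chunk such as '1:ncpus=1:mem=26214400KB:host=node1'."""
    count, *pairs = chunk.split(":")
    resources = dict(pair.split("=", 1) for pair in pairs)
    return int(count), resources

count, resources = parse_chunk("1:ncpus=1:mem=26214400KB:host=node1")

# Strip the 'KB' suffix (assumed present) and convert with binary multipliers.
requested_kb = int(resources["mem"][:-2])
requested_gb = requested_kb // (1024 * 1024)

used_kb = 262144              # resources_used.mem reported by qstat -f
used_mb = used_kb // 1024

print(f"requested: {requested_gb} GB, used: {used_mb} MB")
```

The request works out to 25 GB, while the reported usage is exactly 256 MB, which is the same figure as the "default" in the memory section of the hook configuration.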

If I look in the cgroup files to see the actual memory limit set for my job, I get this:

cat /sys/fs/cgroup/memory/pbs_jobs.service/jobid/1964.NODE1/memory.limit_in_bytes
268435456

Therefore, I understand that PBS assigned my job a memory limit of about 268 MB (268435456 bytes, i.e. 256 MiB) while I was requesting 25gb! My machine has 376 GB of RAM (resources_available.mem = 376355mb), of which 368 GB were free when this job ran.
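The comparison can be written out as a short Python check. This is only a sketch: size_to_bytes is a hypothetical helper, and it assumes size suffixes are interpreted with binary multipliers (kb = 1024 bytes, and so on).

```python
import re

# Binary multipliers assumed for size suffixes (kb = 1024 bytes, etc.).
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}

def size_to_bytes(text):
    """Convert a size string such as '256MB' or '25gb' to bytes."""
    match = re.fullmatch(r"(\d+)\s*([kmgt]?b)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]

cgroup_limit = 268435456                  # value read from memory.limit_in_bytes
config_default = size_to_bytes("256MB")   # "default" in the memory section
job_request = size_to_bytes("25gb")       # from "#PBS -l mem=25gb"

print("limit matches hook default:", cgroup_limit == config_default)
print("limit matches job request: ", cgroup_limit == job_request)
```

The limit in the cgroup file is byte-for-byte the hook's 256MB default rather than the 25gb request, which suggests the default was applied to the job instead of the requested value.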

What should I do so that my job can actually use the RAM it requested?

Thank you in advance for your help!

Keep in mind that vmem = RAM + swap. You should request both mem and vmem, and you probably want them set to the same value.
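As a quick illustration of that relationship, here is a tiny Python sketch. The function name swap_allowance_gb is made up, and it simply takes vmem = RAM + swap at face value with both limits given in GB.

```python
def swap_allowance_gb(mem_gb, vmem_gb):
    """Swap a job may use if vmem = RAM + swap, with both limits in GB."""
    if vmem_gb < mem_gb:
        raise ValueError("vmem must be at least as large as mem")
    return vmem_gb - mem_gb

print(swap_allowance_gb(25, 25))  # vmem equal to mem leaves no room for swap
print(swap_allowance_gb(25, 30))  # a larger vmem leaves 5 GB of swap headroom
```

Requesting mem=25gb and vmem=25gb therefore leaves the job no swap allowance at all under this reading.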

Dear mkaro,

Thank you for your answer. If I understand correctly, your point is that I should always set vmem equal to mem as well, in order to keep the system from using swap? I will run a test…

But how come the system decides that it is more efficient to start using swap while there is still about 300 GB of free RAM?

Edit: I ran the test with the same value for mem and vmem. PBS is still setting the limit in the cgroup memory file associated with my job to 268 MB.

In your configuration file, I see that the memsw section has "enabled" set to false. Please set that to true and try again. Assuming there is sufficient free memory, the Linux kernel will allow the job to allocate memory up to the mem limit. Once the mem limit is exceeded, it will allow the application to allocate up to the vmem limit and start paging out allocated RAM to swap. Once the vmem limit is reached, calls to malloc/calloc will return an error and set errno to ENOMEM. It won’t kill the process, but if the system is close to full memory capacity, the OOM killer will be invoked.
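One way to make that change without hand-editing the file is sketched below in Python. It assumes the exported configuration is valid JSON; the helper name enable_memsw and the one-line sample configuration are illustrative only. The edited file would then presumably be loaded back with the qmgr counterpart of the export command used earlier (import hook pbs_cgroups application/x-config default followed by the file name).

```python
import json

def enable_memsw(config_text):
    """Return the hook configuration text with the memsw section enabled."""
    config = json.loads(config_text)
    config["cgroup"]["memsw"]["enabled"] = True
    return json.dumps(config, indent=4)

# Minimal stand-in for the exported configuration (the real file has more keys).
original = '{"cgroup": {"memory": {"enabled": true}, "memsw": {"enabled": false}}}'
updated = enable_memsw(original)
print(updated)
```

Only the memsw "enabled" flag changes; every other key is written back as it was read.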