Hi Folks,
Here is my use case:
We are running OpenPBS version 20.0.1 on machines with CentOS 7.9.2009 (Core) installed. In our HPC environment, we have multiple queues, say:
queue_1
queue_2
queue_3
queue_4
For each queue, jobs get submitted in one of the following two ways:
1> The user specifies the requested memory in the job, as:
qsub -I -l select=1:ncpus=4:mem=XG -q queue_1
("X" covers a wide range: 1 GB, 5 GB, 10 GB, 50 GB, 100 GB, 250 GB, 500 GB, etc.)
2> The user submits the job without requesting any memory.
What do we want?
We want jobs submitted without a memory request to get a default of 16 GB. How can we do that?
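To make the intended behaviour concrete, here is a small Python sketch of the rule I am after. It only illustrates the desired outcome and is not how PBS resolves resources internally; the function name `effective_mem` and the `16gb` default are my own placeholders.

```python
def effective_mem(select_spec, default_mem="16gb"):
    """Return the mem value for the first chunk of a select spec,
    falling back to default_mem when the job did not request memory.

    Hypothetical helper: it only mimics the behaviour we want and is
    not part of PBS.
    """
    # Drop an optional leading "select=" so both forms are accepted.
    spec = select_spec.strip()
    if spec.startswith("select="):
        spec = spec[len("select="):]

    # Only look at the first chunk; chunks are separated by "+".
    first_chunk = spec.split("+")[0]

    # A chunk looks like "1:ncpus=4:mem=50gb"; resources are ":"-separated.
    for part in first_chunk.split(":"):
        name, sep, value = part.partition("=")
        if sep and name == "mem":
            return value  # explicit request is left untouched
    return default_mem


print(effective_mem("select=1:ncpus=4:mem=50gb"))  # explicit request kept
print(effective_mem("select=1:ncpus=4"))           # falls back to the default
```

In other words, a job that names its own memory keeps exactly what it asked for, and only jobs with no memory request pick up the 16 GB default.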
From reading the forum and the PBS administration guide, I think it can be done by:
a> Updating the default memory in the cgroup hook config from 256 MB to 16 GB. The current settings are below:
Qmgr: export hook pbs_cgroups application/x-config default
{
    "cgroup_prefix" : "pbs_jobs",
    "exclude_hosts" : [],
    "exclude_vntypes" : ["no_cgroups"],
    "run_only_on_hosts" : [],
    "periodic_resc_update" : true,
    "vnode_per_numa_node" : false,
    "online_offlined_nodes" : true,
    "use_hyperthreads" : false,
    "ncpus_are_cores" : false,
    "cgroup" : {
        "cpuacct" : {
            "enabled" : true,
            "exclude_hosts" : [],
            "exclude_vntypes" : []
        },
        "cpuset" : {
            "enabled" : true,
            "exclude_cpus" : [],
            "exclude_hosts" : [],
            "exclude_vntypes" : [],
            "mem_fences" : true,
            "mem_hardwall" : false,
            "memory_spread_page" : false
        },
        "devices" : {
            "enabled" : false,
            "exclude_hosts" : [],
            "exclude_vntypes" : [],
            "allow" : [
                "b : rwm",
                "c : rwm"
            ]
        },
        "hugetlb" : {
            "enabled" : false,
            "exclude_hosts" : [],
            "exclude_vntypes" : [],
            "default" : "0MB",
            "reserve_percent" : 0,
            "reserve_amount" : "0MB"
        },
        "memory" : {
            "enabled" : true,
            "exclude_hosts" : [],
            "exclude_vntypes" : [],
            "soft_limit" : false,
            "default" : "256MB",
            "reserve_percent" : 0,
            "reserve_amount" : "64MB"
        },
        "memsw" : {
            "enabled" : true,
            "exclude_hosts" : [],
            "exclude_vntypes" : [],
            "default" : "256MB",
            "reserve_percent" : 0,
            "reserve_amount" : "64MB"
        }
    }
}
However, I do not have the exact steps/commands to do this.
b> Setting "soft_limit" to true, so the kernel does not kill a job if it exceeds the default 16 GB (per the PBS Professional guide: https://2021.help.altair.com/2021.1.2/PBS%20Professional/PBSAdminGuide2021.1.2.pdf, section 16.5.3.9.v).
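Here is a rough sketch of the edit I think a> and b> amount to, written in Python so the JSON stays valid. It assumes the exported hook config is available as valid JSON text; the helper name `patch_cgroup_config` and the trimmed-down sample are my own, and I have not run this against our cluster.

```python
import json


def patch_cgroup_config(config_text, default_mem="16GB", soft_limit=True):
    """Return the pbs_cgroups config text with a new memory default.

    Hypothetical helper: only the "memory" section is touched; every
    other key is written back with the value it was read with.
    """
    config = json.loads(config_text)
    memory = config["cgroup"]["memory"]
    memory["default"] = default_mem      # was "256MB" in our current config
    memory["soft_limit"] = soft_limit    # JSON true/false, not the string "True"
    return json.dumps(config, indent=4)


# Trimmed-down stand-in for the exported config, for illustration only.
sample = """
{
    "cgroup_prefix": "pbs_jobs",
    "cgroup": {
        "memory": {
            "enabled": true,
            "soft_limit": false,
            "default": "256MB",
            "reserve_percent": 0,
            "reserve_amount": "64MB"
        },
        "memsw": {
            "enabled": true,
            "default": "256MB"
        }
    }
}
"""

print(patch_cgroup_config(sample))
```

My assumption is that the patched file would then be loaded back with the counterpart of the export command above, i.e. "import hook pbs_cgroups application/x-config default <file>" in Qmgr. The sketch deliberately leaves the "memsw" section alone.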
Has anyone made this exact config change? Please share the steps you followed.
I want to make sure that once I enable this, it does NOT affect other running jobs that users submitted with an explicit memory request (of varying sizes).
I would appreciate any comments from the forum.
Thanks
-Subhajit