Hi,
I set the job memory limit to 2GB; however, the process is killed by the OOM killer at a much lower usage.
$ qstat -fx 1278272 | grep mem
resources_used.mem = 262144kb
resources_used.vmem = 1480128kb
Resource_List.mem = 2gb
Resource_List.vmem = 4gb
On the node I see:
[Thu Nov 18 15:40:20 2021] Task in /pbs_jobs.service/jobid/1278272.******** killed as a result of limit of /pbs_jobs.service/jobid/1278272.********
[Thu Nov 18 15:40:20 2021] memory: usage 262144kB, limit 262144kB, failcnt 1321027
[Thu Nov 18 15:40:20 2021] memory+swap: usage 262144kB, limit 9007199254740988kB, failcnt 0
[Thu Nov 18 15:40:20 2021] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
[Thu Nov 18 15:40:20 2021] Memory cgroup stats for /pbs_jobs.service/jobid/1278272.********: cache:9024KB rss:253120KB rss_huge:6144KB mapped_file:0KB swap:0KB inactive_anon:129756KB active_anon:123308KB inactive_file:4668KB active_file:4000KB unevictable:0KB
So somehow 2gb seems to be translated into 262144kb… The only explanation I can find is that “gb” is being interpreted as gibibit: 2 Gib = 2^31 bits = 2^28 bytes = 262144 KiB, which matches the cgroup limit exactly.
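As a quick sanity check of that arithmetic (my own back-of-the-envelope calculation, not anything from PBS itself), converting 2 gibibit to KiB reproduces the cgroup limit exactly:

$ python3 -c "print(2 * 2**30 // 8 // 1024, 'KiB')"
262144 KiB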
Furthermore, decreasing mem to “1gb” lets the job run fine!
What could this mean?