qstat -x shows about 4 TB of memory usage for a specific job.
The job ran on a node that has only 2 TB of RAM.
How is that possible?
The job failed, and the node could not copy the results back to the server node.
The user is also in a cgroup with a 200 GB RAM limit.
How can the reported usage exceed both the node's physical memory and the cgroup limit?