PBS Pro stores a job's temporary output in its spool directory, which usually lives on the /var partition of the individual compute node, not under the user's /home quota.
I have run into a problem: some users have misbehaving programs that don't exit after encountering an error, but keep printing "Error encountered. Retrying…" plus some debug info to stdout every few milliseconds.
This quickly fills up the /var partition and renders other jobs and system processes unstable.
Is there an option to limit the maximum size of a job's stdout and stderr (which reside on the compute node) without interfering with its other behaviour (such as creating other files in its CWD on shared storage)?
If that doesn't give you enough control, you might consider writing a hook that creates a limited scratch space for each job. Here's a nearly fourteen-year-old article (still accurate) on how to create and mount a virtual filesystem: https://linuxgazette.net/109/chirico.html
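The basic idea from the linked article can be sketched as follows. All paths and sizes here are illustrative, everything requires root on the compute node, and a real hook would create and tear down one image per job:

```shell
# Create a fixed-size backing file to serve as per-job scratch space.
dd if=/dev/zero of=/var/scratch.img bs=1M count=512   # 512 MB, adjust to taste
mkfs.ext3 -F /var/scratch.img                          # make a filesystem in the file
mkdir -p /scratch
mount -o loop /var/scratch.img /scratch                # mount via a loop device
```

Once the image fills up, writes into /scratch fail instead of eating the rest of /var.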
Of course, creating a virtual filesystem doesn’t guarantee your users will direct their output there. It may be like herding cats, depending on your users.
You might also consider changing the location of your spool directory by editing /etc/pbs.conf. Take a look at the PBS_HOME and PBS_MOM_HOME variables in the PBS Pro documentation, and don't forget to restart PBS Pro after any changes.
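As a rough illustration (the paths below are hypothetical, and you should verify the variable names against the PBS Pro docs for your version), the relevant lines in /etc/pbs.conf might look like:

```
PBS_EXEC=/opt/pbs
PBS_HOME=/bigdisk/pbs_home     # moves the spool area off /var (example path)
```

The spool directory lives under PBS_HOME (or PBS_MOM_HOME on the execution node, if set), so pointing it at a larger partition keeps a runaway job from destabilizing /var.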
The pseudo-resource "file" tells pbs_mom to set RLIMIT_FSIZE for the job's processes (the limit applies per process, per file, not to the job as a whole nor to all files combined), and that includes the stdout/stderr spool files. The job is still allowed to run to completion after the limit is hit, but only the first N bytes are captured in stdout, stderr, and any other file a job process writes.
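You can see the effect of RLIMIT_FSIZE outside of PBS with a small experiment. The sketch below (plain Python, no PBS involved; the 4096-byte limit is arbitrary) sets the limit in a child process and shows that its output file is truncated at the limit rather than growing without bound:

```python
import os
import resource  # noqa: F401  (used in the child; imported here for clarity)
import subprocess
import sys
import tempfile

LIMIT = 4096  # bytes; arbitrary limit for demonstration

# The child ignores SIGXFSZ so that writes past the limit return EFBIG
# (an OSError) instead of killing the process, mimicking a job that keeps
# trying to print "Error encountered. Retrying..." forever.
child_code = (
    "import resource, signal, sys\n"
    "signal.signal(signal.SIGXFSZ, signal.SIG_IGN)\n"
    f"resource.setrlimit(resource.RLIMIT_FSIZE, ({LIMIT}, {LIMIT}))\n"
    "f = open(sys.argv[1], 'wb')\n"
    "try:\n"
    "    for _ in range(1000):\n"
    "        f.write(b'Error encountered. Retrying...\\n')\n"
    "        f.flush()\n"
    "except OSError:\n"
    "    pass  # EFBIG: the per-file size limit was reached\n"
)

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name
subprocess.run([sys.executable, "-c", child_code, path])
size = os.path.getsize(path)
print(size)  # never exceeds LIMIT, however long the child keeps writing
os.unlink(path)
```

With PBS Pro you would request the limit at submission time, e.g. something like `qsub -l file=100mb job.sh` (check the exact resource syntax for your PBS version).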