Is there a good how-to example for managing output from PBS jobs?
I’ve just had a user reminisce about the good ol’ days of qpeek.
I’ve read pages 24, 44, and 46 of the Users manual and I’m a little clearer but not 100% clearer.
What is the difference between
#PBS -o
#PBS -e
and
#PBS -k oed
?
I would like to think that the first is the easier to remember and more sensible version of the second. But it occurred to me that maybe PBS doesn’t write to the error/output files while the job is running?
Is that the problem -k oed solves? Does -k oed replace qpeek?
I’ve been using PBS variants for about a decade and never heard of qpeek but I did google it and it appears you are on the right track. First I’ll attempt to directly answer your question.
PBSpro, by default, spools output and error logs into a directory on the node where the job is running. At the end of the job, the output is returned to the submission host via scp, into the directory where the user ran the qsub command. The -o and -e options allow the user to specify an alternate directory (and/or filename) to use on the submission host.
If you have network file systems where the same filesystem is mounted on the compute nodes and on the submission host, you can use the $usecp directive in the mom config file to tell pbs_mom it can return logs with the regular cp command rather than scp’ing them back to the submission host. To be clear, the output is still written to the spool directory; it’s just copied to the final location with cp instead of scp.
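As a rough illustration, a $usecp line in the mom config file might look like the sketch below. The /home/ prefix and the wildcard host pattern are assumptions for a site where /home is mounted identically on every host; check the exact form against the admin guide for your PBS version.

```
# mom config file on each compute node (hypothetical mapping)
# $usecp <host pattern>:<path prefix on that host> <same path as seen on this node>
$usecp *:/home/ /home/
```

pbs_mom typically has to re-read its config (a restart or HUP) before a change like this takes effect.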
If you’ve configured $usecp, -k oed takes things a step further: it tells pbs_mom to open the final output location and write the output and error streams directly there, so no copy occurs at the end of the job. This only works if the path in question matches a $usecp directive (otherwise pbs_mom makes the safe assumption that the output path is a directory that could be unique to the submission host).
So to recap:
-o and -e tell PBS where to put the output and error logs when the job is done, whether that’s via scp, cp, or writing the output directly to the specified location.
-k oed says to write directly to the final location, whether that’s the path(s) the user specified with -o and -e or just the default file names in the directory the user ran qsub in.
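To make the recap concrete, here is a minimal sketch of a job script that combines the three directives. The job name, the /shared/logs paths, and the report_progress function are all made up for illustration; the paths are assumed to sit on a filesystem that matches a $usecp directive, otherwise the direct write described above won't apply.

```shell
#!/bin/bash
#PBS -N stream_demo
# Hypothetical paths, assumed to be on a filesystem covered by $usecp:
#PBS -o /shared/logs/stream_demo.out
#PBS -e /shared/logs/stream_demo.err
# Keep stdout (o) and stderr (e), and write them directly (d) to the final location:
#PBS -k oed

# Stand-in for the real workload: one line to stdout and one to stderr per step.
report_progress() {
    for step in 1 2 3; do
        echo "step $step done"
        echo "step $step: no warnings" >&2
    done
}

report_progress
```

With that in place, running something like tail -f /shared/logs/stream_demo.out on the submission host while the job runs should show the lines as they are written, which is roughly what qpeek used to give you.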
In my opinion, -k oed removes the need for a utility like qpeek (which appears to work by ssh’ing into the compute node where the job runs and cat’ing the output and error files), at least for cases where the final location is on a network filesystem.