STDOUT and STDERR in submission directory

Hello,

I want the job output files to be kept in the submission directory. At the moment the .o and .e files are being sent to $HOME. This is my job script:

$ cat submit.sh
#!/bin/bash
### Job Name
#PBS -N test_job
### Keep output and error files on the execution host
#PBS -k oe
### Select 1 node with 32 CPUs
#PBS -l select=1:ncpus=32

##########################################
#                                        #
#   Output some useful job information.  #
#                                        #
##########################################

echo ------------------------------------------------------
echo -n 'Job is running on node '; cat $PBS_NODEFILE
echo ------------------------------------------------------
echo PBS: qsub is running on $PBS_O_HOST
echo PBS: originating queue is $PBS_O_QUEUE
echo PBS: executing queue is $PBS_QUEUE
echo PBS: working directory is $PBS_O_WORKDIR
echo PBS: execution mode is $PBS_ENVIRONMENT
echo PBS: job identifier is $PBS_JOBID
echo PBS: job name is $PBS_JOBNAME
echo PBS: node file is $PBS_NODEFILE
echo PBS: current home directory is $PBS_O_HOME
echo PBS: PATH = $PBS_O_PATH
echo ------------------------------------------------------

myjob

After submitting it using

qsub submit.sh

I get test_job.o44 and test_job.e44 in $HOME. The test_job.o44 does show the correct submission/working directory.

PBS: working directory is /home/trumee/job/test1

How can I force the output files to stay in the submission directory?

In addition, qstat -f shows

jobdir = /home/trumee
Output_Path = myserver:/home/trumee/job/test1/test_job.o46
Error_Path = myserver:/home/trumee/job/test1/test_job.e46

So the path attributes are being set correctly, but the .o and .e files still end up in $HOME. Is that because jobdir is being set to $HOME instead of the submission directory?

It is probably because of the "-k oe". Please try submitting the job without it.
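
For reference, a minimal sketch of the same header without -k (the job name, resource request, and the myjob placeholder are carried over from your script):

#!/bin/bash
### Job Name
#PBS -N test_job
### Select 1 node with 32 CPUs
#PBS -l select=1:ncpus=32

myjob

Without -k, PBS should stage test_job.o<jobid> and test_job.e<jobid> back to the directory from which qsub was run once the job finishes.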

Regards,
Shwetha

I want to look at the output in real time rather than at the end of the job. I need the "-k" option for that.

ā€œ-koeā€ is only used to retain the jobā€™s error and output files at the end of the job on the execution host. In your case it is retaining it on the userā€™s shared home directory. Hereā€™s a small clip from the man page:

-k keep Specifies whether and which of the standard output and standard error streams is retained
on the execution host. Overrides default path names for these streams. Sets the job's
Keep_Files attribute to keep. Default: neither is retained. Overrides -o and -e options.
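
You can confirm this from the job's attributes; for example (44 stands in for your job id, output trimmed to the relevant lines):

$ qstat -f 44 | grep -E 'Keep_Files|Output_Path|Error_Path'
Keep_Files = oe
Output_Path = myserver:/home/trumee/job/test1/test_job.o44
Error_Path = myserver:/home/trumee/job/test1/test_job.e44

With Keep_Files = oe, the output and error files are kept in your home directory on the execution host instead of being copied back to Output_Path/Error_Path when the job ends.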

You are probably looking for the direct_write feature in PBSPro, which was checked in a week ago. It will let you monitor the stdout/stderr files in real time if the final destination is mapped on the execution host.
Design Document:
https://pbspro.atlassian.net/wiki/spaces/PD/pages/51901651/PP-516+Direct+write+of+the+job+s+stdout+err+files.

If you want to use this feature, you may need to create a build from the mainline code.

This is targeted for the PBSPro 18.1 release.
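
Based on the design document, the feature is expected to be driven through an extra "d" sub-option to -k (a sketch only; please check the qsub man page of the release you end up running, since the final syntax may differ):

### Direct write: stdout/stderr go straight to their final destination while the job runs
#PBS -k doe
#PBS -o /home/trumee/job/test1/test_job.out
#PBS -e /home/trumee/job/test1/test_job.err

This only helps when the destination directory (here /home/trumee/job/test1) is mounted on the execution host, because the files are written there directly instead of being spooled under /var/spool/pbs/spool.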


Thanks, I am running OHPC, which uses the stable release of PBS.

If I omit the '-k oe' flag and specify the output/error files like this:

#PBS -o /home/trumee/test1/myjob.o
#PBS -e /home/trumee/test1/myjob.e

unfortunately, I don't see these files being created on the head node while the job is running. However, I can see the .OU and .ER files being created in /var/spool/pbs/spool/ on the execution node.

So is there any way to see these STDOUT files in the submission directory on the submission host while the job is running?

Thanks for your post @trumee. We will be working with the OHPC folks to include the latest version of PBS once it is released. As @nithinj pointed out, you will be able to use the direct write feature to accomplish your goal. I can't give you an exact date, but I suspect the latest PBS release should be available to OHPC users in the first half of 2018. The sooner the better, IMHO.

I am facing the same issue with version 20.0.1. Submitting the following script results in no stdout file in the job submission directory. However, when -k oe is used, the stdout file can be found in my home folder.
On another cluster running PBS version 17, the stdout file was generated as expected.
Is this a bug in version 20.0.1?

#PBS -N test
#PBS -j oe
#PBS -l ncpus=1
#PBS -l nodes=1
#PBS -l mem=1G

echo "test_output"

Can you please try this script and check:

#PBS -N test
#PBS -l select=1:ncpus=1:mem=1gb
echo "test_output"
env
date
exit 0
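
If the defaults are behaving, the output files should land in the directory you submitted from once the job completes. A quick check (the directory and job id 123 are just placeholders):

$ cd /path/to/submit/dir
$ qsub test.sh
123.myserver
$ ls test.o* test.e*
test.o123  test.e123

Note that with #PBS -j oe from your original script the error stream is merged into the output stream, so you would only look for test.o<jobid>.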

FWIW, I use the following script ("qstdoe") to get stderr/stdout at any time while the job is running:

#!/bin/sh

usage()
{
    echo "$0 <-e|-o> <job_id>"
}

if test $# -ne 2
then
    usage
    exit 1
fi

case $1 in
    "-e") stderr="yes";;
    "-o") stderr="no";;
    *) usage; exit 1;;
esac

# Strip the server suffix from the job id (e.g. 44.myserver -> 44).
jobid=`echo $2 | cut -d. -f1`

# Extract the first execution host from qstat -f (e.g. "exec_host = node01/0*32").
exec_host=`qstat -f $jobid | sed -n 's/exec_host = \([^/]*\)\/.*/\1/p'`

if test "x$exec_host" = "x"
then
    exit 1
fi

if test $stderr = "yes"
then
    ext=ER
else
    ext=OU
fi

# The MoM spools stdout/stderr as <jobid>.<server>.OU/.ER until the job finishes.
ssh $exec_host cat /var/spool/pbs/spool/${jobid}.*.${ext}
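
For example, to check the output of job 44 from the submission host (assuming the script is executable and on your PATH, passwordless ssh to the execution node, and the default PBS spool location /var/spool/pbs/spool):

$ qstdoe -o 44
$ qstdoe -e 44

You can wrap it in watch to refresh periodically while the job runs.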