I am using PBS on a cluster that is configured as follows: if a job is submitted with a given walltime and runs for longer than that walltime, the scheduler does not kill it. Instead, it attaches a ‘red flag’ to the user who submitted the job and gives that user's next job a lower priority.
To avoid this, I would like to manually insert a job termination into my PBS script.
See for example the following basic script:
#PBS -l walltime=00:01:00
#PBS -l mem=1gb
#PBS -l nodes=1:ppn=1
#PBS -q batch
./program.o
How may I modify this script in such a way that the job is automatically killed with qdel after 1 minute?
If you submit the above script and program.o runs for more than 1 minute, pbs_mom will kill it automatically, because the job has exceeded its requested walltime of 1 minute. You can see this message in the log file:
02/26/2018 08:30:25;0008;pbs_mom;Job;JOBID.pbsserver;walltime XX exceeded limit 60
As I said, this is not what happens on the cluster where I am running the job, because of the specific PBS configuration on that cluster.
Assuming the cluster runs Linux, I would suggest checking whether the ‘timeout’ command is available on the cluster.
With ‘timeout’ you could use something like:
timeout -s TERM 55 ./program.o
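To expand on this a little, here is a minimal sketch of how the call might be wrapped so the job script can tell a timeout apart from a normal exit. It assumes GNU coreutils ‘timeout’, which exits with status 124 when it had to stop the command. The helper name run_limited and the `sleep 5` stand-in for ./program.o are only illustrative.

```shell
#!/bin/bash
# Hypothetical helper: run a command under a time limit given in seconds.
run_limited() {
    local limit=$1
    shift
    # Send SIGTERM to the command once the limit is reached.
    timeout -s TERM "$limit" "$@"
}

# Stand-in for ./program.o: a 5-second sleep under a 1-second limit.
run_limited 1 sleep 5
status=$?

# GNU timeout reports 124 when the command was stopped at the limit.
if [ "$status" -eq 124 ]; then
    echo "time limit reached, program was terminated" >&2
fi
```

In the real job script the call would be something like `run_limited 55 ./program.o`, which leaves a few seconds of margin below the 1 minute walltime.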
Without ‘timeout’ you could start the program in the background, record its process ID in PID, and after a delay use something like:
kill -TERM $PID
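For completeness, the following is one possible way to set this up, written as a sketch rather than a tested recipe for any particular cluster. It starts the program in the background, starts a watchdog subshell that sends SIGTERM after the limit, and then waits for the program. The function name run_with_watchdog and the `sleep 5` stand-in for ./program.o are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: run a command and send it SIGTERM after LIMIT seconds.
run_with_watchdog() {
    local limit=$1
    shift

    # Start the program in the background and remember its PID.
    "$@" &
    local pid=$!

    # Watchdog: sleep for the limit, then terminate the program if it still runs.
    ( sleep "$limit"; kill -TERM "$pid" 2>/dev/null ) &
    local watchdog=$!

    # Wait for the program; the status is 128+15=143 if SIGTERM ended it.
    wait "$pid"
    local status=$?

    # The program is done, so the watchdog is no longer needed.
    kill "$watchdog" 2>/dev/null
    wait "$watchdog" 2>/dev/null
    return "$status"
}

# Stand-in for ./program.o: a 5-second sleep under a 1-second limit.
run_with_watchdog 1 sleep 5
status=$?
echo "exit status: $status"
```

In the job script this would become something like `run_with_watchdog 55 ./program.o`. If the program finishes early, the watchdog is cancelled so it cannot signal an unrelated process later.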
@vchlum: This works, thank you.