Jobs not preempted when setting rerunnable to false

We are testing our queue priorities and noticed that when you set -r n on a job that would normally be preemptible, it is no longer preempted. Without -r n, preemption succeeds and the job is requeued as normal.

Our preemptible queue has a priority of 0 and our priority queue has a priority of 315. I am submitting both jobs as the same user.

Is this the expected behavior? I can provide more information if needed.

  • By default, all jobs are rerunnable (unless -r n is specified at submission or set in default_qsub_arguments).
  • Specifying -r n marks the job as not rerunnable.

Please check this

source /etc/pbs.conf; cat $PBS_HOME/sched_priv/sched_config   | grep preempt | grep -v '#'
preemptive_sched: true	ALL
preempt_queue_prio:	150
preempt_prio: "express_queue, normal_jobs"
preempt_order: "SCR"
preempt_sort: min_time_since_start

Note: Jobs with a preemption priority higher than normal jobs can preempt normal jobs. Preemption priority is defined in the scheduler’s preempt_prio parameter.

Please check the Job Classes, preempt_prio, and preempt_order sections in the PBS Professional Administrator's Guide: https://www.pbsworks.com/pdfs/PBS19.2.3_BigBook.pdf

Hi Adarsh,

Thank you for the response. Here is what we have for those lines:

preemptive_sched: true	ALL
preempt_queue_prio:	150
preempt_prio: "express_queue"
preempt_order: "CR"
preempt_sort: min_time_since_start

I think we have preemption configured correctly based on the goals we are trying to achieve. And as I mentioned, the preemption works correctly when the rerunnable flag is not set. It’s only when I set -r n for the preemptible job that preemption stops working. I don’t want to say for sure that this is a bug, but I tested this multiple times, and setting -r n somehow prevents the job from being preempted.

Could this have to do with not having a checkpoint script, since our preempt_order is set to CR? Maybe it is just taking longer for PBS Pro to preempt the job because it is trying to checkpoint the job and is not able to?

I can provide job scripts if you want to try to reproduce the issue.

Thanks,
Nick

PBS will not preempt non-rerunnable jobs by requeue, so if that is the only viable preemption method, then non-rerunnable jobs are indeed immune to preemption. There is a new feature (not yet in any released version) here: https://github.com/PBSPro/pbspro/pull/1138 that allows you to add “D” for delete as a preemption method. If specified after “R”, the requeue attempt for a non-rerunnable job will fail (as it does now) and the job will be deleted.
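
To make the above concrete, here is a rough sketch of how the preempt_order letters play out for a given job. This is only a simplified model, not the scheduler's actual code: pick_preempt_method is a hypothetical helper, and it assumes that suspend (S) always works, that checkpoint (C) only works when a checkpoint script is configured, that requeue (R) only works for rerunnable jobs, and that delete (D) is available as described in the pull request.

```shell
# Simplified model of walking preempt_order left to right.
# NOT PBS code; pick_preempt_method is a hypothetical helper.
# Usage: pick_preempt_method ORDER RERUNNABLE HAS_CHECKPOINT
#   ORDER           preempt_order letters, e.g. "CR" or "SCRD"
#   RERUNNABLE      y or n (n corresponds to qsub -r n)
#   HAS_CHECKPOINT  y or n (assumption: a checkpoint script is configured)
pick_preempt_method() {
    order=$1
    rerunnable=$2
    has_checkpoint=$3
    while [ -n "$order" ]; do
        rest=${order#?}            # everything after the first letter
        method=${order%"$rest"}    # the first letter
        order=$rest
        case $method in
            S)  # assumption: suspend is always viable
                echo S; return 0 ;;
            C)  if [ "$has_checkpoint" = y ]; then echo C; return 0; fi ;;
            R)  if [ "$rerunnable" = y ]; then echo R; return 0; fi ;;
            D)  # delete, per the unreleased feature in the pull request
                echo D; return 0 ;;
        esac
    done
    # No method in the order worked: the job is not preempted.
    echo none
    return 1
}
```

With preempt_order "CR" and no checkpoint script, `pick_preempt_method CR y n` prints R, while `pick_preempt_method CR n n` prints none, which matches what Nick is seeing. Under the same assumptions, `pick_preempt_method CRD n n` prints D.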
