Ability to delete jobs via preemption

Currently jobs can be preempted via suspension, checkpointing, or requeuing. This RFE will add the ability to preempt via deleting the job.

The design is here: https://pbspro.atlassian.net/wiki/spaces/PD/pages/1197735939/Preemption+via+deletion

Bhroam

Thanks for proposing this change, Bhroam. A few comments:

“This means the set of letters accepted will be ‘SCRD’.”
Can you please mention what the default value will be? Will it be SCRD, or should it be something else? Since you mentioned that C and R could be overridden by the job submitter, maybe we should make it SDCR? Just thinking out loud.

“This means if D is connected to C or R, and a job is either -c n or -r n, the job will be considered for preemption”
I’m not sure I understood that. How can D be connected to C or R?

Will there be an accounting record for the job that gets deleted? Will the server logs mention that a delete request from scheduler was received for the jobs that get deleted via preemption?

It would be dangerous to update the default preempt_order. Preemption would be more destructive than it was before. Realistically, if a job isn’t -r n, R should always work. In any case, before I added D to the default, I’d need buy-in from the PMs. What do you think, @scc?

It has to do with how the scheduler picks jobs to preempt. If a job is -c n or -r n and the preempt_order is only C or R, the scheduler notices that and ignores the job. I was trying to say that if the preempt_order is RD and a job is -r n, the scheduler won’t ignore the job. Now that I think about it, that’s kind of obvious. I’ll just remove the statement.
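To make that selection rule concrete, here is a minimal Python sketch. It is only a simplified model of the behavior described above, not the scheduler's actual code. The `Job` class and the function names are hypothetical, and it assumes that S and D always apply while C and R depend on the job's -c n and -r n settings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    """Hypothetical stand-in for a running job (not a PBS data structure)."""
    name: str
    checkpointable: bool = True  # False models a job submitted with -c n
    rerunnable: bool = True      # False models a job submitted with -r n


def usable_methods(preempt_order: str, job: Job) -> List[str]:
    """Return the letters of preempt_order that could be applied to this job."""
    usable = []
    for method in preempt_order:
        if method == "C" and not job.checkpointable:
            continue  # the job opted out of checkpointing
        if method == "R" and not job.rerunnable:
            continue  # the job opted out of requeuing
        if method in "SCRD":
            usable.append(method)
    return usable


def is_preemptable(preempt_order: str, job: Job) -> bool:
    """A job is considered only if at least one method remains usable."""
    return bool(usable_methods(preempt_order, job))


if __name__ == "__main__":
    job = Job("no_rerun", rerunnable=False)
    print(is_preemptable("R", job))   # False: the only method is unusable
    print(is_preemptable("RD", job))  # True: D is still available
```

With a preempt_order of R, the -r n job is skipped. With RD, it stays in the candidate list because deletion is still possible.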

The accounting records will not change. For the most part, preemption works just like root doing a qsig, qhold, or qrerun. Now we’re including qdel in the mix. This means the normal D accounting record will be printed.

Bhroam

Would it add any value to differentiate a normal D record from one that happened due to preemption? What will the value of ‘requestor’ be in the D record?

That’s a very good question, @agrawalravi90. The delete request would be internally generated inside the server, so I’m assuming it’ll be the server. I won’t be able to tell until I actually code up a POC.

I like the design proposal. I have a question: since the server will wait for the job to finish before it replies to the scheduler about the preemption, I assume it will not use the ‘force’ option to delete the job. Is that correct?

I do have two comments on the discussion going on here:

  • I think we should not update the default preempt_order to add ‘D’ to it, for the same reason @bhroam stated: a job preempted this way is unrecoverable. It is better for admins to decide what they want to do than to make ‘D’ part of the default value.

  • About accounting records, I don’t think anything should change in the accounting logs as long as the server does not use the ‘force’ option to delete the job. Currently, we log ‘D’ and ‘E’ records for jobs that are running and then deleted using the qdel command. I guess that will not change when we apply preemption with deletion.

I have the same question as @agrawalravi90. Did I miss the answer?

I also agree with @arungrover: we should not add ‘D’ to the default preempt_order.

There will definitely be a scheduler log message. The scheduler prints a message like "Job preempted by <method>", where <method> is suspension, checkpoint, or requeue. We’d add a fourth for deletion.

As for a server log message, I’m not 100% sure. I wasn’t going to add anything specifically for this enhancement, but a delete request will be generated internally. I suspect there will be a log message that comes with that.

I have updated the design with more of the internal design. Please take a look.

Bhroam

I agree that D should be left out of the default preempt_order so people are not surprised with job loss where it did not happen previously (even if it is viewed as a bug by some that qsub -rn jobs are immune to preemption by requeue today and there is really nothing that can be done without this feature).

Regarding the accounting log, what if we added a new special negative exit_status to indicate a job was deleted by preemption? I don’t think there’d be any real loss of useful information from not seeing the true exit_status resulting from the job being killed by a signal. This is likely not needed if the D record does indeed show root@server_host as the requestor (which I imagine it will). Thoughts on this?

Just to wrap this up. I will leave the default preempt_order alone. I think changing the exit status is not the right way to go about informing the admin that the job was preempted. I think changing the requestor on the ‘D’ record to something like ‘Scheduler’ is the right way to go about it. I will see if I can make this happen. I’m still in the process of working out the kinks of merging this functionality into the new preemption framework.
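As a rough sketch of how an admin could then tell the two cases apart, the Python below reads the requestor out of a ‘D’ accounting record. It assumes the usual semicolon-separated accounting layout (timestamp;record type;job ID;message) with a requestor=user@host field in the message. The value ‘Scheduler’ is only the name floated above and may not be what ships, and the sample records and function names are made up for illustration.

```python
from typing import Optional


def delete_requestor(record: str) -> Optional[str]:
    """Return the requestor of a 'D' accounting record, or None otherwise."""
    parts = record.strip().split(";", 3)
    if len(parts) < 4 or parts[1] != "D":
        return None  # not a delete record
    for field in parts[3].split():
        key, sep, value = field.partition("=")
        if key == "requestor" and sep:
            return value
    return None


def deleted_by_preemption(record: str,
                          preemption_requestor: str = "Scheduler") -> bool:
    """True if the record's requestor matches the assumed preemption name."""
    requestor = delete_requestor(record)
    if requestor is None:
        return False
    # Compare only the part before '@' so the host name does not matter.
    return requestor.split("@", 1)[0] == preemption_requestor


if __name__ == "__main__":
    # Hypothetical sample records, not taken from a real accounting log.
    by_user = "04/01/2019 10:15:30;D;123.server1;requestor=alice@host1"
    by_sched = "04/01/2019 10:16:02;D;124.server1;requestor=Scheduler@server1"
    print(deleted_by_preemption(by_user))
    print(deleted_by_preemption(by_sched))
```

Passing the requestor name as a parameter keeps the check usable whatever value the final implementation writes.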

Bhroam