Deleting idle reservations

I’m working on a new feature that will delete reservations if they sit idle for too long. This will help utilization.

Here is the design document:
https://pbspro.atlassian.net/wiki/spaces/PD/pages/1479934178/Delete+empty+running+reservations+automatically

Please read and provide comments.

Sounds like a pretty useful feature!

How about “delete_resv_idle_time” instead of “delete_idle_resv_time” ?
Also, since you mentioned standing and ASAP reservations explicitly, could you also mention how this will affect advance and maintenance reservations?

Also, I kind of feel like admins might just want a resv to be deleted if it’s been idle for more than x% of its duration. Would it be better to have a percentage based idle timer instead of an absolute value? Then we could make it a server attribute instead so that admins don’t have to set it for each reservation. What do you think?

My $.02: a percentage-based option may be a good addition in the future, but absolute time makes the most sense to have as the baseline. The concept here is sort of like a grace period, and I don’t see how that should logically scale with reservation duration.
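To make the two proposals concrete, here is a small sketch comparing an absolute idle limit with a percentage-based one. The function names and the sample values (a 10-minute limit, 10%) are made up for illustration and are not part of the design.

```python
def idle_limit_absolute(idle_time_s: int, duration_s: int) -> int:
    """Absolute limit: the grace period ignores the reservation's duration."""
    return idle_time_s


def idle_limit_percent(percent: float, duration_s: int) -> float:
    """Percentage limit: the grace period grows with the reservation's duration."""
    return duration_s * percent / 100.0


# Compare a 1-hour and a 24-hour reservation (durations in seconds).
for duration_s in (3600, 86400):
    print(
        duration_s,
        idle_limit_absolute(600, duration_s),
        idle_limit_percent(10, duration_s),
    )
```

With these sample values the absolute limit stays at 600 seconds for both reservations, while the percentage limit goes from 6 minutes for the 1-hour reservation to 2.4 hours for the 24-hour one.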

Thanks for mentioning maintenance reservations. I think that maintenance reservations should not be deleted for being idle. They most often exist to deny others access to the reserved nodes rather than to run specific work on them, so deleting them for being idle would be a mistake, I think. (Also, in keeping with other aspects of maintenance reservations, we assume the admin knows what they are doing.)

@bhroam you might also want to mention in the design the new command line option that pbs_rsub will have to implement for accepting this timeout.

Thank you all for your comments.

Sounds good

I didn’t think I needed to mention these. This is a feature that works across all types of reservations. I only mentioned ASAP reservations because they were changing. I’m actually fixing them, in any case. Today an ASAP reservation doesn’t have to sit idle for 10m before it is deleted. Instead, a periodic timer fires every 10m and checks whether the reservation is idle. It is possible that the job ends and then a few seconds later that timer triggers and deletes the reservation. Now it’ll be a true idle timer.
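A rough sketch of the difference between the two behaviors, for anyone following along. This is only a model and not the server code: it assumes the periodic checks fire at fixed multiples of the period measured from the reservation start, and that no new job arrives after the reservation goes idle. The function names are hypothetical.

```python
PERIOD_S = 600  # 10 minutes, in seconds


def periodic_check_deletion(idle_since_s: int, period_s: int = PERIOD_S) -> int:
    """Old behavior (modeled): delete at the first periodic tick strictly
    after the reservation went idle. Ticks fire at period, 2*period, ..."""
    return (idle_since_s // period_s + 1) * period_s


def idle_timer_deletion(idle_since_s: int, idle_time_s: int = PERIOD_S) -> int:
    """New behavior (modeled): delete only after a full idle_time of idleness."""
    return idle_since_s + idle_time_s


job_end_s = 595  # the last job ends 5 seconds before the next tick
print(periodic_check_deletion(job_end_s) - job_end_s)  # idle seconds before deletion
print(idle_timer_deletion(job_end_s) - job_end_s)
```

In this model, a job ending at 595 seconds leaves the reservation idle for only 5 seconds before the periodic check deletes it, while the idle timer waits the full 600 seconds.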

I agree with @scc here. I like it being a static time. If admins want to make a default, they can use a rsub hook. It’s kind of a pain, but it is doable.
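For illustration, the defaulting logic such a hook would need is small. The sketch below is plain Python over a dictionary so it can run anywhere. It does not use the PBS hook API, since the thread doesn’t confirm how the new attribute will be exposed to hooks. The attribute name is the one under discussion at this point in the thread, and the default value is a made-up site setting.

```python
ATTR_NAME = "delete_resv_idle_time"  # name still under discussion in this thread
SITE_DEFAULT = "10:00"               # hypothetical site-wide default


def apply_default_idle_time(resv_attrs: dict, default: str = SITE_DEFAULT) -> dict:
    """Return a copy of the reservation attributes with the idle time filled in
    only when the submitter did not request one."""
    updated = dict(resv_attrs)
    if updated.get(ATTR_NAME) is None:
        updated[ATTR_NAME] = default
    return updated


# A submission without an idle time picks up the site default...
print(apply_default_idle_time({"Reserve_Name": "nightly"}))
# ...while an explicit request is left alone.
print(apply_default_idle_time({ATTR_NAME: "30:00"}))
```

In a real reservation-submission hook the same check would run against the reservation object of the event instead of a dictionary.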

I thought about mentioning maintenance reservations but chose not to. I don’t think we should limit this feature. It isn’t on by default. If an admin wants to submit an idle time for a maintenance reservation, I don’t see why we shouldn’t allow them to do it. Our attitude about admins is that they know what they are doing, so let them do it.

Oh good point. I forgot examples. It’s going to be a -W option to pbs_rsub.

@scc do we need to enhance pbs_alter to alter this? I’m assuming not for the initial feature.

I’ve updated the design for comments.
Bhroam

Reviewed the latest version of the doc, just one comment: might be useful to explicitly mention that the “delete_resv_idle_time” value takes time in seconds.

You raise a good point, @agrawalravi90. It should actually take the same duration form as the pbs_rsub -D option. I’ll make the change.
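As a sketch of what accepting a duration could look like, the parser below takes either a plain number of seconds or a colon-delimited [[hh:]mm:]ss string. That format is my assumption about the -D form; the pbs_rsub documentation is the authority. The function name is hypothetical.

```python
def parse_duration(text: str) -> int:
    """Convert 'ss', 'mm:ss', or 'hh:mm:ss' into a total number of seconds."""
    parts = text.strip().split(":")
    # Accept one to three purely numeric fields; reject anything else.
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"invalid duration: {text!r}")
    seconds = 0
    for part in parts:
        # Each field to the left is worth 60 times the one to its right.
        seconds = seconds * 60 + int(part)
    return seconds


print(parse_duration("600"))      # plain seconds
print(parse_duration("10:00"))    # mm:ss
print(parse_duration("1:30:00"))  # hh:mm:ss
```

Under this assumption, "600" and "10:00" both mean 600 seconds, so the earlier seconds-only reading remains valid input.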

Now that I am thinking about the name more, I think it is kind of redundant to have the ‘resv’ in there. We know it is a reservation attribute. We submit it with pbs_rsub. How about we make it ‘delete_idle_time’ ?

That sounds good to me. Since the format is duration now, do you wanna call it “delete_idle_duration”?

I personally like time because it is shorter. It is also similar to walltime, which is also a duration.

Bhroam

Design looks good to me now

“If there are no jobs in a running reservation, then the resources sit idle”

Just a thought: resources will also sit idle if there are queued jobs which won’t run in the reservation. How about deleting such reservations as well? Maybe we can add an internal attribute to jobs which the sched can set when it thinks that a job can never run, and this can be checked to determine if a reservation has only jobs which will never run, so it can be marked for deletion.

I think we’d have an issue explaining what happened to the user. They’d see their reservation with 5 jobs in it just disappear. We’d have to explain somehow that we determined their jobs would never run so we deleted their reservation.

I’m curious to get @scc’s feelings on this.

Bhroam

I don’t believe it is necessary to look for jobs that will never run in the reservation. We need this for reservations which are empty.
