PP-482: Non-destructive walltime

I’ve just written up an EDD for PP-482. This will implement a new version of the walltime resource. The new resource will allow the scheduler to use policies like top jobs without the danger of jobs being killed if they exceed their walltime.

Please provide feedback: Design Doc


Hey Bhroam,

Thanks for the descriptive doc!

Just a thought: now that we have a way of classifying jobs via the job equivalence class code, would it be useful to expose the job classes to admins so that they can set soft walltime on an entire class of jobs in one shot?

About the doc, I just wanted to confirm that in the bullet:
"If dedicated time is used, soft_walltime will be used to see if the job will finish before dedicated time starts."
you did mean to use soft_walltime for dedicated time windows, right (and not hard walltime)?

Looks good. I just have a question and a few comments. In the document you mention that the default value for soft_walltime_extension_factor is 1.0. Would it make sense to list this in interface 2?

This is an interesting idea. It would be useful if a site could provide the information (e.g., a Python dictionary or JSON file that the scheduler could use for a lookup) without having to use qalter or a hook. However, for now I think we should start with just allowing an admin to set a default soft walltime for each queue or server. If they want more control, I would recommend setting it with a hook or qalter.

It makes sense to use the soft walltime for the dedicated time windows, since an admin can set these without any guarantee that the jobs will be completed before the dedicated time is reached. Once it is reached, they have the option to handle the jobs as they deem necessary.

The job equivalence class RFE was implemented in such a way that the admin doesn’t need to care about them. PBS determines the equivalence classes on its own. We don’t bother the admin about it. We can do a better job than the admin in any case. Doing what you suggest would give an external interface to the equivalence classes. I don’t think this is a good idea. Right now the equivalence classes can change from cycle to cycle depending on the site policies defined and the job mix. For example, in some cases the job’s euser attribute is used. In other cases it isn’t. It would be confusing for the admin to work out what PBS is doing and why.

I’d rather see the admins modify the jobs and then have PBS create equivalence classes from that. Admins know about jobs. They understand jobs. Teaching them how equivalence classes work and how they are created is not necessary. Let’s just let them deal with what they know.

The idea is that the admin is taking the machine under their control. If a job runs over into dedicated time, the admin can choose what to do with the job. If they requeue the job, there is no real difference from the job not having run in the first place. We might as well try to run the job in the hope that it will end by its soft_walltime.

Bhroam

Happy to see the soft walltime idea finally getting some traction – thanks! Couple comments:

  1. I suggest removing interface 2 (soft_walltime_extension_factor) for several reasons:
  • Less complexity means less work (dev, QA, doc, training) and fewer bugs
  • Hopefully, admins will set the value of soft_walltime so that it is rarely updated in practice
  • A feature like this can easily be added later (with no backward compatibility issues)
  • It doesn’t add much flexibility beyond just doubling (up to walltime)
  • If something more complicated is needed (in a future release), it’s not clear that a simple, global, linear factor is the right solution; it would be better to collect feedback first, then implement the right solution.
  2. PBS Pro already has “soft” used at the end of keywords, e.g., max_run_soft. Should this be soft_walltime or walltime_soft? I actually don’t have a strong feeling on this one, but wanted to bring it up because I know there is a strong desire for consistency in naming, and wanted others to comment (if desired).

  3. Are there any interactions with min_walltime and max_walltime that need to be defined?

Thanks again!

I see your points here. And you are right that ideally this will not be used on very many jobs. The intent was to let the admin adjust the soft_walltime by a factor that makes sense to them, depending on how aggressive they want to be when setting the soft_walltime. As for how the factor is intended to work: if set to 1, each extension would add the originally requested walltime rather than double the current value.

For consistency with how we named queue/server limits it makes sense to call it walltime_soft, but that really doesn’t roll off the tongue. :) However, since it is a job attribute and not a limit, I would vote to leave it as is.

Good point. I think the interactions need to be listed in the EDD.

If we go on the assumption that the vast majority of jobs will only need to be extended a max of once, then I’m fine with this. We still need to define how an extension happens. I dislike just doubling the existing duration every time. I’d rather add the soft walltime on each time. If our assumption holds true, there is no difference.
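To make the difference between the two extension schemes concrete, here is a minimal Python sketch. The function names and the example durations are made up for illustration and are not taken from the EDD or the scheduler code; the cap at the hard walltime is assumed.

```python
def doubling_estimates(soft_walltime, hard_walltime, count):
    """Successive estimates if the current estimate is doubled each time."""
    estimates = []
    current = soft_walltime
    for _ in range(count):
        # Never let an estimate pass the hard walltime (assumed cap).
        estimates.append(min(current, hard_walltime))
        current *= 2
    return estimates


def additive_estimates(soft_walltime, hard_walltime, count):
    """Successive estimates if the original soft walltime is added each time."""
    return [min(soft_walltime * (i + 1), hard_walltime) for i in range(count)]


# One-hour soft walltime, ten-hour hard walltime, all values in seconds.
print(doubling_estimates(3600, 36000, 4))  # [3600, 7200, 14400, 28800]
print(additive_estimates(3600, 36000, 4))  # [3600, 7200, 10800, 14400]
```

The first two entries are identical, which is the point: if jobs are extended at most once, the two schemes behave the same.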

I’m not so sure it’s most consistent to put it at the end. You can look at the fact that we only have soft at the end or that the vast majority of multi-word keywords read more like a sentence. Resources like min or max walltime or attributes like do_not_span_psets are examples. I agree with Jon that we leave soft at the front.

Yes, I need to update the EDD. STF jobs and soft walltime make no sense together. If you submit a job with a min walltime, you’re saying that this is the minimum amount of time that you need to run to get any real work done. The walltime can be set to the min walltime. You have to set your soft walltime smaller than your hard walltime. This means you’d have to set your soft walltime smaller than the minimum amount of time you need to get any real work done.
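Roughly, the check implied here could look like the sketch below. This is plain Python over a dictionary of requested resources, not the actual server code; the function name and the error strings are hypothetical, and allowing soft_walltime equal to walltime is an assumption.

```python
def check_soft_walltime(resources):
    """Return an error string for an inconsistent request, else None.

    `resources` maps resource names to durations in seconds.
    """
    soft = resources.get("soft_walltime")
    if soft is None:
        return None  # nothing to validate

    # A job with min_walltime is an STF job; soft walltime makes no sense there.
    if "min_walltime" in resources:
        return "soft_walltime cannot be used with STF jobs"  # hypothetical text

    # Assumed rule: soft walltime may not be larger than the hard walltime.
    hard = resources.get("walltime")
    if hard is not None and soft > hard:
        return "soft_walltime cannot exceed walltime"  # hypothetical text

    return None


print(check_soft_walltime({"walltime": 7200, "soft_walltime": 3600}))  # None
print(check_soft_walltime({"min_walltime": 1800, "soft_walltime": 900}))
```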

Bhroam

I think we should try for better consistency with naming going forward. This would put “soft” at the end. It’s better to go with how we want things to work in the future. Yes, we have some wacky naming, but no need to keep that up.

Thanks for your comment, Anne. I agree that we should be consistent, but I don’t believe putting soft at the end is the most consistent thing we can do. I understand that the word soft currently only appears at the end. The vast majority of our multi-word resources and attributes read more like a sentence. I think it’d be more consistent to put soft first. It’d match min_walltime and max_walltime more closely. Following this logic, the limit attributes are the odd men out.

Bhroam

OK I’ve updated the EDD. I’ve replaced Interface 2. The old extension factor is now gone. The new interface 2 is the error message returned when soft_walltime is used with STF jobs.

I’ve removed all mention of the extension factor. Now if a job exceeds its soft walltime, it will be extended by 100% of its soft_walltime up until its hard walltime (if any).
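As a sketch of that rule, assuming "100% of its soft_walltime" means adding the original soft_walltime each time the job runs past the current estimate: the function name and arguments below are illustrative only and are not the scheduler's implementation.

```python
def current_soft_estimate(soft_walltime, elapsed, hard_walltime=None):
    """Estimate the duration of a job that has already run `elapsed` seconds.

    The estimate starts at soft_walltime and grows by soft_walltime each
    time the job exceeds it, never going past hard_walltime if one is set.
    """
    # Number of soft_walltime-sized steps needed to cover the elapsed time
    # (integer ceiling), with a minimum of one step.
    steps = max(1, (elapsed + soft_walltime - 1) // soft_walltime)
    estimate = steps * soft_walltime
    if hard_walltime is not None:
        estimate = min(estimate, hard_walltime)
    return estimate


print(current_soft_estimate(3600, 1800))        # 3600: not exceeded yet
print(current_soft_estimate(3600, 4000))        # 7200: extended once
print(current_soft_estimate(3600, 8000, 9000))  # 9000: capped at hard walltime
```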

I’ve also done a bit of housekeeping. I have italicized all resource names (e.g., soft_walltime). I’ve moved the word hard to the front of the newly italicized walltime to read a bit more smoothly.

Bhroam

Looks good. I sign off. One thing you may consider changing is to define STF before you use it in the details for interface one.

Re: The most recent design change of setting estimated.soft_walltime

Can you provide some background here – this addition came as a last minute surprise. What’s the goal, how will it be used, when is it set, what is it set to before soft_walltime is exceeded, does it get recorded in the accounting log, etc. Thx!

Yes, please do add answers to Bill’s questions. Otherwise I’ll have to ask them again when it’s time to document the work. Thanks.

Bhroam - I would add, given the expression of surprise, could you mention what requirement and user story is being satisfied by this new interface. Does this satisfy some part of what’s in the description for PP-482 that was previously left uncovered?

Hey,
Knowing this is the second update to the feature, I’m not sure I’d call estimated.soft_walltime a last minute surprise. Before I said that there was no way the user could find out the current soft walltime of their job. I’m adding an interface that allows them to see what the current soft walltime is set to. I think this would be useful. If a top job keeps being pushed out by a job that keeps being extended, admins can use estimated.soft_walltime to figure out why. If you think the attribute isn’t useful, please let me know. As for the use case it satisfies, there is only one user story for the entire feature. All the interfaces fall under it.

As for the accounting log, no, the soft walltime is not recorded. It’s really only useful while the job is running. The resources_used.walltime is the information that is interesting to be printed in the accounting log. If someone wants to, they can do the math and figure out what the soft walltime was extended to after the job has run.

Bhroam, in regard to understanding the motivation for estimated.soft_walltime, it would be helpful to have something in the use case to which it could be tied. The closest thing right now is this sentence: “By introducing soft walltime admins will have the opportunity to look at usage statistics and predict times that should be much closer to reality then no walltime or very padded walltimes”, but I don’t think that does it.

I’m not sure how useful estimated.soft_walltime would be. I do think it would be useful to have a counter, stored in the accounting logs, that is incremented each time the soft_walltime is increased. That way the admin can see how many times the predicted soft walltime was extended, which would make further analysis easier when trying to improve the prediction.

I thought of something that hasn’t been discussed yet. What should qstat report? Right now qstat will either print cput or walltime. In the case of soft_walltime, the walltime is not the duration we consider when scheduling. Should we still be printing walltime? Should we print soft_walltime? If we print soft_walltime, then what happens when we exceed the soft_walltime estimate? Should we continue printing an old estimate that is no longer valid? Should we print the most up to date one? If so, that is a reason for estimated.soft_walltime.

I also realized that I was wrong about one thing. Just knowing how many times a job has exceeded its soft_walltime doesn’t necessarily tell you what the final soft_walltime estimate is. If a job’s soft_walltime estimate would ever be extended past its walltime, it is set to the walltime.

I don’t find printing the number of times the soft_walltime has been exceeded all that useful in the accounting logs. The amount of time the job really took (resources_used.walltime) is much more useful than the number of times a job’s soft_walltime was extended. It’s telling you exactly how far off you were. That is in the accounting logs already.

I made a small change to the EDD explaining what will happen if you submit a job with cput. The answer is what you’d expect, the job is killed when it reaches its cput limit.

Bhroam

Just to make sure we are on the same page, you are talking about the “Req’d Time” column in the “alternate” qstat output format, right?

If so, my opinion is that leaving things as they are is correct, and we should not show soft_walltime in this field at all. That is, if CPU time is requested, “Req’d Time” shows CPU time. Otherwise, it shows walltime. If neither was requested, it should continue to show “--” (even if soft_walltime was specified for the job).
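In other words, the selection would stay as in this small sketch. It is illustrative Python only, not qstat's actual code; the function name and the dictionary representation of the job's requested resources are assumptions.

```python
def reqd_time(resource_list):
    """Pick the value for the "Req'd Time" column.

    cput wins if requested, then walltime; soft_walltime is never shown.
    """
    for name in ("cput", "walltime"):
        if name in resource_list:
            return resource_list[name]
    return "--"


print(reqd_time({"cput": "01:00:00", "walltime": "02:00:00"}))  # 01:00:00
print(reqd_time({"walltime": "02:00:00"}))                      # 02:00:00
print(reqd_time({"soft_walltime": "00:30:00"}))                 # --
```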

No job can ever “request” soft_walltime, it can only be assigned to a job by a manager (usually through a hook). Showing a value to users that they cannot request in a column called “requested” would be confusing. It is true that today jobs are often “assigned” a walltime by either a hook or a resources_default.walltime or resources_max.walltime and these do show up in the “Req’d Time” field, but walltime can of course actually be requested by normal job submitters.

Further, normal users will not normally even need to know about soft_walltime since it is really a tool to help the scheduler make better decisions, not something that impacts/limits their jobs (especially after they are running) like hard walltime is.

One question came to mind while re-reading the EDD: can a PBS manager submit a job that explicitly requests soft_walltime, or is it always read-only outside of hooks? The current EDD makes me think the answer is that a manager can explicitly request it, but it would be nice to be sure.

Also, what message would a user expect to see if they attempt to request it? “qsub: Cannot set attribute, read only or insufficient permission Resource_List.soft_walltime”?

Thanks for your response Scott.

You have made a good point about requested time. The user can directly set soft_walltime if the site wants to set it up that way. I have an example queue job hook that allows it to be set. That being said, I agree that it’s not usually going to be set that way and we shouldn’t show it under a field named requested.

You make another interesting point: users don’t really need to know about soft_walltime. It is a tool used by the scheduler and doesn’t really affect the user. Does the user need to see soft_walltime at all? Instead of making it only settable by a manager, should it be readable only by managers as well?

The need or lack of need of estimated.soft_walltime is still in question. I need a way to test soft_walltime in my automated tests. I have two options. First is estimated.soft_walltime. It can be easily queried by the automated test. If we decide to only have soft_walltime be viewable by managers, estimated.soft_walltime would only be viewable by managers as well. The other option is to print a scheduler log message every cycle for every job that has exceeded its soft_walltime. I can bury this at DEBUG3, but it can still be pretty prolific every scheduling cycle.

As for your question about managers submitting jobs with soft_walltime, the answer is no. Every qsub has user perms. A manager can turn around and qalter the job, but they can’t submit a job with soft_walltime.

I made the requested updates to the EDD. I removed estimated.soft_walltime for the time being, but it might have to go back for testing purposes.

Bhroam