PP-506,PP-507: Add support for requesting resources with logical 'or' and conditional operators

arungrover · March 21, 2017, 5:55pm

I just wanted to minimize the overhead of having more than one job from a job_set on the calendar. If the scheduler adds more than one job to the calendar, it will reserve resources for each of them and will likely not run some other job that it otherwise could have. I get your point too, so I will take this out of the document.

I’ll make this change in the document. It would really make things easier if we delete the remaining jobs in a job_set as soon as one of them starts. I’ll mark this part of the change as “Experimental” since it can change based on feedback.

I’ll change the option to -Wjob_set, which seems more readable. Subhasis had a question similar to yours about submitting a job where the job ID given in the “job_set” option isn’t a job_set leader. I forgot to write this up, but I think the PBS server can just reject this job request. Users must know the job_set they are submitting to. What do you think?
While writing this, I realized we need a command that simply lists all the job_sets if we want users to submit to the right job_set. Otherwise, the PBS server should just accept the job and move it under the right job_set (which in your example would be 103).
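To make the two options concrete, here is a minimal Python sketch of how the server might handle a job_set ID that does not point at a leader. Everything in it is hypothetical: `resolve_job_set`, the `leader_of` mapping, the `strict` flag, and job ID 104 are made up for illustration, and only the leader ID 103 comes from the example being discussed.

```python
def resolve_job_set(target_id, leader_of, strict=True):
    """Return the job_set leader that a new job should be attached to.

    leader_of is a hypothetical mapping from every job ID that is part of
    a job_set to the ID of that job_set's leader.
    """
    if target_id not in leader_of:
        raise ValueError(f"{target_id} does not belong to any job_set")
    leader = leader_of[target_id]
    if leader == target_id:
        return leader
    if strict:
        # Option 1: reject, because users must know the job_set leader.
        raise ValueError(f"{target_id} is not a job_set leader")
    # Option 2: accept and move the job under the right job_set.
    return leader


# Job 103 leads a job_set and job 104 (made up) is a member of it.
leader_of = {"103": "103", "104": "103"}
resolved = resolve_job_set("104", leader_of, strict=False)
```

With `strict=False` the job submitted against 104 ends up under 103. With the default `strict=True` the same call raises an error, which is the "just reject" behavior.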

The output of such a command will be one job ID, which will be the ID of the job_set leader. Regarding rejecting the request: internally the server will return an error, but qsub will ignore the rejection and move on to the next resource request. It could also print a message on stderr explaining why a resource request could not be submitted, but that might break backward compatibility. The server, on the other hand, will certainly log the reason for rejecting a job submission.
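A rough Python sketch of that client-side behavior follows. It is not the actual qsub code: `submit_job_set`, `SubmitRejected`, the `fake_server` stub, and the job ID format are all assumptions made for illustration. It also assumes that the first accepted request becomes the leader, which the proposal does not spell out.

```python
class SubmitRejected(Exception):
    """Hypothetical error raised when the server rejects a request."""


def submit_job_set(requests, submit):
    """Submit each resource request in order and return the leader ID.

    submit(request, leader) is a hypothetical callable that returns a
    job ID or raises SubmitRejected. Rejections are collected instead of
    printed, so the command's normal output does not change.
    """
    leader = None
    rejected = []
    for request in requests:
        try:
            job_id = submit(request, leader)
        except SubmitRejected as err:
            rejected.append((request, str(err)))
            continue  # ignore the rejection, move on to the next request
        if leader is None:
            leader = job_id  # assumption: first accepted job is the leader
    return leader, rejected


def fake_server():
    """Stub server that rejects any request naming a 'bogus' resource."""
    counter = [100]

    def submit(request, leader):
        if "bogus" in request:
            raise SubmitRejected("unknown resource in " + request)
        counter[0] += 1
        return str(counter[0]) + ".server"

    return submit


leader, rejected = submit_job_set(
    ["select=1:ncpus=4", "select=1:bogus=1", "select=2:ncpus=2"],
    fake_server(),
)
```

Here the second request is rejected and skipped, the other two are submitted, and only the leader ID would be printed. Whether the collected rejection reasons ever reach stderr is the backward compatibility question raised above.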

I wanted to keep it as nfilter because the name signifies what it is going to filter. If we extend this filter mechanism to replace limits or queues on the server, we can then call that one jfilter. Based on the prefix “n” or “j”, the whole input that is going to be passed to the filter can be easily interpreted.

It isn’t the same as the job_sort_formula syntax. The reason is that the formula works only on the resources requested by the job, so if the formula is something like “ncpus + 2 * mem”, it is safe to assume that the user is talking about resources requested by the job. In this case, we are exposing resources_available and resources_assigned on the nodes to the users. Both of these can contain the same resource name, so we need a way to distinguish them.
Implementation-wise, a filter specified this way can be easily interpreted in Python if we expose two dicts (resources_assigned and resources_available) to it.
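As a minimal sketch of that idea, assuming the filter is written as a plain Python expression over the two dicts: the function name `node_passes_filter`, the sample nodes, and the example expression are hypothetical and not part of the proposal. Using `eval` here is only for illustration, since it is not a safe way to run user-supplied input.

```python
def node_passes_filter(expr, resources_available, resources_assigned):
    """Evaluate a hypothetical node filter expression against one node."""
    namespace = {
        "resources_available": resources_available,
        "resources_assigned": resources_assigned,
    }
    # Builtins are hidden, but eval is still not a sandbox (sketch only).
    return bool(eval(expr, {"__builtins__": {}}, namespace))


# Made-up nodes: name -> (resources_available, resources_assigned)
nodes = {
    "node1": ({"ncpus": 16, "mem": 64}, {"ncpus": 12, "mem": 8}),
    "node2": ({"ncpus": 16, "mem": 64}, {"ncpus": 2, "mem": 8}),
}

# The same resource name appears in both dicts, so the dict name is what
# tells the two values apart.
expr = "resources_available['ncpus'] - resources_assigned['ncpus'] >= 8"

matching = [
    name
    for name, (available, assigned) in nodes.items()
    if node_passes_filter(expr, available, assigned)
]
```

Only node2 has at least 8 unassigned ncpus, so it is the only node that passes. A bare name like “ncpus”, as used in job_sort_formula, would be ambiguous here, which is why the filter syntax has to name the dict.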

Well, my opinion is: why take a different direction in accounting too? We can probably log job_set information if a job is part of a job_set, but other than that it will just look like a bunch of jobs were submitted, one of them ran, and the others were deleted. This could happen in any normal day-to-day accounting log too.
Exposing job_set information in the accounting record will give post-processing tools a way to correlate these records and make sense of them.
What do you think?
