PP-506,PP-507: Add support for requesting resources with logical 'or' and conditional operators

I just wanted to minimize the overhead of having more than one job from a job_set on the calendar. If the scheduler adds more than one job, it will reserve resources for each of them and will likely not run some other job that it could have run. I get your point too, so I will take this out of the document.

I’ll make this change in the document. It would really make things easy if we delete jobs as soon as one starts. I’ll keep this part of the change as “Experimental” since it can change based on feedback.

I’ll change the option to -Wjob_set, which seems more readable. Subhasis had a similar question about submitting a job where the job-id given in the “job_set” option isn’t a job_set leader. I forgot to write this up, but I think the PBS server can just reject this job request. Users must know the job_set they are submitting to. What do you think?
While writing this, I realized that we need a command to list all the job_sets if we want users to submit to the right job_set. Otherwise, the PBS server should just accept the request and move the job under the right job_set (which in your example would be 103).

The output of such a command will be one job id, which will be the ID of the job_set leader. Regarding rejecting the request: internally the server will throw an error, but qsub will ignore the rejection and move on to the next resource request. It could also print a message on stderr about why a resource request could not be submitted, but that might break backward compatibility. The server, on the other hand, will always log the reason for rejecting a job submission.
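To make the "ignore the reject and move on" behavior concrete, here is a minimal sketch. It is not PBS code: `RequestRejected`, `submit_job_set`, and the fake server are hypothetical names used only for illustration, and the resource strings are made-up examples.

```python
class RequestRejected(Exception):
    """Hypothetical error raised when the server rejects one resource request."""


def submit_job_set(requests, submit):
    """Submit each resource request in order, skipping rejected ones.

    `submit(request, leader)` is a hypothetical callable that returns a job id
    or raises RequestRejected. The first accepted job becomes the set leader.
    """
    leader = None
    accepted = []
    rejected = []
    for request in requests:
        try:
            job_id = submit(request, leader)
        except RequestRejected as err:
            # Remember why it failed, then move on to the next request.
            rejected.append((request, str(err)))
            continue
        if leader is None:
            leader = job_id
        accepted.append(job_id)
    return leader, accepted, rejected


def make_fake_server(start_id=103):
    """Stand-in for the server: rejects any request asking for ncpus=0."""
    next_id = [start_id]

    def submit(request, leader):
        if "ncpus=0" in request:
            raise RequestRejected("ncpus must be positive")
        job_id = str(next_id[0])
        next_id[0] += 1
        return job_id

    return submit


leader, accepted, rejected = submit_job_set(
    ["select=1:ncpus=8", "select=1:ncpus=0", "select=2:ncpus=4"],
    make_fake_server(),
)
print(leader, accepted, rejected)
```

Whether the collected rejection reasons get printed on stderr or only logged by the server is exactly the backward-compatibility question raised above; the sketch just keeps them in a list.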

I wanted to keep it as nfilter because the name signifies what it is going to filter. If we extend this filter mechanism to replace limits or queues on the server, we can then call it jfilter. Based on the prefix “n” or “j”, the whole input that is going to be passed to the filter can be easily interpreted.

It isn’t the same as the job_sort_formula syntax. The reason is that the formula only works on the resources requested by the job, so if the formula is “ncpus + 2 * mem” it is safe to assume that the user is talking about resources requested by the job. In this case, we are exposing both resources_available and resources_assigned on the nodes to the users. Both of these can contain the same resource name, so we need a way to distinguish them.
Implementation-wise, a filter specified this way can be easily interpreted in Python if we expose two dicts (resources_assigned and resources_available) to it.
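As a rough sketch of that idea (not the actual design), a filter string could be evaluated against the two dicts like this. The function name `node_matches` and the sample node values are made up for illustration, and plain `eval` is not a real sandbox, so a real implementation would need proper validation of the expression.

```python
def node_matches(filter_expr, resources_available, resources_assigned):
    """Return True if a node passes the filter expression.

    Only the two dicts are exposed to the expression. Builtins are removed,
    but this is NOT a security boundary, just an illustration.
    """
    namespace = {
        "resources_available": resources_available,
        "resources_assigned": resources_assigned,
    }
    return bool(eval(filter_expr, {"__builtins__": {}}, namespace))


# Hypothetical node: 16 cpus of which 4 are in use, 64 (gb) memory with 16 in use.
node = {
    "resources_available": {"ncpus": 16, "mem": 64},
    "resources_assigned": {"ncpus": 4, "mem": 16},
}

# Both dicts carry "ncpus", so the expression has to name the dict explicitly.
expr = "resources_available['ncpus'] - resources_assigned['ncpus'] >= 8"
print(node_matches(expr, node["resources_available"], node["resources_assigned"]))
```

Because the expression is ordinary Python, nested conditions with `and`, `or`, and parentheses work without any extra parsing code.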

Well, my opinion is: why take a different direction in accounting too? We can probably log job_set information if a job is part of a job_set, but other than that it will just look like a bunch of jobs were submitted, one of them ran, and the others were deleted. This could happen in any normal day-to-day accounting log too.
Exposing job_set information in the accounting record will give post-processing tools a way to correlate things and make sense of them.
What do you think?

Bill had a similar question too. I should have added something about this to the document. I think rejecting the job would be the right thing for PBS to do. But if we do so, there should be a way for users to list all the job_sets too.
What do you think?

I wasn’t planning on supporting complex nested expressions. But the expression is going to be interpreted by the Python interpreter itself, and I don’t think that has a limitation on complex expressions. So yes, as long as the expression can be interpreted by a Python interpreter, it can be a complex expression too.

I didn’t think about using an already existing state. If I borrow a state that another feature has been using for years, then I would have to worry a lot about breaking backward compatibility and maintaining the semantics of what that state/substate means. Creating another substate is a lot of work, but it gives us the flexibility to do something new without having to worry about breaking backward compatibility.

@billnitzberg, @subhasisb Thanks for your valuable comments, I’ll wait for a day for others to review the document before making changes.

Bill had a similar question too. I should have added something about this to the document. I think rejecting the job would be the right thing for PBS to do. But if we do so, there should be a way for users to list all the job_sets too.
What do you think?

I think rejecting would be troublesome for users. As you said, they would then need a list to know the job sets. Instead we can make it transparent, i.e., you can give any jobid as the job-set id and PBS will silently figure things out. The only case we should reject is when something in the job-set is already running.

I’m okay with going in that direction too. It’s just that, since the user was already under a misconception about the job_set leader, we shouldn’t end up with them thinking that this is a bug in PBS :slight_smile:

Well, in that mode, the job-set-leader id would not be very important - basically you just associate with any job-id that is part of a job-set already and PBS figures it out …the user will not have a misconception that way.
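A minimal sketch of this "associate with any job-id in the set" mode, including the one rejection case mentioned earlier (something in the set is already running). The names `JobSetError`, `resolve_job_set`, `leader_of`, and `state_of` are hypothetical, as are the job ids other than the 103 leader example.

```python
class JobSetError(Exception):
    """Hypothetical error for a job-set request the server would reject."""


def resolve_job_set(job_id, leader_of, state_of):
    """Map any member job id to the id of its job-set leader.

    leader_of: dict of job id -> leader id for every job that is in a set.
    state_of:  dict of job id -> single-letter job state ("Q", "R", ...).
    """
    if job_id not in leader_of:
        raise JobSetError(f"job {job_id} is not part of any job set")
    leader = leader_of[job_id]
    members = [j for j, l in leader_of.items() if l == leader]
    # Reject only when some job of the set is already running.
    if any(state_of[j] == "R" for j in members):
        raise JobSetError(f"job set {leader} already has a running job")
    return leader


leader_of = {"103": "103", "104": "103", "105": "103"}
state_of = {"103": "Q", "104": "Q", "105": "Q"}
print(resolve_job_set("105", leader_of, state_of))  # the user named 105, PBS uses 103
```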

I’m not sure this is the direction we want to go in. This would be different from job arrays. When you requeue a job array, you requeue all the jobs in the array (including all in state X). If we are considering using job sets to replace job arrays in the future, this would make the two designs incompatible.

If we keep all the jobs like the design currently states, I think we should accept jobs to the set after the set is running. It’ll likely be deleted, but if the job set is requeued, it would become a viable job.

There is something else to think about when adding jobs in a jobset to the calendar. If we add more than one, we make our calendar less accurate. We know only one of the jobs in the jobset will be run. By adding them all to the calendar, we take up space and push other calendared jobs out later in time. There is no really good answer to this issue. Just choosing the first one is probably the best answer. It’s the most deserving job of the set, so saving resources to get it to start running is good. We still can’t be sure it is the one that will eventually run, but I think this is a better answer than adding them all.
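The "just choose the first one" rule can be sketched as follows. This is only an illustration of the selection step, not scheduler code; `calendar_candidates` and the `(job_id, job_set)` tuples are made-up names and shapes.

```python
def calendar_candidates(sorted_jobs):
    """Pick the jobs to add to the calendar.

    sorted_jobs: list of (job_id, job_set) tuples, already in priority order.
    job_set is None for a job that is not part of any job set.
    Only the first (most deserving) job of each job set is kept.
    """
    seen_sets = set()
    candidates = []
    for job_id, job_set in sorted_jobs:
        if job_set is not None:
            if job_set in seen_sets:
                continue  # a higher-priority job of this set is already calendared
            seen_sets.add(job_set)
        candidates.append(job_id)
    return candidates


jobs = [("1", None), ("2", "A"), ("3", "A"), ("4", "B"), ("5", None)]
print(calendar_candidates(jobs))
```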

If you reject a request because it names a job that is not the job set leader, I’d make it clear what you are doing. Say that the request is invalid because the job is part of a jobset.

If we accept it and do the right thing, then we’re basically giving a job set many names (every job in the set). We’d have to do this for all the commands as well. If the user submitted to a job that is part of a set and we added the new job to that set, the user would be confused if they couldn’t act upon the job set the same way in the future.

One more thing to think about: do we want to consider allowing a job to be in multiple job sets in the future? If we do, I think we want to reject the request now.

One quick note: qselect now uses a long option (--job_set). getopt_long() is not supported on Windows. Actually, getopt() isn’t supported on Windows either. We have a version of it in Libwin. If you want this long option, you’ll have to get a copy of getopt_long() for Libwin as well.

Bhroam

One thing to consider in this design as well is that we already have a job set (the job array) with the run criteria of run all. We now want to add a job set with the run_criteria of run one. There is also a third kind of job set to consider, for genomes or code breaking, with a run_criteria of run all until one succeeds. I think if we add job sets then we need to be flexible enough to run these and more in the future. Maybe we add a new attribute called run_criteria and set it by default to run all for job arrays and run one for job request sets.
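To show how the three criteria differ, here is a small sketch. Only the attribute name run_criteria comes from the suggestion above; the enum values, `set_is_finished`, and the outcome encoding are assumptions made for illustration.

```python
from enum import Enum


class RunCriteria(Enum):
    RUN_ALL = "run_all"                      # job arrays today
    RUN_ONE = "run_one"                      # job request sets
    RUN_UNTIL_SUCCESS = "run_until_success"  # run all until one succeeds


def set_is_finished(criteria, outcomes):
    """Decide whether a job set needs no further jobs to run.

    outcomes: one entry per job in the set:
      None  -> has not run yet
      True  -> ran and succeeded
      False -> ran and failed
    """
    ran = [o for o in outcomes if o is not None]
    if criteria is RunCriteria.RUN_ALL:
        return len(ran) == len(outcomes)
    if criteria is RunCriteria.RUN_ONE:
        return len(ran) >= 1
    if criteria is RunCriteria.RUN_UNTIL_SUCCESS:
        # Done on the first success, or when every job has been tried.
        return any(ran) or len(ran) == len(outcomes)
    raise ValueError(f"unknown run criteria: {criteria!r}")


outcomes = [False, None, None]  # first job ran and failed, two still queued
for criteria in RunCriteria:
    print(criteria.value, set_is_finished(criteria, outcomes))
```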

Also, in talking with Arun I realized that one requirement was not clear. From the user perspective, they will only see one job id for the job they submitted and will be able to delete the job request set in a single command. Now, I don’t have a requirement to allow them to change one of the job requests in the request set, but if the team feels strongly that we should provide one using qalter then OK. However, since most users don’t use qalter, why not make it so that you have to change the whole set of resource requests if you want to change one of them using qalter?

@jon Can you please elaborate a little on the use case of users seeing only one job id when they submit a job, and deleting/altering the whole set in a single command?

Would it be sufficient if, when a user submits a job with multiple resource requests, the output of the job submission is just the job-id of the first job submitted, but when the user runs qstat, it shows all the other jobs like any other normal job?

Would it be okay if users are allowed to perform an operation like qdel on a job-set by using the command in conjunction with ‘qselect --Wjob-set=’? This way the user will be able to delete all the jobs in one command.
All other commands could then also act on a job-set in one single command, like they do on any other list of job-ids.

In the interest of time, Bill’s proposal is to separate the design proposal for specifying multiple resource requests from the one for supporting conditional operators.

I’m going to separate them out and take the design proposal for PP-507 (nfilters) in a different document. Please let me know if you think that we shouldn’t be separating the design proposals.

Thanks!

From the user perspective,

  • I submit a single job from the command line or CM
  • the admin modifies the job in a hook or submission script to have additional resource requests.
  • I get a single job id back
  • it doesn’t start because another job id in the job set that I knew nothing about runs.
  • I delete the job because I am confused
  • I look in the submission directory and I see output files.

Having multiple job ids for a single job request (not resource request) is a bad idea from a user perspective. Asking admins to train their CLI user base to use qselect is also a lot to ask, not to mention the additional documentation needed to inform users about how everything changes if they want to work with job sets.

I’m pretty much in agreement with Jon here - the creation of loads of phantom job ids is going to be confusing to both end users and administrators.

And I’m still concerned about the impact on server performance. Yes, I know we’re planning on some significant improvements in this area but those aren’t scheduled until well after boolean resources are scheduled to be delivered (and, since they haven’t been implemented yet, we’re not sure how significant an improvement they’re going to produce) - so we’re basically talking about wrecking server performance for at least a year or so after the feature is implemented.

That seems like a bad idea.

Thanks Jon for your inputs. I’m trying to understand the use case here -

Does the user expect the job-id he/she received from submission to be in the running state? Would it help if the user has some way of knowing that one of the resource requests ran as a separate job? What if the job-id received after submission has a comment that says “Job held, job running from this job set”?

How about passing a special parameter “--job-set” with commands like qstat, qdel, qsig, qhold etc. and then giving any job-id which is part of a job-set? This would result in the action being taken on the whole job-set (not only on the specified job-id).

There is an additional functionality requirement embedded in @jon’s example that I don’t think has been considered (or mentioned) to date:

  • the ability for a submission hook to add resource alternatives to an existing request

How important is that functionality? Could someone describe the use case (not the implementation, but the use case from the user/admin point of view without referencing PBS Pro, the goal they are trying to achieve by having this functionality)?


Separately, the UI can provide a single job id for the collection while also allowing multiple ids, one for each resource request. For example, imagine if PBS Pro provided individual job ids for every subjob of an array job: the user could still use a single job id, e.g., 124[] or even 124, to refer to the collection.

https://www.rubegoldberg.com/artwork/how-to-get-rid-of-a-mouse-2/?c=45

There has been discussion about how a hook parses the resource request. The extension of that is how the hook writer alters, appends to, or rejects that resource request. An application of this: if I am doing allocation management, how do I reserve allocation for a job that is part of the “resource request set”, and how will a submission hook know about it if the server hasn’t seen the job?

As for the use case, I have a heterogeneous cluster and I want all jobs to use whole nodes. I know what my cluster has, so I change the initial job resource request to a set of acceptable “resource requests” to ensure only whole nodes will be allowed. The use cases are the same as discussed above, just using a hook. I see no difference between doing this from a hook and doing it from a submission web portal.

Agreed. The user should only see one job id and be able to operate on a single job id. If we use the job array syntax then we will need a way to distinguish between job arrays and resource requests for qstat (i.e. qstat -t should only show job sets and not “the set of resource requests”). As for qalter, I still think that it should be required to change all requests if a user/admin is not happy with an individual resource request.

Yes. Submission portals also expect that the job id returned is the one that runs and exits before they continue to the next step, whatever that may be.

It may, but as a user I would not like to see “Job held, job running from this job set”. My first response is: what is a job set, if I did not request one? The second is that I would call the admin and ask what this means. And then, when the stdout and stderr files came back with different job ids, which one do I look at? It just adds more confusion. Also, if the application was specifically looking for the stdout/stderr files to see that the job has continued, now what does it do? Do we cause sites to rewrite those as well with this individual job id per resource request?

I don’t think a special parameter “--job-set” is the right way to go. We don’t require this anywhere else (except when you submit a job dependency), so introducing it for this seems like the wrong thing to do from the user perspective.