I’m opening this forum discussion about a design change proposed here.
The change introduces an option that lets users specify the maximum number of array subjobs that may run concurrently at any given time.
I’d explicitly say the %num comes at the end. You wouldn’t want someone to get confused and think they could do -J1-4%2,10-20%4.
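To make the intended restriction concrete, here is a small illustrative parser (not PBS source; the name parse_array_spec is invented for this sketch) that accepts %num exactly once, at the very end of the range, and rejects per-range limits:

```python
import re

def parse_array_spec(spec):
    """Parse a hypothetical '-J start-end[%num]' array spec.

    Illustrative sketch only: %num is accepted exactly once, at the
    very end, so a spec like '1-4%2,10-20%4' is rejected outright.
    """
    m = re.fullmatch(r"(\d+)-(\d+)(?:%(\d+))?", spec)
    if m is None:
        raise ValueError("bad array spec: " + spec)
    start, end = int(m.group(1)), int(m.group(2))
    limit = int(m.group(3)) if m.group(3) else None
    return start, end, limit
```

With this shape, the confusing comma-separated form simply fails to parse rather than silently doing something surprising.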
Consider not adding an error message for the case where both %num and -Wmax_run_subjobs are given. Just say which one takes precedence.
I don’t think you can modify this in a runjob hook, since the runjob hook will be run on the subjob.
Instead of setting a default equal to the total number of subjobs, just say that if this attribute is not set, the behavior is backwards compatible with no limit. That way, if someone wants to remove the limit with qalter, they don’t have to count up the total number of subjobs; they can just unset the attribute.
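As a sketch of this suggestion (the helper name and signature are hypothetical, not scheduler source), treating an unset attribute as “no limit” might look like:

```python
def runnable_subjob_slots(total_subjobs, running, max_run_subjobs=None):
    """How many more subjobs may start right now.

    When max_run_subjobs is unset (None), the behavior is backwards
    compatible with no limit, so nobody has to set the limit to the
    total subjob count just to mean 'unlimited'.
    """
    if max_run_subjobs is None:
        return total_subjobs - running
    return max(0, min(max_run_subjobs, total_subjobs) - running)
```

Unsetting the attribute then falls straight through to the unlimited path, with no counting required.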
Just to mention a possible hiccup: I know that a qsub hook happens early in the job’s life. The max_run_subjobs attribute might not be set by that time. If you set it in a qsub hook, it might clash (especially if you keep it as an error).
There is nothing you can really do about this, but if the max_run_subjobs attribute changes after the job is submitted, the Submit_Arguments will still have -J %num in it where %num is wrong. This isn’t anything new though.
Is it possible to specify the -J option in a comma-separated way like you described? I tried it, but it didn’t work for me. I wanted to force users to do the right thing and not specify the attribute value in two different ways, and that is why I didn’t implement precedence. There really is no reason for users to specify it in two different ways.
You are right, I’ll make this change.
We cannot really unset a job attribute, right? We can alter it, but I’ve not seen any interface to unset it. Am I missing something?
So I think queuejob hooks run before the job is enqueued, but the hook still has the job object with the user’s data in it. Otherwise, how would admins do validity checks on the job and accept, reject, or change it?
You are right, this problem will be there no matter what I do.
Thanks @arungrover! To add a very important use case to what you have listed in the design from the admin’s perspective:
A user may have a job array in which all subjobs interface with a single instance of a shared data file, and as more and more subjobs run simultaneously, the job’s performance sharply degrades. Further, different applications (or different runs of the same application) may have different impacts on the shared resource, so we have had requests for the ability to limit this number per job array at the user’s request.
Your current design covers this, I just want to make sure it is explicitly stated as a use case.
Thanks @scc, I’ll mention this as a use case in the document. @scc, I have a question about suspended subjobs: I think suspended subjobs should also be counted as running by the PBS scheduler, but I’d like your opinion on it from a use-case perspective.
@bhroam and I had an offline chat and Bhroam said that job attributes can be unset using qalter -W<attr_name>=""
With this in mind, I think Bhroam’s comment is valid, and I’ll make the change to say that if “% or max_run_subjobs” is not provided, the behavior will remain the same as it is today.
There seems to be a complexity with preemption. I was not planning on attempting preemption for an array that couldn’t run because of this limit, since the only jobs it could possibly preempt are its own subjobs, which are running at the same preemption priority level.
But if a user does a qrun of a subjob, then I’m not sure what to do. We usually ignore all limits and go ahead with preemption in those cases. From a use-case perspective, should we allow that?
What about eligible time? The general rule of thumb is that if you are getting in your own way, you don’t accrue eligible time. That is usually because of a system-wide limit. Here you are indeed getting in your own way, but I look at this feature as a voluntary limit on how many of your subjobs can run at one time. It doesn’t seem right for you to lose your place in line because you are doing this.
For normal preemption, we do not ignore limits. For qrun we do. For normal preemption, you won’t be able to preempt anything to run another subjob. All your subjobs will have the same preemption priority, so you can’t find anything to preempt. For a qrun preemption you have higher priority. Of course we ignore max_run limits with qrun, so we probably should do the same here. If an admin is telling us to run this job, we should bypass the limit and run it.
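The preemption reasoning above can be sketched as follows (the names are invented for illustration; this is not scheduler source). Under normal scheduling, every preemption candidate is a sibling subjob at the same priority, so nothing is preemptable; a qrun carries admin intent and bypasses the limit:

```python
def can_preempt_for(blocked_prio, candidate_prios, is_qrun=False):
    """Decide whether a subjob blocked by its own array limit can
    preempt anything to run.

    Normal preemption only targets lower-priority work, and the only
    candidates here are sibling subjobs at the same priority, so the
    answer is no. A qrun is an admin override and bypasses the limit.
    """
    if is_qrun:
        return True  # admin said run it: ignore the limit, preempt if needed
    return any(p < blocked_prio for p in candidate_prios)
```

This is why the scheduler never needs a special preemption path for this limit in the normal case: the check simply never finds a victim.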
Agree on the limits/qrun discussion, thanks @bhroam.
I think the array should accrue eligible time if subjobs are not running due to this new limit. In the most common use case I believe these limits will be self applied, and I don’t believe the user should be punished for applying them by not accruing eligible time.
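As a rough sketch of that policy (the limit names here are illustrative, not necessarily PBS’s internal names), only system-wide limits would stop eligible-time accrual, while the self-applied per-array limit would not:

```python
def accrues_eligible_time(blocked_by):
    """Should a waiting job array keep accruing eligible time?

    Sketch of the policy discussed above: a voluntary, self-imposed
    limit such as max_run_subjobs should not cost the user their place
    in line, unlike a system-wide run limit.
    """
    voluntary = {"max_run_subjobs"}       # self-applied per-array limit
    system_wide = {"max_run", "max_run_res"}  # illustrative limit names
    if blocked_by in voluntary:
        return True
    return blocked_by not in system_wide
```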
Thanks for catching that @scc. I’ve updated the document.
I’ve also changed the message the scheduler logs when it hits the limit, and added a new error that is printed when this new attribute is used with non-array jobs.