Understanding Array Jobs

Hi All,
I am working on PP-315 and have a few questions about job arrays. Before working
on the fix for this bug, I need a better understanding of the current behavior and of what the best approach would be.

Q1. Why are deleted subjobs and finished subjobs both treated as expired? Shouldn’t they be different?

Q2. Why can’t a job array be moved to a different queue when it is in the “begun” state and all of its subjobs are either expired or queued, i.e. none of the subjobs are running?

Q3. As far as I understand, there is no restriction on the order in which subjobs are selected or on the nodes/vnodes on which they run. So I don’t understand why moving a job array shouldn’t be valid when none of the subjobs are running (that is, when all subjobs are either queued or expired).

Q4. When a subjob goes into execution, the parent job state changes to ‘B’. If for some reason the running subjobs get requeued, the parent job still remains in the ‘B’ state even though all the subjobs are in the ‘Q’ (queued) state. Is this behavior correct?

Also, if I put a hold on the array job parent, it goes from ‘B’ to the ‘H’ state, and when I release the parent job it goes to ‘Q’ instead of ‘B’. If the above behavior is correct, shouldn’t the parent job move back to the ‘B’ state?

Hey Dilip,

You make a good point here. They probably should be different. A different question is what should happen if you delete a subjob and then requeue the job array. Should the subjob come back? That is what happens today (since it’s treated as expired). Another idea is to treat it as a normal job. In that case the job is deleted and gone. Another question is what separating the expired subjobs from the deleted ones buys us. While I think they are distinct states, if it buys us nothing, I don’t see a reason to do it.

You make a good point. My only worry is it might be confusing if a job array in state ‘B’ can sometimes move and sometimes not. I think it’d be wiser if we made it one way or the other and didn’t have this different ‘in state B but no subjobs running’ state. I’d leave it alone.

I’m not sure I see the conclusion you made. Why would the order in which the subjobs are run or where they are run have anything to do with moving the job array?

If you requeue the whole array, then it will return to the Q state. The question here is if you requeue all subjobs is that equal to requeuing the entire array. I’d say yes.

This sounds like a bug to me.

Thanks for coming up with these thought provoking questions!

Bhroam

Given the current design, it does make sense to give deleted and completed subjobs the same state. Either way, they are “done”. When we requeue a job array, we probably mean to run everything in the job array once more (since it’s probably a coordinated parameterized job). Design-wise, job arrays were built to scale wider, so states are computed with simple counts instead of iterating over every subjob. The question is, do subjobs need to behave the same way as normal jobs? If so, why would a customer not simply submit a number of normal jobs in a loop?

The whole idea of subjobs (I suspect) was to support large parameterized jobs. So they all need to run on the same set of nodes, where they expect to find the data to work on (based on the index of the subjob). If some of the subjobs can move to another host, or just to another queue where they could land on a set of nodes on which the shared data is not available, then that won’t do the parameterized job much good.

Hi Bhroam

Is that correct behavior? If a subjob is deleted, the user intended to remove it, so it shouldn’t be revived on requeue.

As of now, the state of a parent array job is calculated from the count of subjobs in the queued state: if that count equals the total number of subjobs, the parent job is set to “Q”, otherwise “B”. Hence even deleting a subjob can cause an array job to move from “Q” to “B”, which is the bug I am working on in PP-315. I thought of also counting the deleted subjobs, but as of now that cannot be done, since deleted and completed subjobs share the same state. Also, not all subjob states are saved in the database, so restarting the server resets the subjobs to either expired or queued.
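
To make the rule above concrete, here is a minimal Python sketch of a count-based parent state and how deleting a subjob flips it. This is only an illustration of the rule as described in this post, not the server code; the function name `parent_state` is made up, and ‘X’ is used here as the letter for the shared expired state.

```python
from collections import Counter

def parent_state(subjob_states):
    """Hypothetical count-based rule: 'Q' only if every subjob is queued."""
    counts = Counter(subjob_states)
    total = len(subjob_states)
    return "Q" if counts["Q"] == total else "B"

# Three subjobs, none started yet.
states = ["Q", "Q", "Q"]
before = parent_state(states)

# Deleting subjob 1 marks it expired ('X'), the same state a finished
# subjob gets, so the queued count no longer matches the total.
states[1] = "X"
after = parent_state(states)

print(before, after)
```

Nothing has run in this example, yet the parent ends up in ‘B’ simply because one subjob was deleted.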

If subjobs were supposed to run on the same set of nodes, then moving the array to a different queue could affect where they run. Hence moving an array job wouldn’t be correct.

What is the bug here: that the array job was in “B” when all the subjobs were queued, or that it
went to “Q” when the hold was released?

Hey Dilip/Subhasis,

Subhasis’s question is a good one. I think it boils down to whether the job array as a whole should be analogous to a single job (with many parts) or whether a subjob should be analogous to a normal job. This goes to Dilip’s question as well. If we treat a job array as a normal job with many parts, requeuing it should bring back all deleted subjobs. The user wants to restart their job from the beginning. If we treat a single subjob as a normal job, then the subjobs shouldn’t come back.

My opinion is to treat the whole job array like a normal job with many parts.

OK I understand now. Having the deleted state would keep a job array in the ‘Q’ state until a subjob has started. I guess it would also show up in qstat. It’d be good to show the user which subjobs were deleted and which subjobs have completed.
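
A rough sketch of how a separate deleted state could feed into the same count-based rule. The ‘D’ letter and the function name are purely hypothetical, chosen for illustration; nothing here reflects an existing PBS state or API.

```python
from collections import Counter

def parent_state_with_deleted(subjob_states):
    """Hypothetical rule: the parent stays 'Q' until some subjob has started.

    'Q' = queued, 'D' = deleted (assumed new state), anything else means
    a subjob has started at some point (running, expired, ...).
    """
    counts = Counter(subjob_states)
    not_started = counts["Q"] + counts["D"]
    return "Q" if not_started == len(subjob_states) else "B"

only_deleted = parent_state_with_deleted(["Q", "D", "Q"])   # nothing has run
one_running = parent_state_with_deleted(["Q", "D", "R"])    # a subjob started
one_finished = parent_state_with_deleted(["Q", "D", "X"])   # a subjob completed

print(only_deleted, one_running, one_finished)
```

The sketch still works from simple counts, so it would not need to iterate over subjobs more than the current approach does. What an array with every subjob deleted should report is left open here.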

I see what you are getting at. From our definition, subjobs don’t need to run on the same set of nodes. It might help with a parameter study, but we don’t guarantee it. I don’t see that as a clear reason to stop a job array from moving once it is in the ‘B’ state. What does convince me to not allow the array to move is queue defaults. One of the definitions of a job array is that all subjobs are identical. If we run some subjobs in one queue and then the job array moves to another queue, the queue defaults might change. This would mean the running/expired subjobs will be different from the current queued subjobs.
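
To illustrate the queue-defaults concern, here is a small Python sketch. The queue names, the `walltime` values, and the merge helper are all assumptions made up for the example; it only models the idea that a queue fills in resources the job did not request.

```python
def effective_resources(requested, queue_defaults):
    """Hypothetical merge: queue defaults apply only where the job asked for nothing."""
    merged = dict(queue_defaults)
    merged.update(requested)
    return merged

# The array requested only ncpus; walltime comes from whichever queue it is in.
requested = {"ncpus": 4}
workq_defaults = {"walltime": "01:00:00"}   # assumed defaults of the original queue
longq_defaults = {"walltime": "24:00:00"}   # assumed defaults of the destination queue

already_ran = effective_resources(requested, workq_defaults)
still_queued = effective_resources(requested, longq_defaults)

identical = already_ran == still_queued
print(already_ran, still_queued, identical)
```

Subjobs that ran before the move and subjobs that run after it end up with different effective resources, which breaks the "all subjobs are identical" property.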

Bhroam

@subhasisb, rebooting an old discussion. The statement and the current design presume that the user intuitively expects a deleted subjob to come back on rerun.
How well does that assumption hold in the field? My intuition was that a deleted subjob would not come back.

The documented purpose of job arrays is for things like parametric sweeps, etc. From Section 9.1 in the User Guide:

PBS provides job arrays, which are useful for collections of almost-identical jobs. Each job in a job array is called a “subjob”. Subjobs are scheduled and treated just like normal jobs, with the exceptions noted in this chapter. You can group closely related work into a set so that you can submit, query, modify, and display the set as a unit. Job arrays are useful where you want to run the same program over and over on different input files. PBS can process a job array more efficiently than it can the same number of individual normal jobs. Job arrays are suited for SIMD operations, for example, parameter sweep applications, rendering in media and entertainment, EDA simulations, and forex (historical data).

Job arrays are treated as a single unit. Changing the behavior of subjobs when the array job is requeued/rerun is beyond the scope of this bug fix.