Exceeded resources notification

Hi,

As discussed here, I am providing a new design doc. The purpose is to inform the user about exceeded resources, both by email and by setting the job comment.

In the design document, I suggest adding a new exit code for each resource that can be exceeded. This could be generalized so that we use only one exit code for a general “exceeded resource” case, but then both the email and the comment would be general too. I would prefer to be more precise.
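To illustrate the per-resource variant, here is a minimal sketch of how one exit code per exceeded resource could drive a precise job comment and email text. The enum names, the numeric values, and the helper `exceeded_comment` are hypothetical; they are not taken from the design document or the PBS sources.

```python
from enum import IntEnum
from typing import Optional


class ExceededExit(IntEnum):
    """Hypothetical exit codes, one per exceeded resource (values are made up)."""
    WALLTIME = -101
    CPUT = -102
    MEM = -103
    NCPUS = -104


# Resource name used in both the job comment and the email body.
RESOURCE_NAMES = {
    ExceededExit.WALLTIME: "walltime",
    ExceededExit.CPUT: "cput",
    ExceededExit.MEM: "mem",
    ExceededExit.NCPUS: "ncpus",
}


def exceeded_comment(exit_code: int) -> Optional[str]:
    """Return a precise comment for a per-resource exit code, or None otherwise."""
    try:
        resource = RESOURCE_NAMES[ExceededExit(exit_code)]
    except ValueError:
        # Not one of the "exceeded resource" exit codes.
        return None
    return f"job killed: {resource} exceeded limit"
```

With a single general exit code, the same function could only return one fixed message, which is the loss of precision mentioned above.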

Please share your thoughts. Thank you,
Vaclav

Thanks @vchlum, I like it overall. However, I think you do not need to add new job substates. We could simply use one of the existing ones, JOB_SUBSTATE_TERMINATED or JOB_SUBSTATE_FAILED.

An existing substate along with your new exit codes should be enough to tell anybody what happened, so why put the same information in the substate as well? PBS server code often handles the various substates in very specific ways, so that code would need very careful changes if we added to the substates list. It might be best to avoid that anyway.

You are right, @subhasisb. We don’t need the new substates. I thought we could use them to update the job comment and also to keep the information. It is true that the exit code is sufficient anyway. I removed the substates from the design document.

I think JOB_SUBSTATE_FAILED is suitable for this purpose.