PP-662, PP-663: UCR and External Interface document for Reservation enhancements

Hi Ravi,

Just wondering if this information should be part of the server logs or the accounting logs. I am inclined towards the server logs, as this information (so far) doesn’t change with every instance. Thinking about it, we could add it only to the ‘U’ record, but as the recurrence rule contains ‘=’ and ‘;’ in its value, PP-464 would need to be resolved ahead of this RFE.

Thanks,
Prakash
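
To illustrate the parsing concern above, here is a minimal sketch. It assumes an accounting line laid out as `timestamp;record_type;id;message`, with the message made of space-separated `key=value` pairs. The field names `rrule` and `tz`, the reservation id, and the sample values are hypothetical and not taken from the EDD, and whatever approach PP-464 ends up taking may differ.

```python
# Hypothetical accounting line: field names "rrule" and "tz" and all values
# are illustrative assumptions, not the format defined in the EDD.
line = ("04/01/2016 10:00:00;U;R123.server;"
        "requestor=user@host rrule=FREQ=WEEKLY;COUNT=5 tz=America/Los_Angeles")


def naive_parse(record):
    """Split on every ';' and every '=', as a simple parser might."""
    parts = record.split(";")
    header, message_chunks = parts[:3], parts[3:]
    fields = {}
    for chunk in message_chunks:
        for token in chunk.split():
            pieces = token.split("=")
            # Anything after the second '=' is silently dropped.
            fields[pieces[0]] = pieces[1]
    return header, fields


def bounded_parse(record):
    """Split the header with a bounded split and each token on its first '=' only."""
    timestamp, rtype, ident, message = record.split(";", 3)
    fields = {}
    for token in message.split():
        key, _, value = token.partition("=")
        fields[key] = value
    return (timestamp, rtype, ident), fields


_, naive_fields = naive_parse(line)
_, bounded_fields = bounded_parse(line)
print(naive_fields["rrule"])    # truncated value, plus a stray "COUNT" key
print(bounded_fields["rrule"])  # full recurrence rule kept in one field
```

The bounded version only works here because the sample value contains no spaces, so it is not a general fix, just a way to see why ‘=’ and ‘;’ inside a value are a problem for existing parsers.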

Hi Ravi,

The ‘Y’ record is written every time a reservation is confirmed (minus PP-821), not for every instance of a standing reservation.

I think I was not clear. For a standing reservation, the scheduler returns the complete list of exec_vnodes for all occurrences of the reservation to the server (as each occurrence can possibly run on a different set of nodes). The server stores this information in an internal reservation attribute, “reserve_execvnodes”. So, when a standing reservation gets confirmed, will the Y record print out “reserve_execvnodes” or just the exec_vnode of the first/next occurrence of the standing reservation?

Ravi,

As per the requirements we can change only the current instance (for an already running reservation) or the next instance of a standing reservation. So, it would only be the nodes for the instance that was altered.

Thanks,
Prakash

Ravi,

I have added interface 11, which is a new server log that will be recorded whenever a new reservation is submitted. This will have the information on the recurrence rule and the timezone for a standing reservation. This will help in debugging an already completed/deleted standing reservation. I am of the view that this information does not belong to the accounting logs.

Interface 12 is also there as a counterpart of Interface 11.

Thanks,
Prakash

Ok, but in Interface 8, you also mention that when a standing reservation is confirmed for the first time, it’ll still get “nodes=()” as one of the fields in the Y record. So, will that be the exec_vnode of the first occurrence, or the complete list? In either case, I think it’d be helpful to clarify that in the EDD.

I think the server already logs creation and deletion of reservations in the server logs, although the log message for reservation creation could be more descriptive. A log message for confirmation should help; right now the logs just say the reservation queue was enabled.

About recurrence rule and timezone not belonging in accounting logs, sorry for being stubborn about this, but why do you think so? I’d mentioned debugging just as an example of how this can be useful to have. If we consider the purpose of writing accounting records for reservations in the first place, presumably, one would look at reservation related information from accounting logs to figure out what resources are reserved by it, right? In that case, wouldn’t it also be useful for them to know how often this reservation will consume resources from the cluster? If yes, then why do we want to make them open up server logs to get that information? Wouldn’t it be useful to just have it right there in the accounting logs itself?

The difference between the ‘Y’ log recorded for the first confirmation and the ones recorded later is the index field. Also, the field name is nodes and not exec_vnodes, and the values of the two have different formats, so it should be evident that the field contains only the nodes for the first instance. I do not see a need to be specific here, but if it helps, I have added this information to the description.

The logs that you see are for the reservation queue and not for the reservation itself.

I think so because, to find out the resources that were used by the reservation, we already have all the information in the ‘B’ record. It is not necessarily the case that all the instances of a standing reservation will begin running. A user may delete the reservation after a few instances have run. It is more reliable to look at the ‘B’ records than to look at the recurrence rule. We also have the ‘count’ field in the ‘Y’ record to find out how many instances were requested and confirmed. So, I am not asking the user to look at the server logs to get information on how often this reservation will consume resources. All the information is already there in the accounting logs. I have added the recurrence rule and the timezone information to the server logs so that this information is available somewhere if there is a need.
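
As a rough sketch of the approach described above (comparing the ‘count’ field in the ‘Y’ record with the number of ‘B’ records actually written), something like the following could work. The line layout `timestamp;record_type;id;message`, the sample lines, and every field other than `count` are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical sample lines: layout, ids and most field names are assumptions.
sample = [
    "04/01/2016 09:00:00;Y;R123.server;requestor=Scheduler@server count=5",
    "04/01/2016 10:00:00;B;R123.server;owner=user@host",
    "04/08/2016 10:00:00;B;R123.server;owner=user@host",
    "04/08/2016 10:00:00;B;R456.server;owner=other@host",
]


def instances_started(lines):
    """Count 'B' records per reservation id."""
    started = Counter()
    for line in lines:
        _, rtype, ident, _ = line.split(";", 3)
        if rtype == "B":
            started[ident] += 1
    return started


def confirmed_counts(lines):
    """Read the 'count' field from 'Y' records, keeping the latest per id."""
    counts = {}
    for line in lines:
        _, rtype, ident, message = line.split(";", 3)
        if rtype != "Y":
            continue
        for token in message.split():
            key, _, value = token.partition("=")
            if key == "count":
                counts[ident] = int(value)
    return counts


started = instances_started(sample)
confirmed = confirmed_counts(sample)
for ident, total in confirmed.items():
    print(f"{ident}: {started[ident]} of {total} confirmed instances started")
```

This tells you how many instances ran, but not when the remaining ones were scheduled to run, which is where the recurrence rule and timezone would come in.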

Thanks for the changes @prakashcv13. My only suggestion is a minor change to the new server logs in interface 12. How about using the id field of the log instead of putting the reservation name in the log string itself?

Thank you @bhroam. It makes sense to use the ID field of the logs. I have updated all the server logs to use the reservation event type. Please let me know your thoughts on this.

Thanks,
Prakash

Thanks for making the change (and all the others) @prakashcv13. I’m happy with the document. I’m excited to have this functionality in PBS.

I will consider that as a sign-off.

Coming soon :)

I like the changes to the Y record, the EDD looks good!

One more suggestion on the Y record… please consider adding the information requested in PP-703, “Enhance accounting logs with consistent and complete information about reservations”. In particular, having both the recurrence rule and the timezone is necessary to understand what was actually requested by the user and what was executed by the scheduler.

Hi @billnitzberg,

I have proposed a server log that will have this information in Interface 11, wouldn’t that suffice?
Also, I would need to understand how this information will be useful in the accounting logs.

If we do have customers (internal or external) who need this information in the accounting logs, I would like to put this information in the ‘U’ record instead of the ‘Y’ record. The ‘Y’ record is for the confirmation, and there can now be multiple ‘Y’ records. As we are not changing the recurrence rule in this RFE, it will remain the same throughout the time the reservation is in the system, so it makes sense to record it only once (if we do decide to record it in the accounting logs), in the ‘U’ record.

Thanks,
Prakash

First, I think it’s really important to treat Altair as just one of many contributors to the PBS Pro Open Source Project, so Altair should not get any special treatment for “internal” use cases. We should always be focused on making PBS Pro better for the whole community.

Historically, the accounting records have primarily been used for extracting resource use data for billing (charge-back, reporting, allocations, etc). Originally, this only meant “resources actually consumed by jobs”, but now includes the lost opportunity costs of resources not available for other jobs (e.g., a reservation may prevent some jobs from starting, which is an opportunity cost that some sites want to consider in billing and reporting). Having a full workload trace (not only what is consumed, but also what is requested, and when) is really important for determining opportunity costs. Further, a secondary use of the accounting records is for debugging, troubleshooting, and simulation – and for these use cases, getting the full workload trace is required (and there are multiple sites who use accounting logs for this secondary purpose, including Altair).

Note: the daemon logs are not a good substitute for capturing this data, as it is common for sites to keep only a few months (or days) of daemon logs, but sites generally keep accounting logs indefinitely. (Plus, the accounting logs tend to be a more stable interface.)

If it makes more sense to add it to the U records (instead of the Y record), that’s fine. I suggest checking with @agrawalravi90 who filed PP-703 to understand more deeply what needs to be captured (or what is not already captured today).

Thx!

Hi @billnitzberg,

Thank you for helping me understand the requirement. I was not aware that opportunity costs are important. I have added this information to the ‘U’ record and updated the External Design.

Thanks,
Prakash

It seems to make sense to add it to the U record as it is not information that would be modified by a pbs_ralter command. I like the update.

Makes sense to me. It comes in as part of the request, might as well put it in the record of the request (U).

Bhroam

Thanks!

Is the TZ also needed for advance reservations (in order to be able to re-construct when they will take place, e.g., in case they are taking place in 9 months)? If yes, I suggest either just adding TZ to the U record for advance reservations as part of this enhancement, or ensuring an RFE is filed so we don’t forget it’s needed.

Thanks again!
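
To make the timezone question concrete, here is a small sketch of why a recorded wall-clock start time alone is ambiguous. The start time and the two offsets are made up for illustration, and fixed UTC offsets stand in for real timezone names, which would also bring daylight-saving rules into play.

```python
from datetime import datetime, timedelta, timezone

# Wall-clock start time as a user might have requested it (illustrative value).
requested = datetime(2017, 3, 1, 9, 0)

# Fixed offsets stand in for real zones; these two are arbitrary examples.
zone_a = timezone(timedelta(hours=-8))
zone_b = timezone(timedelta(hours=5, minutes=30))

# The same wall-clock time maps to different instants depending on the zone.
start_a = requested.replace(tzinfo=zone_a).astimezone(timezone.utc)
start_b = requested.replace(tzinfo=zone_b).astimezone(timezone.utc)

print(start_a.isoformat())
print(start_b.isoformat())
print(start_a - start_b)  # how far apart the two interpretations are
```

Without the TZ in the record, there is no way to tell from the accounting log which of these instants the reservation actually refers to.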