I’m close to finishing my implementation of the feature, but I had to change the design a little. I found out that in-conflict reservations cannot be reconfirmed once they are running. When a running reservation is reconfirmed, it keeps the nodes it already has, and only the nodes that are down are replaced. The problem is that an in-conflict reservation’s resv_nodes is set to the list of nodes it still has, with the conflicted nodes already removed. It’s too hard to map the select statement onto the nodes that are left.
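To make the "keep what you have, replace what is down" rule concrete, here is a minimal Python sketch. It is not the actual PBS code: the function name `reconfirm_running`, the node names, and the flat list of free nodes are all hypothetical, and it treats every node as interchangeable instead of matching chunks from the select statement.

```python
def reconfirm_running(resv_nodes, down_nodes, free_nodes):
    """Sketch: rebuild the node list of a running reservation.

    Hypothetical helper, not part of PBS. Returns the new node list,
    or None if there are not enough free nodes to replace the down ones.
    """
    down = set(down_nodes)
    # Keep every node the reservation already holds that is still up.
    kept = [n for n in resv_nodes if n not in down]
    # One replacement is needed for each node that was dropped.
    needed = len(resv_nodes) - len(kept)
    # Candidates must be up and not already part of the reservation.
    candidates = [n for n in free_nodes if n not in down and n not in kept]
    if len(candidates) < needed:
        return None
    return kept + candidates[:needed]


print(reconfirm_running(["n1", "n2", "n3"], ["n2"], ["n4", "n5"]))
```

Note that `needed` is computed from the original resv_nodes. For an in-conflict reservation the conflicted nodes are already gone from that list, so this count, and the chunks those nodes were satisfying, cannot be recovered the same way.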
Also, to help with testing, I changed the attributes a little. Instead of reserve_retry_init, which set the time at which the first reconfirmation attempt is made, I now have reserve_retry_time, which is the time between attempts. reserve_retry_init is now deprecated. The first attempt will be made reserve_retry_time seconds after the reservation first becomes degraded.
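The timing can be sketched roughly as follows. This is only an illustration, not the server's scheduling code: `next_attempt` is a hypothetical helper, times are plain integer seconds, and it assumes attempts are spaced at fixed multiples of reserve_retry_time counted from the moment the reservation was first degraded.

```python
def next_attempt(degraded_at, now, reserve_retry_time):
    """Sketch: time of the next reconfirmation attempt strictly after `now`.

    Hypothetical helper. Assumes attempts happen at
    degraded_at + k * reserve_retry_time for k = 1, 2, 3, ...
    """
    if reserve_retry_time <= 0:
        raise ValueError("reserve_retry_time must be positive")
    elapsed = max(now - degraded_at, 0)
    # Number of whole retry intervals already passed, plus one for the next.
    k = elapsed // reserve_retry_time + 1
    return degraded_at + k * reserve_retry_time


# Degraded at t=1000 with a 600-second retry time:
print(next_attempt(1000, 1000, 600))  # first attempt
print(next_attempt(1000, 1600, 600))  # attempt after the first one
```

With a small reserve_retry_time, a test can degrade a reservation and see the first attempt shortly afterwards, which is the testing convenience mentioned above.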