I’ve updated the design document. Placement sets are now supported. If they are in use, each placement set has its own group of node buckets. The algorithm is run N times, once for each placement set.
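To illustrate the per-placement-set idea, here is a minimal sketch in Python. It is not the scheduler's code. It assumes that nodes with identical resources share a bucket, that each chunk lands on its own node, and that the first placement set with enough matching nodes wins. The names `build_buckets`, `buckets_per_placement_set`, `place_job`, and the `switch` resource are all hypothetical.

```python
from collections import defaultdict


def build_buckets(nodes):
    """Group nodes that have identical resources into one bucket (assumed rule)."""
    buckets = defaultdict(list)
    for node in nodes:
        signature = tuple(sorted(node["resources"].items()))
        buckets[signature].append(node["name"])
    return dict(buckets)


def buckets_per_placement_set(nodes, pset_key):
    """Give every placement set its own group of node buckets."""
    by_pset = defaultdict(list)
    for node in nodes:
        by_pset[node[pset_key]].append(node)
    return {pset: build_buckets(members) for pset, members in by_pset.items()}


def place_job(nodes, pset_key, chunks_needed, chunk):
    """Run the bucket search once per placement set; return the first fit."""
    for pset, buckets in buckets_per_placement_set(nodes, pset_key).items():
        candidates = []
        for signature, names in buckets.items():
            resources = dict(signature)
            # A bucket qualifies if one of its nodes can hold a whole chunk.
            if all(resources.get(k, 0) >= v for k, v in chunk.items()):
                candidates.extend(names)
        if len(candidates) >= chunks_needed:
            return pset, candidates[:chunks_needed]
    return None  # no single placement set can hold the job


nodes = [
    {"name": "n1", "switch": "A", "resources": {"ncpus": 4}},
    {"name": "n2", "switch": "A", "resources": {"ncpus": 4}},
    {"name": "n3", "switch": "B", "resources": {"ncpus": 8}},
    {"name": "n4", "switch": "B", "resources": {"ncpus": 8}},
    {"name": "n5", "switch": "B", "resources": {"ncpus": 8}},
]
result = place_job(nodes, "switch", 3, {"ncpus": 8})
```

With this sample data, placement set A is tried first and rejected, and the job is then placed entirely within placement set B.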
There are a couple of new restrictions. The algorithm can’t be used if the job is suspended or checkpointed. When a job is suspended or checkpointed, we create a special select statement for it. Each chunk in that select statement has vnode=vn, which ensures we place the job back on the resources it was originally running on. There is already a restriction for select=vnode jobs; this is just a special case of that restriction.
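As a rough illustration of what such a pinned select statement looks like, the sketch below builds one from the chunks a job was running on, and checks whether a select statement names a vnode in any chunk. The helper names `pinned_select` and `requests_vnode` are hypothetical and are not scheduler functions; only the chunk syntax (`N:res=value:...` joined with `+`) follows the usual select format.

```python
def pinned_select(chunks):
    """Build a select statement that pins each chunk to its original vnode.

    chunks: list of (resources dict, vnode name) pairs, one per chunk.
    """
    parts = []
    for resources, vnode in chunks:
        fields = ["1"]  # one chunk per original placement
        fields += [f"{name}={value}" for name, value in resources.items()]
        fields.append(f"vnode={vnode}")
        parts.append(":".join(fields))
    return "+".join(parts)


def requests_vnode(select):
    """Return True if any chunk of the select statement names a vnode."""
    return any(
        field.startswith("vnode=")
        for chunk in select.split("+")
        for field in chunk.split(":")
    )


running = [
    ({"ncpus": 2, "mem": "4gb"}, "vn1"),
    ({"ncpus": 2, "mem": "4gb"}, "vn2"),
]
select = pinned_select(running)
```

Because every chunk in the generated statement carries a vnode request, a check like `requests_vnode` would already exclude the job from the bucket algorithm, which is why the suspended or checkpointed case needs no separate rule.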
The other new restriction is that the algorithm cannot be used on complexes with multi-vnoded hosts. A job can request a large chunk whose resources are spread across multiple vnodes of a single host. The bucket algorithm cannot do this resource spreading, and it can’t determine whether a chunk requires its resources to be spread across multiple vnodes.
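A small example may make the multi-vnode problem concrete. The sketch below is illustrative only: the vnode records, the host name `h1`, and the three helper functions are hypothetical. It shows a chunk that no single vnode can hold but that the host as a whole can, which is the case a per-vnode bucket search would miss.

```python
from collections import Counter


def has_multi_vnoded_host(vnodes):
    """Return True if any host in the complex contributes more than one vnode."""
    per_host = Counter(v["host"] for v in vnodes)
    return any(count > 1 for count in per_host.values())


def fits_on_one_vnode(vnodes, ncpus):
    """Per-vnode view: can a single vnode hold the whole chunk?"""
    return any(v["ncpus"] >= ncpus for v in vnodes)


def fits_on_one_host(vnodes, ncpus):
    """Host view: can the chunk fit if spread across one host's vnodes?"""
    totals = Counter()
    for v in vnodes:
        totals[v["host"]] += v["ncpus"]
    return any(total >= ncpus for total in totals.values())


vnodes = [
    {"name": "h1[0]", "host": "h1", "ncpus": 8},
    {"name": "h1[1]", "host": "h1", "ncpus": 8},
]
multi = has_multi_vnoded_host(vnodes)
single_vnode_fit = fits_on_one_vnode(vnodes, 12)
host_fit = fits_on_one_host(vnodes, 12)
```

Here a 12-CPU chunk fits on host `h1` only when its two 8-CPU vnodes are combined, so a search that considers one vnode at a time would wrongly conclude the chunk cannot run.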