We could extend the policy for crashes as well, by adding a
minNumOwners setting and only triggering an automatic rehash when
there is a segment on the hash wheel with <= minNumOwners owners.
minNumOwners should be at least 1, which in the default use case (numOwners==2) would mean
that we react to every leave.
While this functionality would be very nice, I don't think it should have a high priority, as
most of the use cases I'm aware of use numOwners=2.
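
A rough sketch of the minNumOwners trigger described above, just to make the rule concrete. The class and method names are hypothetical (this is not existing Infinispan API), and it assumes we can get the number of still-live owners for each segment of the hash wheel:

```java
import java.util.List;

// Hypothetical sketch only: none of these names exist in Infinispan.
final class MinOwnersRehashPolicy {
    private final int minNumOwners;

    MinOwnersRehashPolicy(int minNumOwners) {
        // The thread suggests minNumOwners should be at least 1.
        if (minNumOwners < 1) {
            throw new IllegalArgumentException("minNumOwners must be at least 1");
        }
        this.minNumOwners = minNumOwners;
    }

    // liveOwnersPerSegment.get(i) is assumed to be the number of owners of
    // segment i that are still members of the cluster.
    boolean shouldRehash(List<Integer> liveOwnersPerSegment) {
        for (int liveOwners : liveOwnersPerSegment) {
            // Trigger as soon as any segment is down to <= minNumOwners owners.
            if (liveOwners <= minNumOwners) {
                return true;
            }
        }
        return false;
    }
}
```

With numOwners=2 and minNumOwners=1, any single leave drops some segment to 1 owner and so triggers a rehash, which matches "we react to every leave". With numOwners=3 and minNumOwners=1, a single leave would not trigger anything.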
We would have a properly installed CH, that guarantees at some point
in the past each key had numOwners owners, and a filter on top of it
that removes any leavers from the result of DM.locate(),
DM.getPrimaryLocation() etc.
It would probably undo our recent optimizations around locate and
getPrimaryLocation, so it's slowing the normal case (without any
leavers) in order to make the exceptional case (organized shutdown of
a part of the cluster) faster.
The question is how big the cluster has to get before the
exceptional case becomes common enough that it's worth optimizing
for...
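
To illustrate the "filter on top of the CH" idea discussed above, here is a minimal sketch. It assumes owners are plain node names and that the set of leavers is tracked somewhere. LeaverFilter, liveOwners and primaryOwner are made-up names, not the real DM.locate() / DM.getPrimaryLocation() implementations:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only: a filter applied to the owner list returned by
// the installed CH, dropping any node that has left since it was installed.
final class LeaverFilter {
    private LeaverFilter() {
    }

    // owners: owner list from the installed CH, primary owner first.
    // leavers: nodes known to have left the cluster.
    static List<String> liveOwners(List<String> owners, Set<String> leavers) {
        List<String> live = new ArrayList<>(owners.size());
        for (String owner : owners) {
            if (!leavers.contains(owner)) {
                live.add(owner);
            }
        }
        return live;
    }

    // First owner that has not left, or null if every owner is gone.
    static String primaryOwner(List<String> owners, Set<String> leavers) {
        for (String owner : owners) {
            if (!leavers.contains(owner)) {
                return owner;
            }
        }
        return null;
    }
}
```

Even in this toy form the cost is visible: every lookup does a set membership check per owner, including when the leavers set is empty, which is the normal case.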
Re: partial shutdown, due to consistency constraints, it
won't be possible to do a controlled shutdown of more than numOwners-1 nodes at any one time, so I'm not
sure if this optimisation has a broad scope.
For total shutdown, I guess we can use means other than rehash, e.g. a specific command
that would disable it and start flushing to the store.