Hi Dan,
<snip>
But unfortunately the partial shutdown case does NOT work if the coordinator node is one
of the nodes being shut down.
When the coordinator goes down while rebalance is suspended, one of the remaining servers
becomes the new coordinator with rebalanceEnabled=true, and the rebalance process then
starts.
Once this happens, it causes unexpected accumulation of cache data on the nodes that are
still awaiting their shutdown request.
</snip>
It seems this scenario is not covered. I guess we should broadcast the clean shutdown
request to all the members, so that the new coordinator picks it up and doesn't start the
rebalance?
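
To make the idea concrete, here is a rough toy model in plain Java. It does not use any Infinispan classes: Member, Cluster and the method names are all made up for illustration, and it assumes that today only the coordinator records the suspension and that the oldest remaining member takes over as coordinator.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: none of these types are Infinispan classes. */
public class RebalanceBroadcastSketch {

    /** A cluster member holding its own local copy of the rebalance flag. */
    static final class Member {
        final String name;
        boolean rebalanceEnabled = true; // default when no suspension was received
        Member(String name) { this.name = name; }
    }

    /** Members in join order; the oldest live member acts as coordinator (assumption). */
    static final class Cluster {
        private final List<Member> members = new ArrayList<>();

        Cluster(String... names) {
            for (String n : names) members.add(new Member(n));
        }

        Member coordinator() { return members.get(0); }

        /** Assumed current behaviour: only the coordinator learns of the suspension. */
        void suspendOnCoordinatorOnly() { coordinator().rebalanceEnabled = false; }

        /** Proposed behaviour: every member records the suspension. */
        void suspendByBroadcast() {
            for (Member m : members) m.rebalanceEnabled = false;
        }

        /** Shuts down the current coordinator and returns the member that takes over. */
        Member shutDownCoordinator() {
            members.remove(0);
            return coordinator();
        }
    }

    public static void main(String[] args) {
        Cluster a = new Cluster("A", "B", "C");
        a.suspendOnCoordinatorOnly();
        System.out.println("coordinator-only: new coordinator rebalances = "
                + a.shutDownCoordinator().rebalanceEnabled);

        Cluster b = new Cluster("A", "B", "C");
        b.suspendByBroadcast();
        System.out.println("broadcast: new coordinator rebalances = "
                + b.shutDownCoordinator().rebalanceEnabled);
    }
}
```

In this model the first case leaves the new coordinator with rebalanceEnabled=true, which is the situation reported above, while the broadcast case leaves it false. A real fix would also need to handle members that join after the broadcast, which the sketch ignores.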
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)