[infinispan-dev] Rebalancing flag as part of the CacheStatusResponse

Erik Salter an1310 at hotmail.com
Mon Oct 27 18:00:46 EDT 2014


Hi all,

This topic came up in a separate discussion with Mircea, and he suggested
I post something on the mailing list for a wider audience.

I have a business case where I need the value of the rebalancing flag read
by the joining nodes.  Let's say we have a TACH (topology-aware consistent
hash) where we want our keys striped across machines, racks, etc.  Due to
how NBST (non-blocking state transfer) works, if we start a bunch of nodes
on one side of the topology marker, we will end up with the case where all
keys dog-pile on the first node that joins before being disseminated to
the other nodes.  In other words, the first joining node on the other side
of the topology acts as a "pivot."  That's bad, especially if the key is
marked as DELTA_WRITE, where the receiving node must pull the key from the
readCH before applying the changelog.

So not only do we have a single choke-point, but it's made worse by the
initial burst of writes, each requiring numOwners threads for remote reads.

If we disable rebalancing and start up the nodes on the other side of the
topology, we can process this in a single view change.  But there's a
catch -- and this is the reason I added the state of the flag.  We've run
into a case where the current coordinator changed (crash or a MERGE) as
the other nodes are starting up.  And the new coordinator was elected from
the new side of the topology.  So we had two separate but balanced CHs on
both sides of the topology.  And data integrity went out the window.

Hence the flag.  Note also that this deployment requires the
awaitInitialTransfer flag to be false.
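
To make the idea concrete, here is a minimal sketch of how a newly elected
coordinator could recover the rebalancing flag from the status responses of
the other members instead of defaulting to "enabled."  This is not the
actual Infinispan code: the StatusResponse class, its fields, and the
"any member disabled means stay disabled" policy are all assumptions made
for illustration.

```java
import java.util.Arrays;
import java.util.List;

class RebalancingFlagSketch {

    // Hypothetical stand-in for a per-node status response that also
    // carries the node's last known value of the rebalancing flag.
    static final class StatusResponse {
        final String node;
        final boolean rebalancingEnabled;

        StatusResponse(String node, boolean rebalancingEnabled) {
            this.node = node;
            this.rebalancingEnabled = rebalancingEnabled;
        }
    }

    // Assumed recovery policy: if any member reports rebalancing as
    // disabled, the new coordinator keeps it disabled.  An empty list
    // (nothing to recover) falls back to enabled.
    static boolean recoverRebalancingEnabled(List<StatusResponse> responses) {
        for (StatusResponse response : responses) {
            if (!response.rebalancingEnabled) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Old-side nodes had rebalancing disabled by the operator; the
        // new-side node just started with the default (enabled).
        List<StatusResponse> responses = Arrays.asList(
                new StatusResponse("old-side-1", false),
                new StatusResponse("old-side-2", false),
                new StatusResponse("new-side-1", true));
        System.out.println("rebalancing enabled after coordinator change: "
                + recoverRebalancingEnabled(responses));
    }
}
```

With a policy like this, a coordinator elected from the new side of the
topology would not start a rebalance on its own while the operator still
has rebalancing switched off.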

In a real production environment, this has saved me more times than I can
count.  Node failover/failback is now reasonably deterministic with a
simple operational procedure for our customer(s) to follow.


The question is whether this feature would be useful for the community.
Even with the new partition handling, I think this implementation is still
viable and may warrant inclusion into 7.0 (or 7.1).  What does the team
think?  I welcome any and all feedback.

Regards,

Erik Salter
Cisco Systems, SPVTG
(404) 317-0693
