On 16 Apr 2009, at 08:02, Bela Ban wrote:
Thinking more about the wait time, this might be problematic:
* V1={A,B,C,D}
* K maps to C and D, repl-count=2
* V2={A,B,C}
* V3={A,B}
* If C and D crash in quick succession (say within 10 seconds), then
we need to rebalance K immediately upon receiving V2, because K then
has only one live copy (fewer than repl-count=2) and V3 removes it
Why not handle views immediately and rebalance elements? We can
always think of optimizations later...
That is true - we need to think about shrinking clusters as well as
growing ones.
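
The shrinking-cluster scenario above (V1 to V3) can be sketched in a few lines of Java. This is only an illustration of the check, not Infinispan code: the class and method names are hypothetical, and it assumes we remember which members owned K in the previous view.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: decide on each new view whether a key has
// dropped below its configured replica count.
public class ViewChangeRebalance {

    // Owners of the key from the previous view that are still members.
    public static List<String> survivingOwners(List<String> oldOwners,
                                               List<String> newView) {
        List<String> alive = new ArrayList<>(oldOwners);
        alive.retainAll(newView);
        return alive;
    }

    // True if the key has fewer live copies than replCount in the new view.
    public static boolean needsImmediateRebalance(List<String> oldOwners,
                                                  List<String> newView,
                                                  int replCount) {
        return survivingOwners(oldOwners, newView).size() < replCount;
    }

    public static void main(String[] args) {
        int replCount = 2;
        List<String> ownersOfK = List.of("C", "D");   // placement under V1
        List<String> v2 = List.of("A", "B", "C");     // D has left
        List<String> v3 = List.of("A", "B");          // C has left as well

        // On V2 one copy survives, so K must be re-replicated right away.
        System.out.println("V2 survivors: " + survivingOwners(ownersOfK, v2)
                + ", rebalance: " + needsImmediateRebalance(ownersOfK, v2, replCount));

        // If nothing was done on V2, no copy of K is left by V3.
        System.out.println("V3 survivors: " + survivingOwners(ownersOfK, v3));
    }
}
```

If the rebalance on V2 is delayed by a wait time, the V3 check finds no surviving owner at all, which is the data-loss case described above.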
Rebalancing should occur a lot when starting the cluster, but at
this time, we don't have many elements in the cache anyway. During
operation, view changes should be infrequent. And, if we pick a good
consistent hashing algorithm, rehashing should only affect roughly
1/N of all elements (N being the cluster size) anyway.
Yes.
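
To make the 1/N claim concrete, here is a minimal hash-ring sketch. It is not the algorithm Infinispan uses; the class name, the virtual-node count and the bit-mixing function are all assumptions chosen for illustration. It counts how many keys change owner when a fifth node joins a four-node cluster.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical consistent-hash ring with virtual nodes.
public class ConsistentHashDemo {
    static final int VIRTUAL_NODES = 100; // assumed value, for smoother spread

    // Bit mixer (murmur3-style finalizer) to spread String.hashCode() values.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Place VIRTUAL_NODES points per member on the ring.
    public static TreeMap<Integer, String> buildRing(List<String> nodes) {
        TreeMap<Integer, String> ring = new TreeMap<>();
        for (String node : nodes) {
            for (int i = 0; i < VIRTUAL_NODES; i++) {
                ring.put(mix((node + "#" + i).hashCode()), node);
            }
        }
        return ring;
    }

    // Owner is the first ring point at or after the key's hash, wrapping around.
    public static String ownerOf(TreeMap<Integer, String> ring, String key) {
        Map.Entry<Integer, String> e = ring.ceilingEntry(mix(key.hashCode()));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    // Number of keys whose owner differs between the two memberships.
    public static int countMoved(List<String> before, List<String> after, int numKeys) {
        TreeMap<Integer, String> oldRing = buildRing(before);
        TreeMap<Integer, String> newRing = buildRing(after);
        int moved = 0;
        for (int i = 0; i < numKeys; i++) {
            String key = "key-" + i;
            if (!ownerOf(oldRing, key).equals(ownerOf(newRing, key))) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        int total = 10_000;
        int moved = countMoved(List.of("A", "B", "C", "D"),
                               List.of("A", "B", "C", "D", "E"), total);
        System.out.println(moved + " of " + total + " keys changed owner");
    }
}
```

With a ring like this, only keys that now fall on the new node's points move, so the fraction should be in the neighbourhood of 1/5 here; a naive `hash % N` placement would instead move most keys on every view change.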
And remember: at the JGroups level, we can bundle views (= process
multiple LEAVE or JOIN requests together and generate only 1 view)
with GMS.view_bundling. This might help reduce churn without any
code changes, at least for the moment.
That's a good point.
Ok, I'll focus on immediate rehashing for now, and we'll look at wait
times as an optimisation.
Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org