Re: [infinispan-dev] Asymmetric caches and manual rehashing design

Thursday, 29 September 2011

On 9/28/11 1:48 PM, Dan Berindei wrote:
...
 On Wed, Sep 28, 2011 at 12:59 PM, Bela Ban <bban(a)redhat.com> wrote:
> My 5 cents: 
...
> - Are you clubbing (virtual) view updates and rebalancing together ? And
> if so (I should probably read on first...), can't you have view
> installations *without* rebalancing ?
>

 I'm not sure what benefits you would get from joining the cache view
 but not receiving state compared to not sending the join request at
 all. Non-members are allowed to send/receive commands, so in the
 future we could even have separate "server" nodes (that join the cache
 view) and "client" nodes (joining the JGroups cluster to send
 commands, but not the cache view and so not holding any data except
 L1). 

I had the scenario in mind where you join 100 members and only *then* do 
a state transfer (rebalancing).

...
 My idea was that the cache view was a representation of the caches
 that are able to service requests, so it doesn't make sense to include
 in the view caches that don't hold data. 

OK. So with periodic rebalancing, you'd hold (virtual) views and state 
*until* the trigger fires, which then installs the new virtual views and 
rebalances the state ? In this case, tying view delivery and rebalancing 
together makes sense...

...
> - Do we need the complex PREPARE_VIEW / ROLLBACK_VIEW / COMMIT_VIEW 2PC
> handling ? This adds a lot of complexity. Is it only used when we have a
> transactional cache ?
>

 Nope, this doesn't have anything to do with transactional caches,
 instead it is all about computing the owner that will push the key
 during the rebalance operation.

 In order to do it deterministically we need to have a common "last
 good consistent hash" for the last rebalance that finished
 successfully, and each node must determine if it should push a key or
 not based on that last good CH. 
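
A minimal sketch of the push decision described above, in Java. It is an illustration only, not Infinispan's actual code: the class name, the method signature, and the rule "the first last-good owner that is still a member pushes" are all assumptions made for the example. The point it shows is that every node can evaluate the same function on the same last good CH and reach the same answer without extra coordination.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Infinispan.
final class PushDecision {
    private PushDecision() {}

    /**
     * Decides whether the local node should push a key to a target node.
     *
     * @param lastGoodOwners owners of the key in the last successfully rebalanced CH, in order
     * @param pendingOwners  owners of the key in the pending CH
     * @param pendingMembers members of the pending cache view
     * @param self           the local node
     * @param target         the node that may need the key
     */
    static boolean shouldPush(List<String> lastGoodOwners,
                              List<String> pendingOwners,
                              Set<String> pendingMembers,
                              String self,
                              String target) {
        // Only nodes that become owners and did not own the key before need state.
        if (!pendingOwners.contains(target) || lastGoodOwners.contains(target)) {
            return false;
        }
        // Assumed rule: the first last-good owner that is still a member is the pusher.
        for (String owner : lastGoodOwners) {
            if (pendingMembers.contains(owner)) {
                return owner.equals(self);
            }
        }
        // No previous owner survived, so nobody can push this key.
        return false;
    }
}
```

Because the inputs are the same on every node, exactly one surviving previous owner answers true for a given key and target.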

OK. I just hope this makes sense for large clusters, as it is a 2PC, 
which doesn't scale to a larger number of nodes. I mean, we don't use 
FLUSH in large clusters for the same reason.
Hmm, on the upside, you don't run this algorithm a lot though, so maybe 
running it only a few times amortizes the cost of it.

With this algorithm, I assume you won't need the transitory view anymore 
(UnionConsistentHashFunction or whatever it was called), which includes 
both current and new owners of a key ?

...
 A rebalance operation can also fail for various reasons (e.g. the
 coordinator died). If that happens the new owners won't have all the
 state, so they should not receive requests for the state that they
 would have had in the pending CH. 

OK, fair enough
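
To make the failure case concrete, here is a rough Java sketch of the prepare / commit / rollback idea. The class and method names are made up for illustration (only the PREPARE_VIEW / COMMIT_VIEW / ROLLBACK_VIEW command names come from this thread), and it models a single node's local bookkeeping, not the cluster-wide protocol. Requests are always served from the last committed view, so a rebalance that fails halfway never exposes owners that lack the state.

```java
import java.util.List;

// Hypothetical local holder for the committed and pending cache views.
final class CacheViewHolder {
    private List<String> committed; // last view whose rebalance finished successfully
    private List<String> pending;   // null when no rebalance is in progress

    CacheViewHolder(List<String> initialView) {
        this.committed = List.copyOf(initialView);
    }

    // Corresponds to receiving PREPARE_VIEW: remember the new view, keep serving the old one.
    synchronized void prepare(List<String> newView) {
        if (pending != null) {
            throw new IllegalStateException("a rebalance is already in progress");
        }
        pending = List.copyOf(newView);
    }

    // Corresponds to COMMIT_VIEW: state transfer finished, the pending view becomes the last good one.
    synchronized void commit() {
        if (pending == null) {
            throw new IllegalStateException("nothing to commit");
        }
        committed = pending;
        pending = null;
    }

    // Corresponds to ROLLBACK_VIEW: the rebalance failed, drop the pending view.
    synchronized void rollback() {
        pending = null;
    }

    // Requests are routed with the last good view only.
    synchronized List<String> viewForRequests() {
        return committed;
    }
}
```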

...
> - State is to be transferred *within* this 2PC time frame. Hmm, again,
> this ties rebalancing and view installation together (see my argument
> above)...
>

 If view installation wasn't tied to state transfer then we'd have to
 keep the last rebalanced view somewhere else. We would hold the
 "last pending view" (pending rebalance, that is) in the
 CacheViewsManager and the "last rebalanced view" in another component,
 and that component would have its own mechanism for synchronizing the
 "last rebalanced view" among cache members. So I think the 2PC
 approach in CacheViewsManager actually simplifies things. 

OK, agreed. I would not like this if it was run on every view 
installation, but since we're running it after a cooldown period, or 
after having received N JOIN requests or M LEAVE requests, I guess it 
should be fine. +1 on simplification
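
For what it's worth, the trigger condition could look roughly like the Java sketch below. All names and thresholds are hypothetical, and "cooldown" is interpreted here as a quiet period since the most recent JOIN or LEAVE, which is only one possible reading. Timestamps are passed in explicitly so the logic is easy to test.

```java
// Hypothetical trigger policy: rebalance after N joins, M leaves, or a quiet period.
final class RebalanceTrigger {
    private final int joinThreshold;
    private final int leaveThreshold;
    private final long cooldownMillis;

    private int pendingJoins;
    private int pendingLeaves;
    private long lastChangeAt;

    RebalanceTrigger(int joinThreshold, int leaveThreshold, long cooldownMillis) {
        this.joinThreshold = joinThreshold;
        this.leaveThreshold = leaveThreshold;
        this.cooldownMillis = cooldownMillis;
    }

    synchronized void onJoin(long nowMillis) {
        pendingJoins++;
        lastChangeAt = nowMillis;
    }

    synchronized void onLeave(long nowMillis) {
        pendingLeaves++;
        lastChangeAt = nowMillis;
    }

    // True when accumulated membership changes should be turned into a new view plus rebalance.
    synchronized boolean shouldRebalance(long nowMillis) {
        if (pendingJoins == 0 && pendingLeaves == 0) {
            return false; // nothing changed since the last rebalance
        }
        return pendingJoins >= joinThreshold
                || pendingLeaves >= leaveThreshold
                || nowMillis - lastChangeAt >= cooldownMillis;
    }

    // Called once the rebalance for the accumulated changes has been started.
    synchronized void reset() {
        pendingJoins = 0;
        pendingLeaves = 0;
    }
}
```

With something like this, joining 100 members in quick succession would result in a small number of 2PC rounds instead of one per join, depending on the chosen thresholds.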

-- 
Bela Ban
Lead JGroups (http://www.jgroups.org)
JBoss / Red Hat
