On 20 Jul 2009, at 16:01, Bela Ban wrote:
Manik Surtani wrote:
>
> On 20 Jul 2009, at 14:51, Bela Ban wrote:
>
>> General comments:
>> - It seems that this is a mechanism where you replicate to the
>> repl_count instances 'next' to you, ie. similar to Buddy
>> Replication ?
>
> Yes. So if you consider the hash space on a clockwise-incrementing
> wheel [1], nodes {A, B, C, D} each have fixed positions on this
> wheel (perhaps by making use of the hash codes on their
> Addresses). So this becomes deterministic. Any key (K1 ~ K4 in
> the diagram in [1]) can be located by ascertaining its position on
> the hash wheel, moving clockwise, and considering the first
> "repl_count" nodes encountered.
OK
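The clockwise lookup described above can be sketched as follows. This is a minimal illustration, not Infinispan's actual implementation: node positions are hard-coded here, whereas a real CH would derive them from the hash codes of the members' Addresses.

```java
import java.util.*;

// Illustrative sketch of clockwise owner lookup on a hash wheel.
class HashWheel {
    private final TreeMap<Integer, String> wheel = new TreeMap<>();

    void addNode(String name, int position) {
        wheel.put(position, name);
    }

    // Locate the first replCount nodes clockwise from the key's position.
    List<String> locate(int keyPosition, int replCount) {
        replCount = Math.min(replCount, wheel.size());
        List<String> owners = new ArrayList<>(replCount);
        // tailMap walks clockwise from the key's position to the end of the wheel.
        for (String node : wheel.tailMap(keyPosition).values()) {
            if (owners.size() == replCount) return owners;
            owners.add(node);
        }
        // Wrap around to the start of the wheel if needed.
        for (String node : wheel.values()) {
            if (owners.size() == replCount) return owners;
            owners.add(node);
        }
        return owners;
    }

    public static void main(String[] args) {
        HashWheel wheel = new HashWheel();
        wheel.addNode("A", 100);
        wheel.addNode("B", 200);
        wheel.addNode("C", 300);
        wheel.addNode("D", 400);
        // A key hashing to 250 falls between B and C; with repl_count = 2
        // the clockwise owners are C and D.
        System.out.println(wheel.locate(250, 2)); // prints [C, D]
    }
}
```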
>
>> - Are you tying your rebalancing mechanism to the consistent hash
>> implementation ? This would IMO be bad because it doesn't allow
>> for plugging in of a different CH algorithm !
>
> Unfortunately the steps I have outlined below do imply some
> knowledge of the hash algorithm, to determine which nodes are
> affected by a LEAVE event, to minimise rebalancing. Perhaps this
> can be provided by the ConsistentHash interface, so that any new
> implementation would need to provide a list of affected nodes when
> a given node leaves.
If this is the case, then I'd make sure that both the CH
implementation *and* the rebalancing policy are pluggable, so you
could write an instance of both and use it.
For example, I can think of CH implementations that have
'geographical' knowledge, e.g. of the hosts on which nodes are
running, and that try to store keys on nodes which are as 'far'
apart from each other as possible. This is similar to what you did
with Buddy Replication, where you make sure the buddies are on
different hosts, or not on blades within the same rack.
Yes, good point.
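One way such a 'geographical' CH could pick owners is to walk the wheel clockwise but skip nodes on a host it has already picked. The sketch below is hypothetical; the node-to-host mapping and helper names are illustrative only.

```java
import java.util.*;

// Hypothetical host-aware owner selection: prefer owners on distinct hosts,
// falling back to plain clockwise order if there aren't enough hosts.
class TopologyAwarePick {
    static List<String> pickOwners(List<String> clockwise,
                                   Map<String, String> hostOf,
                                   int replCount) {
        List<String> owners = new ArrayList<>(replCount);
        Set<String> hostsUsed = new HashSet<>();
        // First pass: one owner per distinct host, in clockwise order.
        for (String node : clockwise) {
            if (owners.size() == replCount) break;
            if (hostsUsed.add(hostOf.get(node))) owners.add(node);
        }
        // Fallback: fewer distinct hosts than repl_count, fill with the rest.
        for (String node : clockwise) {
            if (owners.size() == replCount) break;
            if (!owners.contains(node)) owners.add(node);
        }
        return owners;
    }
}
```

With nodes A and B on the same host, A's co-located neighbour B is skipped in favour of a node on a different host.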
>> What does 'installing a consistent hash instance' mean?
>
> Literally, it means swapping out the CH instance used on a given
> node with an alternate one. A CH instance encapsulates a "node
> list" and the logic to locate a key within the node list, using
> hash functions to determine which nodes would hold a given key.
Ah, so I assume you're sending a new node list, but don't swap out
the CH *logic* which is always the same, right ?
Yes. The DefaultCH would just serialize the member list. Other CH
impls may contain additional data.
See this interface here [2], and a default (if currently imperfect)
implementation [3]. CHs are immutable so when a new view is
available, a new CH instance with a new node list should be created.
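To illustrate the immutability point, here is a minimal sketch of what an implementation along the lines of [3] might look like. Class and method names are illustrative, not Infinispan's actual API; a real impl would hold a List of Addresses rather than Strings.

```java
import java.util.*;

// Sketch of an immutable CH: the node list is fixed at construction,
// so a view change means building a new instance, never mutating this one.
final class DefaultConsistentHash {
    private final List<String> nodes; // stand-in for a List<Address>

    DefaultConsistentHash(Collection<String> view) {
        List<String> sorted = new ArrayList<>(view);
        // Order nodes by their (non-negative) hash so every member
        // derives the same wheel independently from the same view.
        sorted.sort(Comparator.comparingInt(n -> spread(n.hashCode())));
        this.nodes = Collections.unmodifiableList(sorted);
    }

    // The repl_count owners of a key, clockwise from its hash position.
    List<String> locate(Object key, int replCount) {
        int pos = findPosition(spread(key.hashCode()));
        int count = Math.min(replCount, nodes.size());
        List<String> owners = new ArrayList<>(count);
        for (int i = 0; i < count; i++)
            owners.add(nodes.get((pos + i) % nodes.size()));
        return owners;
    }

    private int findPosition(int hash) {
        for (int i = 0; i < nodes.size(); i++)
            if (spread(nodes.get(i).hashCode()) >= hash) return i;
        return 0; // wrapped around the wheel
    }

    private static int spread(int h) { return h & 0x7fffffff; }
}
```

Since only the node list varies, serializing such an instance reduces to serializing the member list, as noted above.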
>
> In addition, aggregate CHs are allowed, to consider the union of
> two different views. This is represented by a delegating CH impl,
> and is used in step 5.2 when a JoinTask is kicked off on a new
> joiner.
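A delegating union CH of the kind described could look roughly like this. The interface and names are a hypothetical minimal stand-in for [2], just to show the delegation pattern:

```java
import java.util.*;

// Hypothetical minimal CH contract, standing in for the interface in [2].
interface ConsistentHash {
    List<String> locate(Object key, int replCount);
}

// Delegating CH representing the union of two views, e.g. the views
// before and after a join, as used while a JoinTask is in flight.
final class UnionConsistentHash implements ConsistentHash {
    private final ConsistentHash oldCH;
    private final ConsistentHash newCH;

    UnionConsistentHash(ConsistentHash oldCH, ConsistentHash newCH) {
        this.oldCH = oldCH;
        this.newCH = newCH;
    }

    @Override
    public List<String> locate(Object key, int replCount) {
        // Owners under either view, old view first, duplicates removed,
        // so reads and state requests reach the right members mid-rehash.
        LinkedHashSet<String> union =
            new LinkedHashSet<>(oldCH.locate(key, replCount));
        union.addAll(newCH.locate(key, replCount));
        return new ArrayList<>(union);
    }
}
```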
>
>>> 4.5. GetConsistentHashCommand - an RPC command that "asks" a
>>> node to serialize and transmit across its current CH impl.
>>
>> Serialize what ? The state (I assume) ? Or the consistent hash ?
>> How can you serialize a CH in the latter case ?
>
> This refers to the CH impl. The CH impl that needs to be
> serialized would typically be an instance of [3]. Since the CH
> impl just contains a List of Addresses, this should be possible.
> This is necessary so that the joiner knows the state of the cluster
> before it joined, and is able to ask specific members for state.
>
> Another option to this is to assume that the previous state is the
> same as the new state, minus the new joiner. Saves on RPC calls.
OK, so why don't you simply use the View ? Or do you actually, on a
view change, call setCaches() with the list of addresses you got
from the View ?
I can do that on the existing nodes. On the new node (the joiner) I
need to get a hold of the previous view before the joiner came up.
But like I said above, perhaps this can be deduced.
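The deduction mentioned here is trivial but worth pinning down: the joiner assumes the pre-join membership is simply the current view minus itself. A one-method sketch (names illustrative):

```java
import java.util.*;

// Sketch of the "deduce the old view" shortcut: previous view =
// current view minus the joiner. Saves the GetConsistentHashCommand RPC,
// at the cost of assuming no other membership change happened concurrently.
final class ViewDeduction {
    static List<String> previousView(List<String> currentView, String joiner) {
        List<String> old = new ArrayList<>(currentView);
        old.remove(joiner);
        return old;
    }
}
```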
Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org