[infinispan-dev] Re: Distributed hashing - heterogenous nodes

Tue Jul 7 08:16:04 EDT 2009

Hi Alex

Yeah the DIST code is pretty fledgling and a lot of it does need to be  
sorted out.  I hope to spend a lot of time on it this week.

Thanks for your willing to contribute a better CH impl.  I agree with  
your points below.  Perhaps you could directly email me a patch which  
I can look at?

Cheers
Manik

On 23 Jun 2009, at 19:47, Alex Kluge wrote:

>
> Hi,
>
>   I have been working on a use of Jboss cache, which has a lot of  
> overlap with the Infinispan project. I expect to be able to employ  
> significant parts of this work in Infinispan as well. One point of  
> overlap is the use of a consistent hash.
>
> I have looked at the  
> org.infinispan.distribution.DefaultConsistentHash, and this is  
> actually a simple hash, and not a consistent hash. Luckily I have a  
> version of a consistent hash that can almost be dropped in here.  
> There are a number of properties of a consistent hash that make it  
> valuable for a distributed hash table.
>
>  - If a server is removed, the number of keys that shift to a  
> different
>    bin (different server) is minimal.
>  - The same key is always mapped to the same server.
>  - If a server is added, the number of keys that shift is minimal.
> 	
> The current DefaultConsistentHash doesn't deliver on these. I hope  
> you don't mind if I go into some details here.
>
>   For example, the hash is sensitive to the order in which servers  
> appear in the initial collection of caches. If one cache is built  
> with a list of servers (S1, S2, S3), and another is built with a  
> list (S3, S2, S1), keys will be mapped to different servers, even  
> though the set of servers is actually the same.
>
> If one server is removed, many, or even all, keys will be shifted.  
> For example one hash with the set of servers (S1, S2, S3) will map  
> many keys to different servers than one with (S2, S3). In a true  
> consistent hash, the keys originally mapped to S2 will remain mapped  
> to S2, and those mapped to S3 will remain mapped to S3. The keys  
> that were mapped to S1 will (depending on the exact implementation)  
> will be divided between S1 and S2.
>
>  There are a few differences, specifically, I work with arrays  
> rather than collections – in part for performance, I also support  
> weights for the servers, and the replication count is an instance  
> variable rather than an argument to the locate method. How wedded  
> are you to supplying the replication count as part of the locate  
> method? Other than this, it looks like an adaptation of my  
> implementation to Infinispan would be fairly painless, and I suggest  
> replacing the current implementation with it.
>
>                                        Alex Kluge
>
>
>
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org