[infinispan-dev] Re: Distributed hashing - heterogenous nodes

Galder Zamarreno galder.zamarreno at redhat.com
Thu Jul 9 04:53:06 EDT 2009


Oops, just spotted that Manik had replied already! :)

On 07/07/2009 02:16 PM, Manik Surtani wrote:
> Hi Alex
>
> Yeah, the DIST code is pretty fledgling and a lot of it does need to be
> sorted out. I hope to spend a lot of time on it this week.
>
> Thanks for your willingness to contribute a better CH impl. I agree with
> your points below. Perhaps you could email me a patch directly, which I
> can look at?
>
> Cheers
> Manik
>
> On 23 Jun 2009, at 19:47, Alex Kluge wrote:
>
>>
>> Hi,
>>
>> I have been working on a project that uses JBoss Cache and has a lot of
>> overlap with the Infinispan project. I expect to be able to employ
>> significant parts of this work in Infinispan as well. One point of
>> overlap is the use of a consistent hash.
>>
>> I have looked at org.infinispan.distribution.DefaultConsistentHash, and
>> it is actually a simple hash, not a consistent hash. Luckily, I have a
>> version of a consistent hash that can almost be dropped in here. A
>> consistent hash has several properties that make it valuable for a
>> distributed hash table (a short sketch follows the list below):
>>
>> - If a server is removed, the number of keys that shift to a different
>> bin (different server) is minimal.
>> - The same key is always mapped to the same server.
>> - If a server is added, the number of keys that shift is minimal.
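>>
>> To make that concrete, the usual construction hashes each server onto a
>> ring at several "virtual" points and maps a key to the first server
>> clockwise from the key's own hash. The sketch below is only meant to
>> illustrate the idea; the class and method names are made up for the
>> example and are not the code I would actually submit:
>>
>>   import java.nio.charset.StandardCharsets;
>>   import java.security.MessageDigest;
>>   import java.security.NoSuchAlgorithmException;
>>   import java.util.SortedMap;
>>   import java.util.TreeMap;
>>
>>   // Illustrative consistent-hash ring; not Infinispan API.
>>   public class HashRing {
>>      private final SortedMap<Long, String> ring = new TreeMap<Long, String>();
>>      private final int pointsPerServer;
>>
>>      public HashRing(int pointsPerServer) {
>>         this.pointsPerServer = pointsPerServer;
>>      }
>>
>>      // Each server gets several "virtual" positions on the ring, which
>>      // evens out the distribution and makes insertion order irrelevant.
>>      public void addServer(String server) {
>>         for (int i = 0; i < pointsPerServer; i++)
>>            ring.put(hash(server + "#" + i), server);
>>      }
>>
>>      public void removeServer(String server) {
>>         for (int i = 0; i < pointsPerServer; i++)
>>            ring.remove(hash(server + "#" + i));
>>      }
>>
>>      // A key belongs to the first server at or after its hash on the
>>      // ring, wrapping around; assumes at least one server is present.
>>      public String locate(Object key) {
>>         SortedMap<Long, String> tail = ring.tailMap(hash(key.toString()));
>>         Long point = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
>>         return ring.get(point);
>>      }
>>
>>      // Spread positions over the ring using the first 8 bytes of an MD5
>>      // digest; any well-mixed hash function would do.
>>      private long hash(String s) {
>>         try {
>>            byte[] d = MessageDigest.getInstance("MD5")
>>                  .digest(s.getBytes(StandardCharsets.UTF_8));
>>            long h = 0;
>>            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xffL);
>>            return h;
>>         } catch (NoSuchAlgorithmException e) {
>>            throw new IllegalStateException(e);
>>         }
>>      }
>>   }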
>>
>> The current DefaultConsistentHash doesn't deliver on these. I hope you
>> don't mind if I go into some details here.
>>
>> For example, the hash is sensitive to the order in which servers
>> appear in the initial collection of caches. If one cache is built with
>> a list of servers (S1, S2, S3), and another is built with a list (S3,
>> S2, S1), keys will be mapped to different servers, even though the set
>> of servers is actually the same.
>>
>> If one server is removed, many, or even all, keys will be shifted. For
>> example, a hash built with the set of servers (S1, S2, S3) will map many
>> keys to different servers than one built with (S2, S3). In a true
>> consistent hash, the keys originally mapped to S2 will remain mapped to
>> S2, and those mapped to S3 will remain mapped to S3. The keys that were
>> mapped to S1 will (depending on the exact implementation) be divided
>> between S2 and S3.
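>>
>> Using the ring sketched above (again, purely illustrative, and assuming
>> java.util.Map and java.util.HashMap are imported), this property is easy
>> to check: after removing S1, only the keys that S1 owned should change
>> owners, and they land on the surviving servers:
>>
>>   HashRing ring = new HashRing(100);
>>   ring.addServer("S1"); ring.addServer("S2"); ring.addServer("S3");
>>
>>   // Record the owner of every key before the topology change.
>>   Map<String, String> before = new HashMap<String, String>();
>>   for (int i = 0; i < 10000; i++)
>>      before.put("key" + i, ring.locate("key" + i));
>>
>>   ring.removeServer("S1");
>>
>>   // Only keys previously owned by S1 should move; roughly a third of them.
>>   for (Map.Entry<String, String> e : before.entrySet()) {
>>      String ownerNow = ring.locate(e.getKey());
>>      if (!ownerNow.equals(e.getValue()) && !"S1".equals(e.getValue()))
>>         System.out.println("unexpected move: " + e.getKey());
>>   }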
>>
>> There are a few differences in my implementation. Specifically, I work
>> with arrays rather than collections, in part for performance; I also
>> support weights for the servers; and the replication count is an
>> instance variable rather than an argument to the locate method. How
>> wedded are you to supplying the replication count as part of the locate
>> method? Other than that, it looks like adapting my implementation to
>> Infinispan would be fairly painless, and I suggest replacing the current
>> implementation with it.
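>>
>> To show what I mean by those last two points, here is a rough sketch of
>> how locate could look if the ring above also carried a numOwners field
>> (the replication count) and addServer took a weight that simply scales
>> the number of points a server gets on the ring. This is only a suggested
>> shape, not the existing DefaultConsistentHash API:
>>
>>   // Hypothetical: numOwners is configured once on the hash itself,
>>   // rather than being passed into every locate(key, replCount) call.
>>   public List<String> locate(Object key) {
>>      List<String> owners = new ArrayList<String>(numOwners);
>>      // Walk clockwise from the key's position, collecting distinct servers.
>>      for (String server : ring.tailMap(hash(key.toString())).values()) {
>>         if (!owners.contains(server)) owners.add(server);
>>         if (owners.size() == numOwners) return owners;
>>      }
>>      // Wrap around the ring if we ran off the end too early.
>>      for (String server : ring.values()) {
>>         if (!owners.contains(server)) owners.add(server);
>>         if (owners.size() == numOwners) return owners;
>>      }
>>      return owners;
>>   }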
>>
>> Alex Kluge
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> --
> Manik Surtani
> manik at jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
> http://www.infinispan.org
> http://www.jbosscache.org
>
>
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- 
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache


