[infinispan-dev] HotRod, ClientIntelligence and client-side key location

Galder Zamarreno galder at redhat.com
Tue Mar 23 03:48:00 EDT 2010


See below:

On Mon, 22 Mar 2010 13:02:15 +0100, Manik Surtani <manik at jboss.org> wrote:

>
> On 22 Mar 2010, at 11:16, Galder Zamarreno wrote:
>
>> See below:
>>
>> On Thu, 18 Mar 2010 22:26:44 +0100, Alex Kluge <java_kluge at yahoo.com>
>> wrote:
>>
>>> </snip>
>>>
>>>> Firstly, this is hard when consumed by non-Java clients as you'd need to
>>>> implement the way the JDK calculates the hash code of a byte array.
>>>
>>> This is a much easier problem if you don't use the built-in hashing.
>>> There are a number of hash algorithms that can be used, including
>>> FNV-1 (http://www.isthe.com/chongo/tech/comp/fnv/)
>>> and others (http://www.azillionmonkeys.com/qed/hash.html).
>>>
>>> These are implementable in other languages, are fast, and provide good
>>> distributions of results. We can use, and similarly document, an integer
>>> hash used to further spread the hash values if needed. If these are chosen
>>> to be implementable in multiple languages, clearly documented, and don't
>>> change too often, it should be reasonable to put them into the client.
>>>
>>> Using this approach removes the dependency on the hash that the VM
>>> happens to be using. Indeed, the hash for a byte array may simply be the
>>> address of the array, which makes it very poor for our use.
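
To make the suggestion above concrete, here is a minimal sketch of 32-bit
FNV-1 over a byte[] in Java. The class and method names (Fnv1, hash32) are
made up for illustration and are not part of Infinispan or Hot Rod. The two
constants are the published 32-bit FNV offset basis and prime. Nothing in
this thread settles on FNV-1 as the function to use; it is only one of the
candidates mentioned.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only, not an Infinispan class.
final class Fnv1 {
    private static final int OFFSET_BASIS = 0x811c9dc5; // 32-bit FNV offset basis
    private static final int PRIME = 0x01000193;        // 32-bit FNV prime

    // 32-bit FNV-1: multiply by the prime, then XOR in the next byte.
    // (FNV-1a would do the XOR first.) Java int overflow wraps mod 2^32,
    // which is exactly what the algorithm expects.
    static int hash32(byte[] data) {
        int hash = OFFSET_BASIS;
        for (byte b : data) {
            hash *= PRIME;
            hash ^= (b & 0xff); // treat the byte as unsigned
        }
        return hash;
    }

    public static void main(String[] args) {
        byte[] key = "a".getBytes(StandardCharsets.UTF_8);
        System.out.printf("%08x%n", hash32(key)); // prints 050c5d7e
    }
}
```

Because the result depends only on the bytes, a client written in any
language gets the same value for the same key, which is not the case for
byte[].hashCode() in Java.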
>>
>> This is a very good point. This could even be done internally. If you're
>> starting Infinispan normally, use the default CHA. If you're starting a Hot
>> Rod server, the implementation could inject one of these CHAs, which are
>> clearly documented and easy to implement in other languages. That way
>> you'd get the best of both worlds: you don't need to expose your internal
>> details for normal Infinispan, and we have a robust, stable algorithm that
>> is easy to implement in other languages.
>
> Oh no, it shouldn't even be as complex as that.  What I suggest is this:
>
> DefaultConsistentHash currently uses a Wang/Jenkins hash as a  
> bit-spreader on key.hashcode(), and address.hashcode().
>
> This is poor, since this is JVM dependent.  Secondly (and for a  
> different reason) the W/J hash isn't providing us with adequate spread,  
> as we've found out.
>
> So step 1 is to identify a better-spread hash, some have been suggested  
> on this thread, preferably one that can operate directly on byte[]'s and  
> eliminate the need for a bit spreader.
>
> The next step would be to change DefaultConsistentHash to:
>
> * For addresses, use ${HASH_FUNCTION} on address.hashcode()
> * For keys which are byte[]s, use ${HASH_FUNCTION} directly on the  
> byte[] (this would directly benefit use via HotRod)
> * For keys which are Strings, use ${HASH_FUNCTION} directly on the  
> String (this is an optimisation)
> * For keys which are Objects, use ${HASH_FUNCTION} on object.hashcode()  
> (for in-VM use)

Yeah, this would be much simpler to implement and maintain, hence +1
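
A rough sketch of what the per-type dispatch listed above might look like.
Everything here is an assumption for illustration: the class and method
names (KeyHasher, hashKey, hashAddress, hashInt) are invented, 32-bit FNV-1
merely stands in for ${HASH_FUNCTION}, Strings are assumed to be hashed over
their UTF-8 bytes, and an int hashcode is assumed to be fed in as 4
big-endian bytes. None of these choices has been agreed on in this thread.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the actual DefaultConsistentHash code.
final class KeyHasher {

    // Stand-in for ${HASH_FUNCTION}: 32-bit FNV-1, chosen only as an example.
    static int hashBytes(byte[] data) {
        int hash = 0x811c9dc5;
        for (byte b : data) {
            hash *= 0x01000193;
            hash ^= (b & 0xff);
        }
        return hash;
    }

    // Assumption: an int is hashed as its 4 bytes in big-endian order.
    static int hashInt(int value) {
        return hashBytes(new byte[] {
            (byte) (value >>> 24), (byte) (value >>> 16),
            (byte) (value >>> 8), (byte) value });
    }

    // Keys: byte[] and String are hashed directly, anything else via hashCode().
    static int hashKey(Object key) {
        if (key instanceof byte[]) {
            return hashBytes((byte[]) key);
        }
        if (key instanceof String) {
            // Assumption: Strings are hashed over their UTF-8 encoding.
            return hashBytes(((String) key).getBytes(StandardCharsets.UTF_8));
        }
        return hashInt(key.hashCode()); // in-VM use only, JVM dependent
    }

    // Addresses: ${HASH_FUNCTION} applied to address.hashCode().
    static int hashAddress(Object address) {
        return hashInt(address.hashCode());
    }
}
```

Only the byte[] branch matters to a Hot Rod client, since that is the form
in which keys travel over the wire. The Object branch still depends on the
JVM's hashCode() and so stays useful for in-VM callers only.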

>
> We would need to document ${HASH_FUNCTION} as a part of the HotRod  
> protocol, and to successfully locate entries, clients would need the  
> following info:
>
> * Server endpoints on the backend and their address.hashcode() values

We have that in the protocol.

> * Hash space size (the modulus for all modular arithmetic, hard-coded  
> for now, may change in future).  An int.

That would need adding to the response header, wouldn't it?  
http://community.jboss.org/wiki/HotRodProtocol#HashDistributionAware_Client_Topology_Change_Header

> * Hash function version.  This could point to details on the spec.  This  
> could be a short.

You've included that, good.

> * Num owners the servers have been configured to use.  This again could  
> be a short as far as HotRod is concerned.
>
> WDYT?
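
For what it's worth, here is a sketch of how a client could put those four
pieces of information together to locate a key. It assumes a plain
consistent-hash wheel where the owners of a key are the first numOwners
servers found walking clockwise from the key's position; whether
DefaultConsistentHash places and walks nodes exactly this way is an
assumption, and the names (KeyLocator, locate, serverPositions) are invented
for the example. The key hash is taken as an input, computed elsewhere with
whatever ${HASH_FUNCTION} the advertised version refers to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical client-side locator, not part of any Hot Rod client.
final class KeyLocator {
    private final TreeMap<Integer, String> wheel = new TreeMap<>();
    private final int hashSpace;
    private final int numOwners;

    // serverPositions: endpoint -> hash value sent in the topology header.
    // Note: two servers landing on the same position would overwrite each
    // other here; a real client would need a rule for that case.
    KeyLocator(Map<String, Integer> serverPositions, int hashSpace, int numOwners) {
        this.hashSpace = hashSpace;
        this.numOwners = numOwners;
        for (Map.Entry<String, Integer> e : serverPositions.entrySet()) {
            wheel.put(Math.floorMod(e.getValue(), hashSpace), e.getKey());
        }
    }

    // Returns up to numOwners endpoints, walking clockwise from the key.
    List<String> locate(int keyHash) {
        int position = Math.floorMod(keyHash, hashSpace);
        int wanted = Math.min(numOwners, wheel.size());
        List<String> owners = new ArrayList<>(wanted);
        // First the servers at or after the key's position...
        for (String server : wheel.tailMap(position, true).values()) {
            if (owners.size() == wanted) break;
            owners.add(server);
        }
        // ...then wrap around to the start of the wheel.
        for (String server : wheel.headMap(position, false).values()) {
            if (owners.size() == wanted) break;
            owners.add(server);
        }
        return owners;
    }
}
```

With servers at positions 100, 500 and 900 on a wheel of size 1000 and two
owners, a key hashing to 450 would go to the servers at 500 and 900, and a
key hashing to 950 would wrap around to the servers at 100 and 500.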
>
> Cheers
> Manik
>
>> Cheers,
>>
>>>                                                                    Alex
>>>
>>> --- On Thu, 3/18/10, Manik Surtani <manik at jboss.org> wrote:
>>>
>>>> From: Manik Surtani <manik at jboss.org>
>>>> Subject: [infinispan-dev] HotRod, ClientIntelligence and client-side
>>>> key location
>>>> To: "infinispan -Dev List" <infinispan-dev at lists.jboss.org>
>>>> Date: Thursday, March 18, 2010, 11:59 AM
>>>> I've been thinking about how we
>>>> handle this, and I think we have a problem with smart
>>>> clients where clients have the ability to locate the key on
>>>> the server cluster in order to direct the request to the
>>>> specific node.
>>>>
>>>> The problem is in hash code calculation.  The HotRod
>>>> protocol caters for this with regards to calculating node
>>>> address hash code by passing this in the topology map (see
>>>> "Hasher Client Topology Change Header" in [1]), but the only
>>>> way this can be meaningfully used is if the client has the
>>>> ability to calculate the hash code of the key in the same
>>>> manner the servers do.  Firstly, this is hard when
>>>> consumed by non-Java clients as you'd need to implement the
>>>> way the JDK calculates the hash code of a byte array.  Second,
>>>> you'd need detailed and specific knowledge of any
>>>> bit spreading that takes place within Infinispan - and this
>>>> is internal implementation detail which may change from
>>>> release to release.
>>>>
>>>> So the way I see it I can't see how non-Java clients will
>>>> be able to locate keys and then direct requests to the
>>>> necessary nodes.  In fact, even with Java clients the
>>>> only way this could be done would be to send back marshalled
>>>> Addresses in the topology map, *and* have the same version
>>>> of the Infinispan server libs installed on the client, *and*
>>>> ensure that the same JDK/JVM version is used on the
>>>> client.
>>>> Can we think of a better way to do this?  If not, is
>>>> it worth still supporting client-side consistent hash based
>>>> key location for the weird but vaguely workable scenario for
>>>> Java-based clients?
>>>>
>>>> Thoughts?
>>>>
>>>> Cheers
>>>> Manik
>>>>
>>>>
>>>> [1] http://community.jboss.org/wiki/HotRodProtocol
>>>> --
>>>> Manik Surtani
>>>> manik at jboss.org
>>>> Lead, Infinispan
>>>> Lead, JBoss Cache
>>>> http://www.infinispan.org
>>>> http://www.jbosscache.org
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>
>>
>>
>> --
>> Galder Zamarreño
>> Sr. Software Engineer
>> Infinispan, JBoss Cache
>


-- 
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



