[infinispan-dev] HotRod, ClientIntelligence and client-side key location

Manik Surtani manik at jboss.org
Mon Mar 22 08:02:15 EDT 2010


On 22 Mar 2010, at 11:16, Galder Zamarreno wrote:

> See below:
> 
> On Thu, 18 Mar 2010 22:26:44 +0100, Alex Kluge <java_kluge at yahoo.com>  
> wrote:
> 
>> </snip>
>> 
>>> Firstly, this is hard when consumed by non-Java clients as you'd need to
>>> implement the way the JDK calculates the hash code of a byte array.
>> 
>> This is a much easier problem if you don't use the built-in hashing. There are
>> a number of hash algorithms that can be used, including
>> FNV 1 (http://www.isthe.com/chongo/tech/comp/fnv/)
>> and others (http://www.azillionmonkeys.com/qed/hash.html).
>> 
>> These are implementable in other languages, are fast, and provide good
>> distributions of results. We can use, and similarly document, an integer hash
>> used to further spread the hash values if needed. If these are chosen to be
>> implementable in multiple languages, clearly documented, and don't change too
>> often, it should be reasonable to put them into the client.
>> 
>> Using this approach removes the dependency on the hash that the VM happens to
>> be using. Indeed, the hash for a byte array may simply be the address of the
>> array, which makes it very poor for our use.
> 
> This is a very good point. This could even be done internally. If you're  
> starting Infinispan normally, use default CHA. If you're starting a Hot  
> Rod server, the implementation could inject one of these CHA which are  
> clearly documented and are easy to implement in other languages. That way  
> you'd get the best of both worlds. You don't need to expose your internal  
> details for normal Infinispan and we have a robust, stable and easy to  
> implement algorithm in other languages.

Oh no, it shouldn't even be as complex as that.  What I suggest is this:

DefaultConsistentHash currently uses a Wang/Jenkins hash as a bit-spreader on key.hashCode() and address.hashCode().

This is poor, since the result is JVM-dependent.  Secondly (and for a different reason) the W/J hash isn't providing us with adequate spread, as we've found out.

So step 1 is to identify a hash with better spread (some have been suggested on this thread), preferably one that can operate directly on byte[]s and eliminate the need for a bit spreader.
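
As a rough illustration of step 1, here is a minimal sketch of 32-bit FNV-1, one of the candidates mentioned earlier in the thread, operating directly on a byte[]. The class and method names are made up for this example, and nothing here implies that FNV-1 is the function that will actually be chosen.

```java
// Hypothetical sketch: 32-bit FNV-1 over a byte[].
// FNV-1 is only one candidate from the thread, not a decided choice.
final class Fnv1Sketch {
    private static final int OFFSET_BASIS = 0x811c9dc5; // standard 32-bit FNV offset basis
    private static final int PRIME = 0x01000193;        // standard 32-bit FNV prime

    // FNV-1: multiply by the prime first, then XOR in the next byte.
    // Java int arithmetic wraps modulo 2^32, which is what FNV expects.
    public static int hash(byte[] data) {
        int h = OFFSET_BASIS;
        for (byte b : data) {
            h *= PRIME;
            h ^= (b & 0xff); // mask so the byte is treated as unsigned
        }
        return h;
    }
}
```

Because the result depends only on the bytes, a non-Java client can reproduce it with the same two constants and unsigned 32-bit arithmetic.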

The next step would be to change DefaultConsistentHash to:

* For addresses, use ${HASH_FUNCTION} on address.hashCode()
* For keys which are byte[]s, use ${HASH_FUNCTION} directly on the byte[] (this would directly benefit use via HotRod)
* For keys which are Strings, use ${HASH_FUNCTION} directly on the String (this is an optimisation)
* For keys which are Objects, use ${HASH_FUNCTION} on object.hashCode() (for in-VM use)
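
The key-type dispatch in the list above could be sketched roughly as follows. Everything here is an assumption made for illustration: the class and method names are hypothetical, FNV-1 stands in for ${HASH_FUNCTION}, Strings are encoded as UTF-8, and an int hashCode() is fed to the hash as four big-endian bytes. None of these choices has been agreed on the list.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the proposed dispatch; not the actual DefaultConsistentHash code.
final class KeyHashSketch {

    // Stand-in for ${HASH_FUNCTION}: 32-bit FNV-1 (an assumption, see lead-in).
    public static int hashBytes(byte[] data) {
        int h = 0x811c9dc5;
        for (byte b : data) {
            h *= 0x01000193;
            h ^= (b & 0xff);
        }
        return h;
    }

    // Hash an int (e.g. a hashCode() value) by feeding its four bytes, big-endian (assumed order).
    public static int hashInt(int value) {
        byte[] bytes = {
            (byte) (value >>> 24), (byte) (value >>> 16), (byte) (value >>> 8), (byte) value
        };
        return hashBytes(bytes);
    }

    public static int hashKey(Object key) {
        if (key instanceof byte[]) {
            return hashBytes((byte[]) key);                 // HotRod case: hash the raw bytes
        }
        if (key instanceof String) {
            // UTF-8 is an assumed encoding; the protocol would have to pin one down.
            return hashBytes(((String) key).getBytes(StandardCharsets.UTF_8));
        }
        return hashInt(key.hashCode());                     // in-VM case: still JVM-dependent
    }
}
```

Only the byte[] branch (and the String branch, once an encoding is fixed) is reproducible outside the JVM; the Object branch still relies on hashCode() and is meant for in-VM use only.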

We would need to document ${HASH_FUNCTION} as a part of the HotRod protocol, and to successfully locate entries, clients would need the following info:

* Server endpoints on the backend and their address.hashCode() values
* Hash space size (the modulus for all modular arithmetic, hard-coded for now, may change in future).  An int.
* Hash function version.  This could point to details on the spec.  This could be a short.
* Num owners the servers have been configured to use.  This again could be a short as far as HotRod is concerned.
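
To make the list above concrete, here is a rough sketch of what a client might hold and how it could pick servers for a key. The class, field and method names are hypothetical. The lookup (reduce the hash into the hash space, walk clockwise to the next server position, take numOwners consecutive servers) is an assumed, textbook consistent-hash walk and not a confirmed description of what DefaultConsistentHash does. Server positions are taken as already hashed with ${HASH_FUNCTION}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical client-side view of the topology information listed above.
final class ClientTopologySketch {
    private final TreeMap<Integer, String> ring = new TreeMap<>(); // position -> server endpoint
    private final int hashSpace;              // modulus for all modular arithmetic (an int)
    private final short hashFunctionVersion;  // which documented hash function to use (a short)
    private final short numOwners;            // configured number of owners (a short)

    ClientTopologySketch(Map<String, Integer> serverHashes, int hashSpace,
                         short hashFunctionVersion, short numOwners) {
        this.hashSpace = hashSpace;
        this.hashFunctionVersion = hashFunctionVersion;
        this.numOwners = numOwners;
        for (Map.Entry<String, Integer> e : serverHashes.entrySet()) {
            // Servers that collide on a position overwrite each other in this sketch.
            ring.put(normalize(e.getValue()), e.getKey());
        }
    }

    short hashFunctionVersion() {
        return hashFunctionVersion;
    }

    // Reduce any int hash into [0, hashSpace), including negative values.
    int normalize(int hash) {
        return Math.floorMod(hash, hashSpace);
    }

    // Assumed lookup: first server at or after the key's position, then its successors.
    List<String> locate(int keyHash) {
        List<String> owners = new ArrayList<>();
        if (ring.isEmpty()) {
            return owners;
        }
        int wanted = Math.min(numOwners, ring.size());
        Integer pos = ring.ceilingKey(normalize(keyHash));
        if (pos == null) {
            pos = ring.firstKey(); // wrap around the ring
        }
        while (owners.size() < wanted) {
            owners.add(ring.get(pos));
            pos = ring.higherKey(pos);
            if (pos == null) {
                pos = ring.firstKey();
            }
        }
        return owners;
    }
}
```

With servers at positions 10, 50 and 90 in a hash space of 100 and two owners, a key hashing to 30 would go to the servers at 50 and 90, and a key hashing to 95 would wrap around to those at 10 and 50.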

WDYT?

Cheers
Manik

> Cheers,
> 
>>                                                                    Alex
>> 
>> --- On Thu, 3/18/10, Manik Surtani <manik at jboss.org> wrote:
>> 
>>> From: Manik Surtani <manik at jboss.org>
>>> Subject: [infinispan-dev] HotRod, ClientIntelligence and client-side  
>>> key location
>>> To: "infinispan -Dev List" <infinispan-dev at lists.jboss.org>
>>> Date: Thursday, March 18, 2010, 11:59 AM
>>> I've been thinking about how we
>>> handle this, and I think we have a problem with smart
>>> clients where clients have the ability to locate the key on
>>> the server cluster in order to direct the request to the
>>> specific node.
>>> 
>>> The problem is in hash code calculation.  The HotRod
>>> protocol caters for this with regards to calculating node
>>> address hash code by passing this in the topology map (see
>>> "Hasher Client Topology Change Header" in [1]), but the only
>>> way this can be meaningfully used is if the client has the
>>> ability to calculate the hash code of the key in the same
>>> manner the servers do.  Firstly, this is hard when
>>> consumed by non-Java clients as you'd need to implement the
>>> way the JDK calculates the hash code of a byte array. Secondly, you'd need  
>>> detailed and specific knowledge of any
>>> bit spreading that takes place within Infinispan - and this
>>> is internal implementation detail which may change from
>>> release to release.
>>> 
>>> So the way I see it I can't see how non-Java clients will
>>> be able to locate keys and then direct requests to the
>>> necessary nodes.  In fact, even with Java clients the
>>> only way this could be done would be to send back marshalled
>>> Addresses in the topology map, *and* have the same version
>>> of the Infinispan server libs installed on the client, *and*
>>> ensure that the same JDK/JVM version is used on the
>>> client.
>>> Can we think of a better way to do this?  If not, is
>>> it worth still supporting client-side consistent hash based
>>> key location for the weird but vaguely workable scenario for
>>> Java-based clients?
>>> 
>>> Thoughts?
>>> 
>>> Cheers
>>> Manik
>>> 
>>> 
>>> [1] http://community.jboss.org/wiki/HotRodProtocol
>>> --
>>> Manik Surtani
>>> manik at jboss.org
>>> Lead, Infinispan
>>> Lead, JBoss Cache
>>> http://www.infinispan.org
>>> http://www.jbosscache.org
>>> 
>>> 
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> 
>> 
> 
> 
> -- 
> Galder Zamarreño
> Sr. Software Engineer
> Infinispan, JBoss Cache

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org