[infinispan-dev] HotRod, ClientIntelligence and client-side key location

Fri Mar 19 10:31:18 EDT 2010

On 19 Mar 2010, at 14:20, Manik Surtani wrote:

> 
> On 18 Mar 2010, at 21:26, Alex Kluge wrote:
> 
>> Hi,
>> 
>> I wrestled with the same question when going down this path.
>> 
>>> but the only way this can be meaningfully used is if the client has the ability
>>> to calculate the hash code of the key in the same manner the servers do.
>> 
>> Definitely the case. However, even if the client gets it wrong, the
>> message should still be routed to the correct server.
> 
> Yes, but if this is the norm rather than the exception then there is little point in attempting client-side hashing.
> 
>> 
>>> Firstly, this is hard when consumed by non-Java clients as you'd need to
>>> implement the way the JDK calculates the hash code of a byte array.
>> 
>> This is a much easier problem if you don't use the built in hashing.
> 
> This is true.  The current impl simply calls key.hashCode() and passes the result through a Wang/Jenkins based bit spreader.  I suppose we could be smarter: for keys which are byte arrays (and this will *always* be the case if interacting via HotRod), something like FNV-1 or SuperFastHash could be used.  I was even looking at MurmurHash.
> 
> 	http://sites.google.com/site/murmurhash/
> 
> We probably wouldn't even need a bit spreader in this case, and MurmurHash impls exist for several platforms.
> 
> 	http://en.wikipedia.org/wiki/MurmurHash
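
For reference, FNV-1 over a byte array is only a few lines, which is what makes it attractive for non-Java clients.  A minimal sketch of the 32-bit variant follows, assuming the published offset basis and prime.  The class and method names are made up for illustration, and this is not what Infinispan currently does.

```java
// Hypothetical helper, not part of Infinispan: 32-bit FNV-1 over raw key bytes.
final class Fnv1 {
    private static final int OFFSET_BASIS = 0x811c9dc5; // 2166136261
    private static final int PRIME = 0x01000193;        // 16777619

    static int hash32(byte[] key) {
        int h = OFFSET_BASIS;
        for (byte b : key) {
            h *= PRIME;      // int multiplication wraps modulo 2^32
            h ^= (b & 0xff); // mask so the byte is treated as unsigned
        }
        return h;
    }
}
```

With these constants an empty key hashes to the offset basis 0x811c9dc5 and the single byte 0x61 ('a') hashes to 0x050c5d7e, so a port in another language can be checked against the same values.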

Maybe MurmurHash isn't such a good idea.  A comment in Austin Appleby's C++ impl:

// 2. It will not produce the same results on little-endian and big-endian
//    machines.

So while JVMs shield you from CPU endianness, native clients built on this implementation would only produce matching hashes when run on machines of the same endianness.
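
To illustrate the concern: the same four bytes yield different 32-bit words depending on the byte order used to read them, so code that reads key bytes a word at a time in native order will hash differently on different CPUs.  A small sketch using java.nio follows (the class and method names are hypothetical).  The second method assembles the word from individual bytes in a fixed order, which gives the same answer on any machine, though whether a given hash can be pinned down that way is a separate question.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class: how byte order changes the word a hash would consume.
final class EndianDemo {
    // Read the first 4 bytes as an int using the given byte order.
    static int readInt(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getInt();
    }

    // Assemble a little-endian int byte by byte; independent of the host CPU.
    static int readLittleEndian(byte[] data, int offset) {
        return (data[offset] & 0xff)
             | ((data[offset + 1] & 0xff) << 8)
             | ((data[offset + 2] & 0xff) << 16)
             | ((data[offset + 3] & 0xff) << 24);
    }

    public static void main(String[] args) {
        byte[] key = {1, 2, 3, 4};
        System.out.printf("big-endian:    0x%08x%n", readInt(key, ByteOrder.BIG_ENDIAN));
        System.out.printf("little-endian: 0x%08x%n", readInt(key, ByteOrder.LITTLE_ENDIAN));
        System.out.printf("fixed order:   0x%08x%n", readLittleEndian(key, 0));
    }
}
```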

>> There are
>> a number of hash algorithms that can be used, including
>> FNV-1 (http://www.isthe.com/chongo/tech/comp/fnv/)
>> and others (http://www.azillionmonkeys.com/qed/hash.html).
>> 
>> These are implementable in other languages, are fast, and provide good distributions of
>> results. We can use, and similarly document, an integer hash used to further spread the
>> hash values if needed. If these are chosen to be implementable in multiple languages,
>> clearly documented, and don't change too often, it should be reasonable to put them into
>> the client.
>> 
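
As an example of a small, documentable integer spreader, here is the supplemental hash that java.util.HashMap applies in JDK 6, wrapped in a hypothetical class.  It is shown only because it is short and uses nothing but shifts and XORs; it is not claimed to be the spreader Infinispan uses.

```java
// Hypothetical wrapper around the JDK 6 java.util.HashMap supplemental hash.
final class Spreader {
    static int spread(int h) {
        // >>> is an unsigned shift; ports to other languages need unsigned 32-bit ints.
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }
}
```

Any client would have to reproduce the exact shift amounts, which is why the spreader would need to be documented alongside the key hash.
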
>> Using this approach removes the dependency on the hash that the VM happens to be
>> using. Indeed, the hash for a byte array may simply be the address of the array, which
>> makes it very poor for our use.
>>                                                                   Alex
>> 
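
On the JVM specifically, arrays do not override Object.hashCode(), so a byte[] key gets an identity-based hash that says nothing about its contents.  A quick sketch of the difference follows, with java.util.Arrays.hashCode shown only as an example of a content-based hash, not as a proposal for the algorithm clients should share (the class name is made up).

```java
import java.util.Arrays;

// Hypothetical demo class: identity hash vs content hash for byte[] keys.
final class ByteArrayHashDemo {
    // Content-based hash: equal bytes always give an equal result.
    static int contentHash(byte[] key) {
        return Arrays.hashCode(key);
    }

    public static void main(String[] args) {
        byte[] k1 = {1, 2, 3};
        byte[] k2 = {1, 2, 3};
        // Identity-based: two equal-content arrays will normally differ here.
        System.out.println(k1.hashCode() == k2.hashCode());
        // Content-based: always equal for equal contents.
        System.out.println(contentHash(k1) == contentHash(k2)); // prints "true"
    }
}
```
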
>> --- On Thu, 3/18/10, Manik Surtani <manik at jboss.org> wrote:
>> 
>>> From: Manik Surtani <manik at jboss.org>
>>> Subject: [infinispan-dev] HotRod, ClientIntelligence and client-side key location
>>> To: "infinispan -Dev List" <infinispan-dev at lists.jboss.org>
>>> Date: Thursday, March 18, 2010, 11:59 AM
>>> I've been thinking about how we
>>> handle this, and I think we have a problem with smart
>>> clients where clients have the ability to locate the key on
>>> the server cluster in order to direct the request to the
>>> specific node.
>>> 
>>> The problem is in hash code calculation.  The HotRod
>>> protocol caters for the node address hash codes by
>>> passing them in the topology map (see
>>> "Hasher Client Topology Change Header" in [1]), but the only
>>> way this can be meaningfully used is if the client has the
>>> ability to calculate the hash code of the key in the same
>>> manner the servers do.  Firstly, this is hard when
>>> consumed by non-Java clients as you'd need to implement the
>>> way the JDK calculates the hash code of a byte array.
>>> Secondly, you'd need detailed and specific knowledge of any
>>> bit spreading that takes place within Infinispan, and this
>>> is an internal implementation detail which may change from
>>> release to release.
>>> 
>>> So I can't see how non-Java clients will
>>> be able to locate keys and then direct requests to the
>>> necessary nodes.  In fact, even with Java clients the
>>> only way this could be done would be to send back marshalled
>>> Addresses in the topology map, *and* have the same version
>>> of the Infinispan server libs installed on the client, *and*
>>> ensure that the same JDK/JVM version is used on the
>>> client.  
>>> 
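
For context, the lookup itself is the easy part once client and server agree on the hashes.  A rough sketch of a hash-wheel lookup follows, assuming the client has already obtained one integer hash per node from the topology map.  The class, the method names and the "next node at or after the key hash, wrapping around" rule are illustrative assumptions, not the HotRod specification.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical client-side hash wheel: node hash -> node address.
final class HashWheel {
    private final TreeMap<Integer, String> nodes = new TreeMap<>();

    void addNode(int nodeHash, String address) {
        nodes.put(nodeHash, address);
    }

    // Pick the first node whose hash is >= keyHash, wrapping to the lowest node.
    String locate(int keyHash) {
        if (nodes.isEmpty()) {
            throw new IllegalStateException("no nodes in topology");
        }
        Map.Entry<Integer, String> e = nodes.ceilingEntry(keyHash);
        return (e != null ? e : nodes.firstEntry()).getValue();
    }
}
```

None of this helps unless keyHash is computed identically on both sides, which is the problem described above.
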
>>> Can we think of a better way to do this?  If not, is
>>> it worth still supporting client-side consistent hash based
>>> key location for the weird but vaguely workable scenario for
>>> Java-based clients?
>>> 
>>> Thoughts?
>>> 
>>> Cheers
>>> Manik
>>> 
>>> 
>>> [1] http://community.jboss.org/wiki/HotRodProtocol
>>> --
>>> Manik Surtani
>>> manik at jboss.org
>>> Lead, Infinispan
>>> Lead, JBoss Cache
>>> http://www.infinispan.org
>>> http://www.jbosscache.org
>>> 
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> 

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org