Re: [infinispan-dev] Denormalizing hashes

Wednesday, 11 December 2013

Hi Radim

Actually, I'm the one who wrote the denormalization code :)

It was meant as a stop-gap measure before we upgraded the HotRod protocol
to support the segment-based consistent hash, but the denormalization
worked well enough (or so we thought) that we didn't get to changing the
protocol yet.

That's not a big change in itself, but we also wanted to make the
consistent hash per-cache on the client (it's now per-cache manager), and
that's a bit more complicated to do. Changing this before starting the C++
client wouldn't have helped much anyway, since the client would still have
to support the current style of consistent hash.

On Tue, Dec 10, 2013 at 8:17 PM, Radim Vansa <rvansa(a)redhat.com> wrote:

...
 Hi Galder,

 as I am trying to debug some problem in C++ client, I was looking into
 the server code. And I am not sure whether I understand the code
 correctly, but it seems to me that the server denormalizes the
 consistent hash for each client anew (after each topology change or
 client joining). Is this true? Looking into trace logs, I can see stuff
 like

 18:15:17,339 TRACE [org.infinispan.server.hotrod.Encoders$Encoder12$]
 (HotRodServerWorker-12) Writing hash id 639767 for 192.168.11.101:11222

  From denormalizeSegmentHashIds() method I see that this means that we
 have executed the hash function 639768 times just to notify one client.
 Is my understanding correct?

Yes, this happens every time a client joins and/or every time the cache
topology changes.

We could easily cache the result of denormalizeSegmentHashIds, as it only
depends on the number of segments. It's just that I wasn't expecting it to
take so many iterations.
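
A minimal sketch of what such a cache could look like, assuming the result
really is a pure function of the number of segments. The class name
DenormalizedHashIdCache and the int[][] result shape are hypothetical, not
the server's actual types; the expensive computation is passed in as a
function so the sketch stays self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

// Hypothetical memoizing wrapper, not the actual HotRod server code.
final class DenormalizedHashIdCache {
    // Key: number of segments. Value: the denormalized hash ids, assumed
    // here to be one int[] of hash ids per segment.
    private final Map<Integer, int[][]> cache = new ConcurrentHashMap<>();

    // Stand-in for the expensive denormalization step.
    private final IntFunction<int[][]> compute;

    DenormalizedHashIdCache(IntFunction<int[][]> compute) {
        this.compute = compute;
    }

    // Runs the expensive computation at most once per distinct segment count.
    // The returned array is shared, so callers must treat it as read-only.
    int[][] get(int numSegments) {
        return cache.computeIfAbsent(numSegments, n -> compute.apply(n));
    }
}
```

With something like this, a client joining or a topology change would only
pay for the hash iterations the first time a given segment count is seen.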

...

 Also, there is nothing like the concept of primary owner, is this right?

The client CH doesn't have a concept of backup owners. But for each (hash
id, server) pair that gets sent to the client, it means all the hash codes
between the previous hash id and this hash id have this server as the
primary owner. The server in the next (hash id, server) pair is the first
backup, and so on.
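
The lookup described above can be sketched roughly like this. It is only an
illustration, not the actual client code: the class name ClientHashRing is
made up, and I'm assuming that a hash code equal to a hash id belongs to
that hash id's server, that lookups wrap around past the largest hash id,
and that repeated servers are skipped when collecting owners.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical client-side view of the (hash id, server) pairs.
final class ClientHashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void add(int hashId, String server) {
        ring.put(hashId, server);
    }

    // Owners for a hash code: the first entry is the primary owner, the
    // following entries are the backups in order.
    List<String> owners(int hashCode, int numOwners) {
        List<String> owners = new ArrayList<>();
        if (ring.isEmpty()) {
            return owners;
        }
        // First pair whose hash id is >= hashCode (assumed inclusive),
        // wrapping to the smallest hash id if there is none.
        Map.Entry<Integer, String> e = ring.ceilingEntry(hashCode);
        if (e == null) {
            e = ring.firstEntry();
        }
        // Walk the following pairs, visiting each pair at most once.
        for (int i = 0; i < ring.size() && owners.size() < numOwners; i++) {
            if (!owners.contains(e.getValue())) {
                owners.add(e.getValue());
            }
            e = ring.higherEntry(e.getKey());
            if (e == null) {
                e = ring.firstEntry();
            }
        }
        return owners;
    }

    // Primary owner for a hash code, or null if the ring is empty.
    String primaryOwner(int hashCode) {
        List<String> o = owners(hashCode, 1);
        return o.isEmpty() ? null : o.get(0);
    }
}
```

So with pairs (100, A), (200, B), (300, C), a key hashing to 150 would have
B as primary and C as first backup under these assumptions.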

For each segment, the server generates numOwners (hash id, server) pairs.
That means that for most of the hash codes in the segment, the list of
owners on the client will be the same as the list of owners on the server.
But for a fraction of 0.0002 (leewayFraction) of the hash codes, the client
primary owner will in fact be one of the server backup owners.
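
To put a rough number on that, here is a back-of-the-envelope sketch. It
assumes the hash space is the non-negative int range and that segments are
equally sized; the class and method names are made up for illustration, and
only the 0.0002 value comes from the code discussed here.

```java
// Hypothetical helper for estimating the leeway per segment.
final class LeewayEstimate {
    // Fraction of hash codes allowed to have a "wrong" primary on the client.
    static final double LEEWAY_FRACTION = 0.0002;

    // Assumed segment size: non-negative int range split evenly, rounded up.
    static int segmentSize(int numSegments) {
        return (int) Math.ceil((double) Integer.MAX_VALUE / numSegments);
    }

    // Number of hash codes per segment that fall inside the leeway.
    static int leeway(int numSegments) {
        return (int) (LEEWAY_FRACTION * segmentSize(numSegments));
    }

    public static void main(String[] args) {
        int numSegments = 60; // example value only
        System.out.println("segment size: " + segmentSize(numSegments));
        System.out.println("leeway:       " + leeway(numSegments));
    }
}
```

The point is just that the leeway is tiny compared to the segment, so the
mismatch only affects a very small share of keys.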

...
 I thought that every first request in HotRod will go to primary owner,
 so that the PUT does not have to do the first hop and is executed
 directly on the primary. But it seems to me that it goes to any of the
 owners (practically random one, as you are only looking for the numOwner
 ids in leeway = on the beginning of the range - then, 99.98% or more
 requests should go to the server with last position in the leeway). This
 looks pretty suboptimal for writes, isn't it?

I'm not sure what you mean here, but I'm pretty sure the request goes to
the correct server because we have a test for it:
ConsistentHashV1IntegrationTest

Cheers
Dan

...

 Cheers

 Radim

 PS: for every line of code you write in Scala, God kills a kitten

 --
 Radim Vansa <rvansa(a)redhat.com>
 JBoss DataGrid QA

 _______________________________________________
 infinispan-dev mailing list
 infinispan-dev(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev
