[infinispan-dev] Hashing generating recipient lists with same address

Manik Surtani manik at jboss.org
Mon May 10 10:44:05 EDT 2010


Comments below.

On 10 May 2010, at 15:35, Manik Surtani wrote:

> Thanks, will investigate.
> 
> On 10 May 2010, at 14:59, Mircea Markus wrote:
> 
>> From https://jira.jboss.org/jira/browse/ISPN-428
>> 
>> Problem: 
>> 1. A starts, B starts; both see view {A,B}. DistributionManagerImpl.start is not called yet because no distributed cache has been started. 
>> 2. A dist cache is started on A. A's consistent hash now sees nodes {A,B} (as DistributionManagerImpl.start is called). 
>> 3. A dist cache is started on B. The JoinTask fetches A's DCH list of nodes, i.e. {A,B}. 
>> 4. B creates a hash function which contains {A,B} (as fetched from A) plus itself: {A,B,B} 

The last bit is the problem.  That add should not be allowed.  I have patched trunk to deal with this.  The problem was that the DefaultConsistentHash impl does not use a Set internally (for various reasons) but is expected to exhibit set-like behaviour.  A simple check was added.
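The guard is conceptually trivial; here is a minimal standalone sketch (hypothetical class and method names, not the actual DefaultConsistentHash code) of enforcing set-like behaviour on a list-backed address container:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: the backing structure stays a List (as in
// DefaultConsistentHash, which avoids a Set for its own reasons), but
// duplicate addresses are rejected on add, giving set-like semantics.
public class AddressList {
    private final List<String> addresses = new ArrayList<>();

    // Only add the address if it is not already present.
    public boolean add(String address) {
        if (addresses.contains(address)) return false;
        return addresses.add(address);
    }

    public List<String> getAddresses() {
        return Collections.unmodifiableList(addresses);
    }
}
```

With a check like this in place, the join in step 4 yields {A,B} instead of {A,B,B}.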

>> --- after this point the DCH on B is unreliable; anyway, here is how the timeout happens 
>> 
>> 5. B.put(k,v). B acquires the lock on k, then B's DCH indicates that k should be placed on B (!!!). B tries a remote call to itself, but it times out as the lock on k is already held by the user thread that is waiting 

This happens because JGroups - which removes self from the list of recipients - again does not use a Set to hold the recipients and instead uses a List (Vector to be precise).  And List.remove(self) will only remove the *first instance* of self.  :)  Again, a place where Set-like semantics are expected from a non-Set container.  Anyway, the fix above will solve the problem here too.
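For the curious, the List.remove pitfall is easy to demonstrate in isolation (a standalone sketch, not the actual JGroups code):

```java
import java.util.List;
import java.util.Vector;

// Minimal demonstration of the pitfall described above: List.remove(Object)
// removes only the FIRST matching element, so a duplicated self-address
// survives in the recipient list.
public class RemoveFirstOnly {
    public static List<String> recipients() {
        List<String> recipients = new Vector<>();
        recipients.add("B"); // self, duplicated because of the bad hash
        recipients.add("B");
        recipients.remove("B"); // removes only the first "B"
        return recipients;
    }
}
```

So the duplicated self-address survives, and the node ends up making a "remote" call to itself.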

Thanks for debugging/testing, Mircea!

Cheers
Manik

>> 
>> In other words, the problem is caused by the fact that the joiner doesn't expect itself to be part of the hash function of the remote cache, but it is. I think that the hash function should check for that, and drop duplicates. 
>> 
>> 
>> UT is ConcurrentStartWithReplTest
>> 
>> On 7 May 2010, at 16:16, Galder Zamarreno wrote:
>> 
>>> 
>>> ----- "Mircea Markus" <mircea.markus at jboss.com> wrote:
>>> 
>>>> I've tried the same operation sequence on the caches but it works
>>>> without timeout. The HR server also defines a cache for its own purposes;
>>>> I'll try to include that cache as well in the setup and check again.
>>> 
>>> Do you have logs for the attempt you made to replicate the issue below with only caches and no HR servers? I'd like to see them to verify it.
>>> 
>>> The other cache you mention is a replicated cache, for topology info. I don't think it has any bearing here.
>>> 
>>>> 
>>>> On 7 May 2010, at 14:20, Manik Surtani wrote:
>>>> 
>>>>> So TopologyChangeTest is a pretty complex test involving HotRod
>>>> clients and servers, etc.  Can this be reproduced in a simpler setting
>>>> - i.e., 2 p2p Infinispan instances, add a third, etc., without any
>>>> HotRod components?
>>>>> 
>>>>> On 6 May 2010, at 17:51, galder at redhat.com wrote:
>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> As indicated on IRC, running
>>>> org.infinispan.client.hotrod.TopologyChangeTest.testTwoMembers() fails
>>>> randomly with a replication timeout. It's very easy to reproduce. When
>>>> it fails, this is what happens:
>>>>>> 
>>>>>> 1. During rehashing, a new hash is installed:
>>>>>> 2010-05-06 17:54:11,960 4932  TRACE
>>>> [org.infinispan.distribution.DistributionManagerImpl]
>>>> (Rehasher-eq-985:) Installing new consistent hash
>>>> DefaultConsistentHash{addresses ={109=eq-35426, 10032=eq-985,
>>>> 10033=eq-985}, hash space =10240}
>>>>>> 
>>>>>> 2. Rehash finishes and the previous hash is still installed:
>>>>>> 2010-05-06 17:54:11,978 4950  INFO 
>>>> [org.infinispan.distribution.JoinTask] (Rehasher-eq-985:) eq-985
>>>> completed join in 30 milliseconds!
>>>>>> 
>>>>>> 3. A put comes in to eq-985, which decides the recipients are [eq-985,
>>>> eq-985]. Most likely, the hash fell somewhere between 109 and 10032
>>>> and since owners is 2, it took the next 2:
>>>>>> 2010-05-06 17:54:12,307 5279  TRACE
>>>> [org.infinispan.remoting.rpc.RpcManagerImpl] (HotRodServerWorker-2-1:)
>>>> eq-985 broadcasting call
>>>> PutKeyValueCommand{key=CacheKey{data=ByteArray{size=9,
>>>> hashCode=d28dfa, array=[-84, -19, 0, 5, 116, 0, 2, 107, 48, ..]}},
>>>> value=CacheValue{data=ByteArray{size=9, array=[-84, -19, 0, 5, 116, 0,
>>>> 2, 118, 48, ..]}, version=281483566645249}, putIfAbsent=false,
>>>> lifespanMillis=-1000, maxIdleTimeMillis=-1000} to recipient list
>>>> [eq-985, eq-985]
>>>>>> 
>>>>>> Everything afterwards is a mess:
>>>>>> 
>>>>>> 4. JGroups removes the local address from the destination. The
>>>> reason Infinispan does not do it is that the number of recipients
>>>> is 2 and the number of members in the cluster is 2, so it thinks it's a
>>>> broadcast:
>>>>>> 2010-05-06 17:54:12,308 5280  TRACE
>>>> [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher]
>>>> (HotRodServerWorker-2-1:) real_dests=[eq-985]
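The heuristic described in step 4 can be sketched roughly as follows (hypothetical names; this is not the actual Infinispan/JGroups logic, just an illustration of why the duplicated address slips through):

```java
import java.util.List;

// Rough sketch of the broadcast heuristic described above: if the recipient
// list covers every cluster member, the message is treated as a broadcast.
// With a duplicated self-address, the size check passes by accident.
public class BroadcastDecision {
    public static boolean isBroadcast(List<String> recipients, List<String> members) {
        return recipients.size() == members.size();
    }
}
```

With recipients [eq-985, eq-985] and cluster members [eq-985, eq-35426] the sizes match, so the call is wrongly treated as a broadcast.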
>>>>>> 
>>>>>> 5. JGroups still sends it as a broadcast:
>>>>>> 2010-05-06 17:54:12,308 5280  TRACE [org.jgroups.protocols.TCP]
>>>> (HotRodServerWorker-2-1:) sending msg to null, src=eq-985, headers are
>>>> RequestCorrelator: id=201, type=REQ, id=12, rsp_expected=true, NAKACK:
>>>> [MSG, seqno=5], TCP: [channel_name=Infinispan-Cluster]
>>>>>> 
>>>>>> 6. Another node deals with this and replies:
>>>>>> 2010-05-06 17:54:12,310 5282  TRACE
>>>> [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher]
>>>> (OOB-1,Infinispan-Cluster,eq-35426:) Attempting to execute command:
>>>> SingleRpcCommand{cacheName='___defaultcache',
>>>> command=PutKeyValueCommand{key=CacheKey{data=ByteArray{size=9,
>>>> hashCode=43487e, array=[-84, -19, 0, 5, 116, 0, 2, 107, 48, ..]}},
>>>> value=CacheValue{data=ByteArray{size=9, array=[-84, -19, 0, 5, 116, 0,
>>>> 2, 118, 48, ..]}, version=281483566645249}, putIfAbsent=false,
>>>> lifespanMillis=-1000, maxIdleTimeMillis=-1000}} [sender=eq-985]
>>>>>> ...
>>>>>> 
>>>>>> 7. However, there are no replies yet from eq-985, so you get:
>>>>>> 2010-05-06 17:54:27,310 20282 TRACE
>>>> [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher]
>>>> (HotRodServerWorker-2-1:) responses: [sender=eq-985, retval=null,
>>>> received=false, suspected=false]
>>>>>> 
>>>>>> 2010-05-06 17:54:27,313 20285 TRACE
>>>> [org.infinispan.remoting.rpc.RpcManagerImpl] (HotRodServerWorker-2-1:)
>>>> replication exception: 
>>>>>> org.infinispan.util.concurrent.TimeoutException: Replication
>>>> timeout for eq-985
>>>>>> 
>>>>>> Now, I don't understand the reason for creating a hash with
>>>> 10032=eq-985, 10033=eq-985. Shouldn't keeping 10032=eq-985 be enough?
>>>> Why add 10033=eq-985?
>>>>>> 
>>>>>> Assuming there was a valid case for it, a naive approach would be
>>>> to discard a second node that points to an address already in the
>>>> recipient list. So, 10032=eq-985 would be accepted for the list, but
>>>> when encountering 10033=eq-985, this would be skipped.
>>>>>> 
>>>>>> Finally, I thought waiting for rehashing to finish would solve the
>>>> issue, but as you can see in 2., rehashing finished and the hash is still
>>>> in the same shape. Also, I've attached a log file.
>>>>>> 
>>>>>> Cheers,
>>>>>> --
>>>>>> Galder Zamarreño
>>>>>> Sr. Software Engineer
>>>>>> Infinispan, JBoss Cache
>>>>>> 
>>>>>> <bad2_jgroups-infinispan.log.zip>
>>>>>> _______________________________________________
>>>>>> infinispan-dev mailing list
>>>>>> infinispan-dev at lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org