Dan Berindei commented on ISPN-3120:
------------------------------------
My fix was to set the waitingForState flag in StateConsumerImpl only after all the
segments have been requested. But it turns out there is a way around that: if node E
requests some segments from node H, which then dies, the StateConsumerImpl on node E
first removes the transfers from H and only then requests those segments from another
node F. During that window the waitingForState flag remains true, so it is possible to
reproduce the bug even with my fix.
1. Starting with a 4-node cluster: [E, F, G, H] (topology 6), with numOwners=3.
2. G leaves, and E sends a REBALANCE_START command with nodes [E, F, H] (topology 8). Some
segments are owned by [F, H] in the current CH and by [F, H, E] in the pending CH.
3. E requests some segments from F and some segments from H.
4. H leaves, and E sends a CH_UPDATE command with nodes [E, F] (topology 9).
5. The StateConsumerImpl on E removes the transfers from H.
6. E receives a StateResponseCommand from F. Because the transfers from H were just
removed, the transfersBySegment/transfersBySource maps are empty, so E considers that it
has finished receiving state for topology 9.
7. The StateConsumerImpl on E requests the failed segments from F (via a separate
StateRequestCommand).
8. E receives a StateResponseCommand from F with actual data, but it ignores it because
{{StateConsumerImpl.updatedKeys == null}}.
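The race described above can be modelled with a minimal, self-contained sketch. The class
and method names below are hypothetical, loosely modelled on StateConsumerImpl; this is
not Infinispan's actual code, only an illustration of why removing a dead source's
transfers before re-requesting them lets a response from another node complete the
transfer prematurely:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class StateConsumerModel {
    boolean waitingForState;
    // Pending inbound transfers, keyed by source node (stand-in for
    // transfersBySource/transfersBySegment).
    final Map<String, Set<Integer>> transfersBySource = new HashMap<>();

    void requestSegments(String source, Set<Integer> segments) {
        transfersBySource.put(source, new HashSet<>(segments));
    }

    // The ISPN-3120 fix: the flag is set only after all segments were requested.
    void allSegmentsRequested() {
        waitingForState = true;
    }

    // When a source dies, its transfers are removed *before* the segments are
    // re-requested from another node -- this opens the window.
    void removeTransfersFrom(String deadSource) {
        transfersBySource.remove(deadSource);
    }

    // Handling of a StateResponseCommand: if no transfers remain pending, the
    // node concludes it has finished receiving state.
    boolean onStateResponse(String source, Set<Integer> segments) {
        Set<Integer> pending = transfersBySource.get(source);
        if (pending != null) {
            pending.removeAll(segments);
            if (pending.isEmpty())
                transfersBySource.remove(source);
        }
        if (waitingForState && transfersBySource.isEmpty()) {
            waitingForState = false;   // "finished receiving state" -- prematurely
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        StateConsumerModel e = new StateConsumerModel();
        e.requestSegments("NodeF", Set.of(49, 55, 57, 56));
        e.requestSegments("NodeH", Set.of(51, 50, 35, 33));
        e.allSegmentsRequested();

        e.removeTransfersFrom("NodeH");   // H dies; transfers removed first
        // F's response arrives before H's segments are re-requested from F:
        boolean done = e.onStateResponse("NodeF", Set.of(49, 55, 57, 56));
        System.out.println("prematurely finished = " + done);   // prints: true
    }
}
```

In the real sequence the re-request of H's segments from F happens only after this point,
so the "finished" decision has already been made on an incomplete data set.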
{noformat}
2013-05-23 16:59:59,589 TRACE (transport-thread-2,NodeE)
[org.infinispan.statetransfer.StateConsumerImpl] Requesting transactions for segments [51,
50, 35, 33, 54, 39, 53, 52, 37, 43, 41] of cache dist from node NodeH-13469
2013-05-23 16:59:59,619 TRACE (transport-thread-2,NodeE)
[org.infinispan.statetransfer.StateConsumerImpl] Requesting transactions for segments [49,
55, 57, 56] of cache dist from node NodeF-2452
2013-05-23 16:59:59,623 TRACE (transport-thread-2,NodeE)
[org.infinispan.statetransfer.InboundTransferTask] Requesting segments [51, 50, 35, 33,
54, 39, 53, 52, 37, 43, 41] of cache dist from node NodeH-13469
2013-05-23 16:59:59,617 TRACE (transport-thread-4,NodeE)
[org.infinispan.topology.ClusterTopologyManagerImpl] Attempting to execute command on
self: CacheTopologyControlCommand{cache=dist, type=CH_UPDATE, sender=NodeE-18419,
joinInfo=null, topologyId=9, currentCH=DefaultConsistentHash{numSegments=60, numOwners=3,
members=[NodeE-18419, NodeF-2452]}, pendingCH=DefaultConsistentHash{numSegments=60,
numOwners=3, members=[NodeE-18419, NodeF-2452]}, throwable=null, viewId=3}
2013-05-23 16:59:59,624 TRACE (transport-thread-4,NodeE)
[org.infinispan.statetransfer.StateConsumerImpl] Removing inbound transfers for segments
[51, 50, 35, 33, 54, 39, 53, 52, 37, 43, 41] from source NodeH-13469 for cache dist
2013-05-23 16:59:59,624 TRACE (transport-thread-4,NodeE)
[org.infinispan.statetransfer.StateConsumerImpl] Requesting transactions for segments [51,
50, 35, 33, 54, 39, 53, 52, 37, 43, 41] of cache dist from node NodeF-2452
2013-05-23 16:59:59,624 WARN (transport-thread-2,NodeE)
[org.infinispan.statetransfer.InboundTransferTask] ISPN000210: Failed to request segments
[51, 50, 35, 33, 54, 39, 53, 52, 37, 43, 41] of cache dist from node NodeH-13469 (node
will not be retried)
2013-05-23 16:59:59,624 TRACE (transport-thread-2,NodeE)
[org.infinispan.statetransfer.InboundTransferTask] Requesting segments [49, 55, 57, 56] of
cache dist from node NodeF-2452
2013-05-23 16:59:59,632 TRACE (transport-thread-2,NodeE)
[org.infinispan.statetransfer.InboundTransferTask] Successfully requested segments [49,
55, 57, 56] of cache dist from node NodeF-2452
2013-05-23 16:59:59,634 TRACE (remote-thread-1,NodeE)
[org.infinispan.statetransfer.InboundTransferTask] Finished receiving state for segments
[49, 55, 57, 56] of cache dist
2013-05-23 16:59:59,634 TRACE (remote-thread-1,NodeE)
[org.infinispan.statetransfer.StateConsumerImpl] Completion of inbound transfer task:
InboundTransferTask{segments=[49, 55, 57, 56], finishedSegments=[49, 55, 57, 56],
unfinishedSegments=[], source=NodeF-2452, isCancelled=false, isStartedSuccessfully=true,
isCompletedSuccessfully=true, topologyId=8, timeout=240000, cacheName=dist}
2013-05-23 16:59:59,634 DEBUG (remote-thread-1,NodeE)
[org.infinispan.statetransfer.StateConsumerImpl] Finished receiving of segments for cache
dist for topology 9.
2013-05-23 16:59:59,634 TRACE (remote-thread-1,NodeE)
[org.infinispan.statetransfer.StateConsumerImpl] Stop keeping track of changed keys for
state transfer
2013-05-23 16:59:59,652 DEBUG (transport-thread-4,NodeE)
[org.infinispan.statetransfer.StateConsumerImpl] Finished adding inbound state transfer
for segments [35, 33, 39, 37, 43, 41, 51, 50, 54, 53, 52] of cache dist
2013-05-23 16:59:59,652 TRACE (transport-thread-2,NodeE)
[org.infinispan.statetransfer.InboundTransferTask] Requesting segments [51, 50, 35, 33,
54, 39, 53, 52, 37, 43, 41] of cache dist from node NodeF-2452
2013-05-23 16:59:59,652 TRACE (transport-thread-4,NodeE)
[org.infinispan.statetransfer.StateTransferLockImpl] Signalling transaction data received
for topology 9
2013-05-23 16:59:59,661 TRACE (transport-thread-2,NodeE)
[org.infinispan.statetransfer.InboundTransferTask] Successfully requested segments [51,
50, 35, 33, 54, 39, 53, 52, 37, 43, 41] of cache dist from node NodeF-2452
2013-05-23 16:59:59,767 DEBUG (remote-thread-1,NodeE)
[org.infinispan.statetransfer.StateConsumerImpl] Applying new state for segment 43 of
cache dist from node NodeF-2452: received 1 cache entries
2013-05-23 16:59:59,767 TRACE (remote-thread-1,NodeE)
[org.infinispan.statetransfer.StateConsumerImpl] Received keys
[MagicKey#k2{cd331f73@NodeF-2452}] for segment 43 of cache dist from node NodeF-2452
2013-05-23 16:59:59,768 TRACE (remote-thread-1,NodeE)
[org.infinispan.interceptors.EntryWrappingInterceptor] State transfer will not write
key/value MagicKey#k2{cd331f73@NodeF-2452}/v2 because it was already updated by somebody
else
{noformat}
StateConsumerImpl can ignore state received during a rebalance
--------------------------------------------------------------
Key: ISPN-3120
URL: https://issues.jboss.org/browse/ISPN-3120
Project: Infinispan
Issue Type: Bug
Components: State transfer
Affects Versions: 5.3.0.Beta1
Reporter: Dan Berindei
Assignee: Dan Berindei
Priority: Critical
Labels: 5.2.x
Fix For: 5.2.7.Final, 5.3.0.CR1, 5.3.0.Final
This causes random failures in ConcurrentOverlappingLeaveTest and
ConcurrentNonOverlappingLeaveTest.
1. Starting with a 4-node cluster: [E, F, G, H] (topology 7).
2. F leaves, and E sends a REBALANCE_START command with nodes [E, G, H] (topology 8).
Some segments are owned by [H] in the current CH and by [H, G] in the pending CH.
3. E reports that it finished receiving state with a REBALANCE_CONFIRM command.
4. H leaves, and E sends a CH_UPDATE command with nodes [E, G] (topology 9).
The segments that were owned by [H] in the previous currentCH are assigned to [E, G] in
the new currentCH (otherwise they wouldn't have any owners).
5. The StateConsumerImpl on E requests state for the "lost" segments from G.
6. G confirms the end of the rebalance as well, and E sends a CH_UPDATE command to end
the rebalance (topology 10).
7. E sends a REBALANCE_START command to assign all segments for [E, G] (topology 11).
8. While the StateConsumerImpl on E is starting the state transfer, it also receives a
StateResponseCommand for the lost segments from G.
9. Because the structures keeping track of the received state are not properly
initialized, E considers it finished receiving state for topology 11.
10. E receives a StateResponseCommand from G with actual data, but it ignores it because
{{StateConsumerImpl.updatedKeys == null}}.
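The last step, where real state arrives after key tracking has already been torn down, can
be sketched with a minimal model. Names are hypothetical, loosely modelled on the
updatedKeys check reported in the logs (the real logic spans StateConsumerImpl and
EntryWrappingInterceptor); this is an illustration, not the actual code:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class UpdatedKeysModel {
    // Keys updated by user transactions during state transfer; null once the
    // node believes state transfer is over.
    Set<String> updatedKeys = new HashSet<>();
    final Map<String, String> dataContainer = new HashMap<>();

    // "Stop keeping track of changed keys for state transfer"
    void stopStateTransfer() {
        updatedKeys = null;
    }

    // Incoming state is written only if the key was not already updated.
    // With updatedKeys == null the write is skipped as if the key had been
    // "already updated by somebody else", and the state is silently dropped.
    boolean applyState(String key, String value) {
        if (updatedKeys == null || updatedKeys.contains(key))
            return false;
        dataContainer.put(key, value);
        return true;
    }

    public static void main(String[] args) {
        UpdatedKeysModel e = new UpdatedKeysModel();
        e.stopStateTransfer();                        // premature "finished"
        boolean written = e.applyState("k3", "v3");   // late StateResponseCommand
        // prints: written=false, value=null -- matching the failed
        // dc.get(MagicKey#k3{...}) assertion in the test.
        System.out.println("written=" + written
              + ", value=" + e.dataContainer.get("k3"));
    }
}
```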
{noformat}
11:30:39,807 DEBUG (transport-thread-4,NodeE:dist) [LocalTopologyManagerImpl] Updating
local consistent hash(es) for cache dist: new topology = CacheTopology{id=7,
currentCH=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeE-51027,
NodeG-6339, NodeH-47370]}, pendingCH=null}
11:30:39,810 DEBUG (transport-thread-3,NodeE:dist) [LocalTopologyManagerImpl] Starting
local rebalance for cache dist, topology = CacheTopology{id=8,
currentCH=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeE-51027,
NodeG-6339, NodeH-47370]}, pendingCH=DefaultConsistentHash{numSegments=60, numOwners=2,
members=[NodeE-51027, NodeG-6339, NodeH-47370]}}
11:30:39,817 DEBUG (transport-thread-3,NodeE:dist) [StateConsumerImpl] Finished receiving
of segments for cache dist for topology 8.
11:30:39,832 DEBUG (transport-thread-4,NodeE:dist) [LocalTopologyManagerImpl] Updating
local consistent hash(es) for cache dist: new topology = CacheTopology{id=9,
currentCH=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeE-51027,
NodeG-6339]}, pendingCH=DefaultConsistentHash{numSegments=60, numOwners=2,
members=[NodeE-51027, NodeG-6339]}}
11:30:39,834 DEBUG (transport-thread-4,NodeE:dist) [StateConsumerImpl] Adding inbound
state transfer for segments [38, 36, 47, 44, 45] of cache dist
11:30:39,853 DEBUG (transport-thread-3,NodeE:dist) [LocalTopologyManagerImpl] Starting
local rebalance for cache dist, topology = CacheTopology{id=11,
currentCH=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeE-51027,
NodeG-6339]}, pendingCH=DefaultConsistentHash{numSegments=60, numOwners=2,
members=[NodeE-51027, NodeG-6339]}}
11:30:39,859 TRACE (remote-thread-1,NodeE:) [InboundInvocationHandlerImpl] Calling
perform() on StateResponseCommand{cache=dist, origin=NodeG-6339, topologyId=9}
11:30:39,864 DEBUG (remote-thread-1,NodeE:dist) [StateConsumerImpl] Finished receiving of
segments for cache dist for topology 11.
11:30:39,866 TRACE (transport-thread-5,NodeE:dist) [LocalTopologyManagerImpl] Ignoring
consistent hash update 10 for cache dist, we have already received a newer topology 11
11:30:39,868 TRACE (remote-thread-1,NodeE:) [InboundInvocationHandlerImpl] Calling
perform() on StateResponseCommand{cache=dist, origin=NodeG-6339, topologyId=11}
11:30:39,872 TRACE (remote-thread-1,NodeE:dist dist) [EntryWrappingInterceptor] State
transfer will not write key/value MagicKey#k3{672f69c9@NodeG-6339}/v3 because it was
already updated by somebody else
11:30:40,582 ERROR (testng-ConcurrentNonOverlappingLeaveTest:) [UnitTestTestNGListener]
Test
testTransactional(org.infinispan.distribution.rehash.ConcurrentNonOverlappingLeaveTest)
failed.
java.lang.AssertionError: Fail on owner cache NodeE-51027:
dc.get(MagicKey#k3{672f69c9@NodeG-6339}) returned null!
{noformat}