[
https://issues.redhat.com/browse/ISPN-11609?page=com.atlassian.jira.plugi...
]
Dan Berindei commented on ISPN-11609:
-------------------------------------
The test blocks the execution of the {{PutKeyValueCommand}} or {{RemoveCommand}}
replicating the value to the joiner node, but sometimes this command reaches the joiner
before the it finished requesting segments and signalling "transaction data
received". That means when {{BlockingInterceptor}} waits on the {{CyclicBarrier}}, it
prevents state transfer from finishing and sending the {{RebalancePhaseConfirmCommand}}.
{noformat}
00:08:24,392 TRACE (jgroups-4,Test-NodeB:[]) [JGroupsTransport] Test-NodeB received
request 3 from Test-NodeA: SingleRpcCommand{cacheName='defaultcache',
command=PutKeyValueCommand{key=key, value=v1, flags=[SKIP_OWNERSHIP_CHECK],
commandInvocationId=CommandInvocation:Test-NodeA:2226, putIfAbsent=false,
valueMatcher=MATCH_ALWAYS,
metadata=EmbeddedExpirableMetadata{version=SimpleClusteredVersion{topologyId=2,
version=2}, lifespan=-1, maxIdle=-1}, successful=true, topologyId=2}}
00:08:24,395 TRACE (jgroups-5,Test-NodeB:[]) [StateConsumerImpl] Topology update
processed, stateTransferTopologyId = 2, startRebalance = true, pending CH =
ScatteredConsistentHash{ns=256, rebalanced=true, owners = (2)[Test-NodeA: 128, Test-NodeB:
128]}
00:08:24,395 TRACE (jgroups-5,Test-NodeB:[]) [StateConsumerImpl] Unlock State Transfer in
Progress for topology ID 2
00:08:24,395 TRACE (jgroups-5,Test-NodeB:[]) [StateTransferLockImpl] Signalling
transaction data received for topology 2
{noformat}
The {{PutKeyValueCommand}} invocation resumes on the topology update thread and blocks
until the test fails:
{noformat}
00:08:24,396 TRACE (jgroups-5,Test-NodeB:[]) [TrianglePerCacheInboundInvocationHandler]
Calling perform() on SingleRpcCommand{cacheName='defaultcache',
command=PutKeyValueCommand{key=key, value=v1, flags=[SKIP_OWNERSHIP_CHECK],
commandInvocationId=CommandInvocation:Test-NodeA:2226, putIfAbsent=false,
valueMatcher=MATCH_ALWAYS,
metadata=EmbeddedExpirableMetadata{version=SimpleClusteredVersion{topologyId=2,
version=2}, lifespan=-1, maxIdle=-1}, successful=true, topologyId=2}}
00:08:24,397 TRACE (jgroups-5,Test-NodeB:[]) [BlockingInterceptor] Command blocking after
completion of PutKeyValueCommand{key=key, value=v1, flags=[SKIP_OWNERSHIP_CHECK],
commandInvocationId=CommandInvocation:Test-NodeA:2226, putIfAbsent=false,
valueMatcher=MATCH_ALWAYS,
metadata=EmbeddedExpirableMetadata{version=SimpleClusteredVersion{topologyId=2,
version=2}, lifespan=-1, maxIdle=-1}, successful=true, topologyId=2}
00:08:34,418 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed:
org.infinispan.distribution.rehash.StateTransferOverwritingValueTest.testBackupOwnerJoiningDuringReplace[SCATTERED_SYNC,
tx=false]
java.util.concurrent.TimeoutException: Timed out waiting for event
pre_rebalance_confirmation_2_from_Test-NodeB
at org.infinispan.test.fwk.CheckPoint.awaitStrict(CheckPoint.java:50) ~[test-classes/:?]
at org.infinispan.test.fwk.CheckPoint.awaitStrict(CheckPoint.java:40) ~[test-classes/:?]
at
org.infinispan.distribution.rehash.StateTransferOverwritingValueTest.doTest(StateTransferOverwritingValueTest.java:201)
~[test-classes/:?]
at
org.infinispan.distribution.rehash.StateTransferOverwritingValueTest.testBackupOwnerJoiningDuringReplace(StateTransferOverwritingValueTest.java:108)
~[test-classes/:?]
{noformat}
After the {{PutKeyValueCommand}} invocation finishes, {{ScatteredStateConsumerImpl}} can
confirm the state transfer:
{noformat}
00:08:34,438 DEBUG (jgroups-5,Test-NodeB:[]) [CLUSTER] Shutdown while handling command
SingleRpcCommand{cacheName='defaultcache', command=PutKeyValueCommand{key=key,
value=v1, flags=[SKIP_OWNERSHIP_CHECK],
commandInvocationId=CommandInvocation:Test-NodeA:2226, putIfAbsent=false,
valueMatcher=MATCH_ALWAYS,
metadata=EmbeddedExpirableMetadata{version=SimpleClusteredVersion{topologyId=2,
version=2}, lifespan=-1, maxIdle=-1}, successful=true, topologyId=2}}
00:08:34,438 DEBUG (jgroups-5,Test-NodeB:[]) [StateConsumerImpl] Finished receiving of
segments for cache defaultcache for topology 2.
00:08:34,439 TRACE (jgroups-5,Test-NodeB:[]) [JGroupsTransport] Test-NodeB sending command
to Test-NodeA: ConfirmRebalancePhaseCommand{cacheName='defaultcache',
origin=Test-NodeB, throwable=null, topologyId=2, viewId=1}
{noformat}
StateTransferOverwritingValueTest.testBackupOwnerJoiningDuringPut[SCATTERED_SYNC] random
failures
-------------------------------------------------------------------------------------------------
Key: ISPN-11609
URL:
https://issues.redhat.com/browse/ISPN-11609
Project: Infinispan
Issue Type: Bug
Components: Core, Test Suite
Affects Versions: 11.0.0.Dev03
Reporter: Dan Berindei
Assignee: Dan Berindei
Priority: Major
Fix For: 11.0.0.Final
{noformat}
java.util.concurrent.TimeoutException: Timed out waiting for event
pre_rebalance_confirmation_2_from_StateTransferOverwritingValueTest-NodeB
at org.infinispan.test.fwk.CheckPoint.awaitStrict(CheckPoint.java:50)
at org.infinispan.test.fwk.CheckPoint.awaitStrict(CheckPoint.java:40)
at
org.infinispan.distribution.rehash.StateTransferOverwritingValueTest.doTest(StateTransferOverwritingValueTest.java:201)
at
org.infinispan.distribution.rehash.StateTransferOverwritingValueTest.testBackupOwnerJoiningDuringPut(StateTransferOverwritingValueTest.java:96)
{noformat}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)