[infinispan-issues] [JBoss JIRA] (ISPN-11609) StateTransferOverwritingValueTest.testBackupOwnerJoiningDuringPut[SCATTERED_SYNC] random failures
Dan Berindei (Jira)
issues at jboss.org
Fri Apr 10 03:34:00 EDT 2020
[ https://issues.redhat.com/browse/ISPN-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030276#comment-14030276 ]
Dan Berindei commented on ISPN-11609:
-------------------------------------
The test blocks the execution of the {{PutKeyValueCommand}} or {{RemoveCommand}} replicating the value to the joiner node, but sometimes this command reaches the joiner before the it finished requesting segments and signalling "transaction data received". That means when {{BlockingInterceptor}} waits on the {{CyclicBarrier}}, it prevents state transfer from finishing and sending the {{RebalancePhaseConfirmCommand}}.
{noformat}
00:08:24,392 TRACE (jgroups-4,Test-NodeB:[]) [JGroupsTransport] Test-NodeB received request 3 from Test-NodeA: SingleRpcCommand{cacheName='defaultcache', command=PutKeyValueCommand{key=key, value=v1, flags=[SKIP_OWNERSHIP_CHECK], commandInvocationId=CommandInvocation:Test-NodeA:2226, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{version=SimpleClusteredVersion{topologyId=2, version=2}, lifespan=-1, maxIdle=-1}, successful=true, topologyId=2}}
00:08:24,395 TRACE (jgroups-5,Test-NodeB:[]) [StateConsumerImpl] Topology update processed, stateTransferTopologyId = 2, startRebalance = true, pending CH = ScatteredConsistentHash{ns=256, rebalanced=true, owners = (2)[Test-NodeA: 128, Test-NodeB: 128]}
00:08:24,395 TRACE (jgroups-5,Test-NodeB:[]) [StateConsumerImpl] Unlock State Transfer in Progress for topology ID 2
00:08:24,395 TRACE (jgroups-5,Test-NodeB:[]) [StateTransferLockImpl] Signalling transaction data received for topology 2
{noformat}
The {{PutKeyValueCommand}} invocation resumes on the topology update thread and blocks until the test fails:
{noformat}
00:08:24,396 TRACE (jgroups-5,Test-NodeB:[]) [TrianglePerCacheInboundInvocationHandler] Calling perform() on SingleRpcCommand{cacheName='defaultcache', command=PutKeyValueCommand{key=key, value=v1, flags=[SKIP_OWNERSHIP_CHECK], commandInvocationId=CommandInvocation:Test-NodeA:2226, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{version=SimpleClusteredVersion{topologyId=2, version=2}, lifespan=-1, maxIdle=-1}, successful=true, topologyId=2}}
00:08:24,397 TRACE (jgroups-5,Test-NodeB:[]) [BlockingInterceptor] Command blocking after completion of PutKeyValueCommand{key=key, value=v1, flags=[SKIP_OWNERSHIP_CHECK], commandInvocationId=CommandInvocation:Test-NodeA:2226, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{version=SimpleClusteredVersion{topologyId=2, version=2}, lifespan=-1, maxIdle=-1}, successful=true, topologyId=2}
00:08:34,418 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.distribution.rehash.StateTransferOverwritingValueTest.testBackupOwnerJoiningDuringReplace[SCATTERED_SYNC, tx=false]
java.util.concurrent.TimeoutException: Timed out waiting for event pre_rebalance_confirmation_2_from_Test-NodeB
at org.infinispan.test.fwk.CheckPoint.awaitStrict(CheckPoint.java:50) ~[test-classes/:?]
at org.infinispan.test.fwk.CheckPoint.awaitStrict(CheckPoint.java:40) ~[test-classes/:?]
at org.infinispan.distribution.rehash.StateTransferOverwritingValueTest.doTest(StateTransferOverwritingValueTest.java:201) ~[test-classes/:?]
at org.infinispan.distribution.rehash.StateTransferOverwritingValueTest.testBackupOwnerJoiningDuringReplace(StateTransferOverwritingValueTest.java:108) ~[test-classes/:?]
{noformat}
After the {{PutKeyValueCommand}} invocation finishes, {{ScatteredStateConsumerImpl}} can confirm the state transfer:
{noformat}
00:08:34,438 DEBUG (jgroups-5,Test-NodeB:[]) [CLUSTER] Shutdown while handling command SingleRpcCommand{cacheName='defaultcache', command=PutKeyValueCommand{key=key, value=v1, flags=[SKIP_OWNERSHIP_CHECK], commandInvocationId=CommandInvocation:Test-NodeA:2226, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{version=SimpleClusteredVersion{topologyId=2, version=2}, lifespan=-1, maxIdle=-1}, successful=true, topologyId=2}}
00:08:34,438 DEBUG (jgroups-5,Test-NodeB:[]) [StateConsumerImpl] Finished receiving of segments for cache defaultcache for topology 2.
00:08:34,439 TRACE (jgroups-5,Test-NodeB:[]) [JGroupsTransport] Test-NodeB sending command to Test-NodeA: ConfirmRebalancePhaseCommand{cacheName='defaultcache', origin=Test-NodeB, throwable=null, topologyId=2, viewId=1}
{noformat}
> StateTransferOverwritingValueTest.testBackupOwnerJoiningDuringPut[SCATTERED_SYNC] random failures
> -------------------------------------------------------------------------------------------------
>
> Key: ISPN-11609
> URL: https://issues.redhat.com/browse/ISPN-11609
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite
> Affects Versions: 11.0.0.Dev03
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 11.0.0.Final
>
>
> {noformat}
> java.util.concurrent.TimeoutException: Timed out waiting for event pre_rebalance_confirmation_2_from_StateTransferOverwritingValueTest-NodeB
> at org.infinispan.test.fwk.CheckPoint.awaitStrict(CheckPoint.java:50)
> at org.infinispan.test.fwk.CheckPoint.awaitStrict(CheckPoint.java:40)
> at org.infinispan.distribution.rehash.StateTransferOverwritingValueTest.doTest(StateTransferOverwritingValueTest.java:201)
> at org.infinispan.distribution.rehash.StateTransferOverwritingValueTest.testBackupOwnerJoiningDuringPut(StateTransferOverwritingValueTest.java:96)
> {noformat}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
More information about the infinispan-issues
mailing list