[JBoss JIRA] (ISPN-9257) ClustertopologyManagerTest.testAbruptLeaveAfterGetStatus2[SCATTERED_SYNC, tx=false] random failures
by Will Burns (Jira)
[ https://issues.jboss.org/browse/ISPN-9257?page=com.atlassian.jira.plugin.... ]
Will Burns updated ISPN-9257:
-----------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> ClustertopologyManagerTest.testAbruptLeaveAfterGetStatus2[SCATTERED_SYNC, tx=false] random failures
> ---------------------------------------------------------------------------------------------------
>
> Key: ISPN-9257
> URL: https://issues.jboss.org/browse/ISPN-9257
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Minor
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4, 9.4.16.Final
>
> Attachments: ISPN-8731_wrong_topology_2018-05-18_ClusterTopologyManagerTest-infinispan-core.log.gz
>
>
> The test kills the coordinator NodeA, then while NodeB is trying to recover the caches it also kills NodeC. It expects NodeB to start a rebalance with 2 nodes and discards it, in order to test that it can process the 1-node rebalance first:
> {noformat}
> 00:34:06,582 DEBUG (transport-thread-test-NodeB-p12-t6:[testCache]) [ClusterTopologyManagerTest] Discarding rebalance command CacheTopology{id=8, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (2)[test-NodeB-49590: 85, test-NodeC-58596: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=true, owners = (2)[test-NodeB-49590: 128, test-NodeC-58596: 128]}, unionCH=null, actualMembers=[test-NodeB-49590, test-NodeC-58596], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888, d47dc4a9-2a95-4bb1-a83b-bb8a27c9999f]}
> 00:34:06,609 DEBUG (transport-thread-test-NodeB-p12-t2:[Topology-testCache]) [LocalTopologyManagerImpl] Updating local topology for cache testCache: CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 128]}, unionCH=null, actualMembers=[test-NodeB-49590], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888]}
> 00:34:06,609 DEBUG (transport-thread-test-NodeB-p12-t2:[Topology-testCache]) [LocalTopologyManagerImpl] Installing fake cache topology CacheTopology{id=8, phase=NO_REBALANCE, rebalanceId=4, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 85]}, pendingCH=null, unionCH=null, actualMembers=[test-NodeB-49590], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888]} for cache testCache
> {noformat}
> Unfortunately {{PreferAvailabilityStrategy}} has changed a bit and the rebalance ids don't always match the expectations of the test, so that the 1-node rebalance is discarded instead:
> {noformat}
> 09:46:10,530 DEBUG (transport-thread-Test-NodeB-p54539-t3:[testCache]) [Test] Discarding rebalance command CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[Test-NodeB-62039: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=true, owners = (1)[Test-NodeB-62039: 256]}, unionCH=null, actualMembers=[Test-NodeB-62039], persistentUUIDs=[0ed7be74-4485-489b-baee-28c461c9e5de]}
> {noformat}
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
5 years, 6 months
[JBoss JIRA] (IPROTO-99) Ability to define an empty proto message via Pojo annotations
by Nistor Adrian (Jira)
[ https://issues.jboss.org/browse/IPROTO-99?page=com.atlassian.jira.plugin.... ]
Nistor Adrian updated IPROTO-99:
--------------------------------
Fix Version/s: 4.3.0.Alpha9
> Ability to define an empty proto message via Pojo annotations
> -------------------------------------------------------------
>
> Key: IPROTO-99
> URL: https://issues.jboss.org/browse/IPROTO-99
> Project: Infinispan ProtoStream
> Issue Type: Feature Request
> Affects Versions: 4.3.0.Alpha7
> Reporter: Ryan Emerson
> Assignee: Nistor Adrian
> Priority: Major
> Fix For: 4.3.0.Alpha9
>
>
> In some cases it's useful for a empty protobuf message to be sent, for example formalising an implementation type.
> Currently when trying to define an empty proto message by annotating a class with just {{ProtoName}} annotation, the following error is thrown at compile time:
> {code:java}
> Error:(71, 8) java: Class org.infinispan.notifications.cachelistener.cluster.AbstractClusterListenerUtilTest.StringAppender does not have any @ProtoField annotated fields. The class should be either annotated or it should have a custom marshaller.
> {code}
> It's possible to workaround this by simply creating a {{@Protofield}} annotation on a class variable which is never initialised.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
5 years, 6 months
[JBoss JIRA] (ISPN-10367) Possible loss of (pessimistic) lock if a transaction will reach timeout and/or is removed
by Wolf-Dieter Fink (Jira)
[ https://issues.jboss.org/browse/ISPN-10367?page=com.atlassian.jira.plugin... ]
Wolf-Dieter Fink updated ISPN-10367:
------------------------------------
Security: (was: Red Hat Internal)
> Possible loss of (pessimistic) lock if a transaction will reach timeout and/or is removed
> -----------------------------------------------------------------------------------------
>
> Key: ISPN-10367
> URL: https://issues.jboss.org/browse/ISPN-10367
> Project: Infinispan
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 10.0.0.Beta3, 9.4.15.Final
> Reporter: Wolf-Dieter Fink
> Assignee: Pedro Ruivo
> Priority: Critical
>
> If entries are locked, no matter whether it was done by FORCE_WRITE_LOCK flag or getAdvancedCache().lock(key), and the lock is hold longer than the current Tx timeout setting (.completedTxTimeout(...) ) the transacaction might be removed
> - if the node is blocked and expelled from the cluster (and join back later)
> - the thread processing the lock will take longer than the Tx-timeout setting
> This force to remove the Tx and free the lock.
> An indicator is the Exception below which will be shown if the Tx is timing out, it is not a (remote) access timout.
> If the originator is back after this this (ongoing) Tx is assumed new and it will continue by accident without lock.
> This can cause unexpected inconsistency!
> ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (EJB timer - 13) ISPN000136: Error executing command LockControlCommand, writing keys []: org.infinispan.util.concurrent.TimeoutException: Replication timeout for lt-33828
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:803)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$1(JGroupsTransport.java:641)
> at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:47)
> at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:16)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> ERROR [org.jboss.as.ejb3.timer] (EJB timer - 13) WFLYEJB0020: Error invoking timeout for timer: [id=8a53d2c3-190d-4c74-9327-8e7554e1df2c timedObjectId=embeddedStressTest-ejb.embeddedStressTest-ejb.CacheAccessSingletonBean auto-timer?:false persistent?:true timerService=org.jboss.as.ejb3.timerservice.TimerServiceImpl@72c41b07 initialExpiration=Fri Jun 28 10:56:16 CEST 2019 intervalDuration(in milli sec)=1 nextExpiration=Fri Jun 28 10:56:43 CEST 2019 timerState=IN_TIMEOUT info=org.infinispan.wfink.stress.TimerInfo@47ae2053]: javax.ejb.EJBException: org.infinispan.util.concurrent.TimeoutException: Replication timeout for lt-33828
> at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInOurTx(CMTTxInterceptor.java:246)
> at org.jboss.as.ejb3.tx.CMTTxInterceptor.requiresNew(CMTTxInterceptor.java:388)
> at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:146)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
> at org.jboss.weld.module.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:81)
> at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:89)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
> at org.jboss.as.ejb3.component.singleton.ContainerManagedConcurrencyInterceptor.processInvocation(ContainerManagedConcurrencyInterceptor.java:106)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.invocation.ContextClassLoaderInterceptor.processInvocation(ContextClassLoaderInterceptor.java:60)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:438)
> at org.wildfly.security.manager.WildFlySecurityManager.doChecked(WildFlySecurityManager.java:619)
> at org.jboss.invocation.AccessCheckingInterceptor.processInvocation(AccessCheckingInterceptor.java:57)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
> at org.jboss.as.ejb3.timerservice.TimedObjectInvokerImpl.callTimeout(TimedObjectInvokerImpl.java:99)
> at org.jboss.as.ejb3.timerservice.TimedObjectInvokerImpl.callTimeout(TimedObjectInvokerImpl.java:109)
> at org.jboss.as.ejb3.timerservice.TimerTask.invokeBeanMethod(TimerTask.java:189)
> at org.jboss.as.ejb3.timerservice.TimerTask.callTimeout(TimerTask.java:185)
> at org.jboss.as.ejb3.timerservice.TimerTask.run(TimerTask.java:159)
> at org.jboss.as.ejb3.timerservice.TimerServiceImpl$Task$1.run(TimerServiceImpl.java:1304)
> at org.wildfly.extension.requestcontroller.RequestController$QueuedTask$1.run(RequestController.java:494)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> at org.jboss.threads.JBossThread.run(JBossThread.java:485)
> Caused by: org.infinispan.util.concurrent.TimeoutException: Replication timeout for lt-33828
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:803)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$1(JGroupsTransport.java:641)
> at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:47)
> at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:16)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
5 years, 6 months
[JBoss JIRA] (ISPN-10367) Possible loss of (pessimistic) lock if a transaction will reach timeout and/or is removed
by Wolf-Dieter Fink (Jira)
[ https://issues.jboss.org/browse/ISPN-10367?page=com.atlassian.jira.plugin... ]
Wolf-Dieter Fink updated ISPN-10367:
------------------------------------
Status: Open (was: New)
> Possible loss of (pessimistic) lock if a transaction will reach timeout and/or is removed
> -----------------------------------------------------------------------------------------
>
> Key: ISPN-10367
> URL: https://issues.jboss.org/browse/ISPN-10367
> Project: Infinispan
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 10.0.0.Beta3, 9.4.15.Final
> Reporter: Wolf-Dieter Fink
> Assignee: Pedro Ruivo
> Priority: Critical
>
> If entries are locked, no matter whether it was done by FORCE_WRITE_LOCK flag or getAdvancedCache().lock(key), and the lock is hold longer than the current Tx timeout setting (.completedTxTimeout(...) ) the transacaction might be removed
> - if the node is blocked and expelled from the cluster (and join back later)
> - the thread processing the lock will take longer than the Tx-timeout setting
> This force to remove the Tx and free the lock.
> An indicator is the Exception below which will be shown if the Tx is timing out, it is not a (remote) access timout.
> If the originator is back after this this (ongoing) Tx is assumed new and it will continue by accident without lock.
> This can cause unexpected inconsistency!
> ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (EJB timer - 13) ISPN000136: Error executing command LockControlCommand, writing keys []: org.infinispan.util.concurrent.TimeoutException: Replication timeout for lt-33828
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:803)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$1(JGroupsTransport.java:641)
> at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:47)
> at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:16)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> ERROR [org.jboss.as.ejb3.timer] (EJB timer - 13) WFLYEJB0020: Error invoking timeout for timer: [id=8a53d2c3-190d-4c74-9327-8e7554e1df2c timedObjectId=embeddedStressTest-ejb.embeddedStressTest-ejb.CacheAccessSingletonBean auto-timer?:false persistent?:true timerService=org.jboss.as.ejb3.timerservice.TimerServiceImpl@72c41b07 initialExpiration=Fri Jun 28 10:56:16 CEST 2019 intervalDuration(in milli sec)=1 nextExpiration=Fri Jun 28 10:56:43 CEST 2019 timerState=IN_TIMEOUT info=org.infinispan.wfink.stress.TimerInfo@47ae2053]: javax.ejb.EJBException: org.infinispan.util.concurrent.TimeoutException: Replication timeout for lt-33828
> at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInOurTx(CMTTxInterceptor.java:246)
> at org.jboss.as.ejb3.tx.CMTTxInterceptor.requiresNew(CMTTxInterceptor.java:388)
> at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:146)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
> at org.jboss.weld.module.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:81)
> at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:89)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
> at org.jboss.as.ejb3.component.singleton.ContainerManagedConcurrencyInterceptor.processInvocation(ContainerManagedConcurrencyInterceptor.java:106)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.invocation.ContextClassLoaderInterceptor.processInvocation(ContextClassLoaderInterceptor.java:60)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:438)
> at org.wildfly.security.manager.WildFlySecurityManager.doChecked(WildFlySecurityManager.java:619)
> at org.jboss.invocation.AccessCheckingInterceptor.processInvocation(AccessCheckingInterceptor.java:57)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
> at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
> at org.jboss.as.ejb3.timerservice.TimedObjectInvokerImpl.callTimeout(TimedObjectInvokerImpl.java:99)
> at org.jboss.as.ejb3.timerservice.TimedObjectInvokerImpl.callTimeout(TimedObjectInvokerImpl.java:109)
> at org.jboss.as.ejb3.timerservice.TimerTask.invokeBeanMethod(TimerTask.java:189)
> at org.jboss.as.ejb3.timerservice.TimerTask.callTimeout(TimerTask.java:185)
> at org.jboss.as.ejb3.timerservice.TimerTask.run(TimerTask.java:159)
> at org.jboss.as.ejb3.timerservice.TimerServiceImpl$Task$1.run(TimerServiceImpl.java:1304)
> at org.wildfly.extension.requestcontroller.RequestController$QueuedTask$1.run(RequestController.java:494)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> at org.jboss.threads.JBossThread.run(JBossThread.java:485)
> Caused by: org.infinispan.util.concurrent.TimeoutException: Replication timeout for lt-33828
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:803)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$1(JGroupsTransport.java:641)
> at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:47)
> at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:16)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
5 years, 6 months
[JBoss JIRA] (ISPN-10366) ScatteredStateConsumerImpl sets segment state to OWNED before applying values
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-10366?page=com.atlassian.jira.plugin... ]
Dan Berindei updated ISPN-10366:
--------------------------------
Description:
{{ScatteredStateConsumerImpl}} uses {{InboundTransferTask}} only to request keys, then after it received all the keys of a segment it changes the segment state to {{VALUE_TRANSFER}} and starts an asynchronous request to fetch the values and replace the {{RemoteMetadata}} entries with real entries.
{{ScatteredStateConsumerImpl.chunkCounter}} is supposed to delay the state transfer end and the segment state change to {{OWNED}}, but in rare occasions this doesn't happen.
This happened in {{PrefetchTest.testPrefetch12}} while running the test suite with {{taskset -c 1-2}}:
{noformat}
21:54:43,304 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Received new topology for cache ___defaultcache, isRebalance = true, isMember = true, topology = CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 1, Test-NodeC-3746: 0]}, pendingCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, unionCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, actualMembers=[Test-NodeA-39104, Test-NodeC-3746], persistentUUIDs=[f58e0a9a-dd4e-429a-8464-da64bf001d4e, 1471096f-c59a-4dc9-8f4d-31fbf399a2aa]}
21:54:43,305 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[StateRequest-___defaultcache]) [ScatteredStateConsumerImpl] Requesting keys for segments {0} from Test-NodeA-39104
21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Topology update processed, stateTransferTopologyId = 9, startRebalance = true, pending CH = PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}
21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateTransferLockImpl] Signalling transaction data received for topology 9
21:54:43,313 TRACE (remote-thread-Test-NodeC-p69905-t2:[]) [TrianglePerCacheInboundInvocationHandler] Calling perform() on StateResponseCommand{cache=___defaultcache, pushTransfer=false, stateChunks=[StateChunk{segmentId=0, cacheEntries=1, isLastChunk=true}], origin=Test-NodeA-39104, topologyId=9, applyState=true}
21:54:43,313 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Applying new state chunk for segment 0 of cache ___defaultcache from node Test-NodeA-39104: received 1 cache entries
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Finished transfer for segment 0 = KEY_TRANSFER -> VALUE_TRANSFER
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746, segment 0 has all keys in, expects value transfer
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Requesting values from segments {0}, for in-memory keys
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Retrieving values, chunk counter is 1
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [JGroupsTransport] Test-NodeC-3746 sending request 11 to Test-NodeA-39104: ClusteredGetAllCommand{keys=[key], flags=[SKIP_OWNERSHIP_CHECK], topologyId=9}
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Invalidating versions on Test-NodeC-3746, chunk counter incremented to 2
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Versions invalidated on Test-NodeC-3746, chunk counter decremented to 1
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Removing inbound transfers from node {0} for segments Test-NodeA-39104
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Inbound transfer removed, chunk counter is 1
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Latch 0
21:54:43,315 TRACE (jgroups-7,Test-NodeC-3746:[]) [JGroupsTransport] Test-NodeC-3746 received response for request 11 from Test-NodeA-39104: SuccessfulResponse([MetadataImmortalCacheValue {value=v0, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}}])
21:54:43,316 TRACE (jgroups-7,Test-NodeC-3746:[]) [BlockingInterceptor] Command blocking before completion of PutKeyValueCommand{key=key, value=v0, flags=[CACHE_MODE_LOCAL, SKIP_REMOTE_LOOKUP, PUT_FOR_STATE_TRANSFER, SKIP_SHARED_CACHE_STORE, SKIP_OWNERSHIP_CHECK, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], commandInvocationId=CommandInvocation:Test-NodeC-3746:121294, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=InternalMetadataImpl{actual=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}, created=-1, lastUsed=-1}, successful=true, topologyId=-1}
21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] After applying the received state the data container of cache ___defaultcache has 1 keys
21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] Segments not received yet for cache ___defaultcache: {}
21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Finished receiving of segments for cache ___defaultcache for topology 9.
21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746 received values for all segments in topology 9
21:54:43,316 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Stop keeping track of changed keys for state transfer in topology 9
{noformat}
The test then starts a put operation and expects it to prefetch the previous value, but because the segment is {{OWNED}}, the {{RemoteMetadata}} is ignored:
{noformat}
21:54:43,316 TRACE (ForkThread-1,Test:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=key, value=v1, flags=[], commandInvocationId=CommandInvocation:Test-NodeC-3746:121295, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true, topologyId=-1} and InvocationContext [SingleKeyNonTxInvocationContext{isLocked=false, key=null, cacheEntry=null, origin=null, lockOwner=CommandInvocation:Test-NodeC-3746:121295}]
21:54:43,316 TRACE (ForkThread-1,Test:[]) [EntryFactoryImpl] Retrieved from container MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
21:54:43,316 TRACE (ForkThread-1,Test:[]) [ScatteredDistributionInterceptor] Committing entry RepeatableReadEntry(108d175b){key=key, value=v1, isCreated=false, isChanged=true, isRemoved=false, isExpired=false, skipLookup=true, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=9, version=1}}}, replaced MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
21:54:53,316 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12
org.infinispan.test.TestException: java.util.concurrent.TimeoutException
at org.infinispan.util.ControlledRpcManager.uncheckedGet(ControlledRpcManager.java:259) ~[test-classes/:?]
at org.infinispan.util.ControlledRpcManager.expectCommand(ControlledRpcManager.java:124) ~[test-classes/:?]
at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch(PrefetchTest.java:110) ~[test-classes/:?]
at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12(PrefetchTest.java:67) ~[test-classes/:?]
{noformat}
On a related note, {{StateConsumerImpl.applyState(pushTransfer=true)}} initializes a {{CountDownLatch(stateChunks.size())}}, but doesn't actually count down if {{stateChunk.getCacheEntries() == null}}, potentially hanging state transfer until it times out.
was:
{{ScatteredStateConsumerImpl}} uses {{InboundTransferTask}}s only to request keys, then after it received all the keys of a segment it changes the segment state to {{VALUE_TRANSFER}} and starts an asynchronous request to fetch the values and replace the {{RemoteMetadata}} entries with real entries.
{{ScatteredStateConsumerImpl.chunkCounter}} is supposed to delay the state transfer end and the segment state change to {{OWNED}}, but in rare occasions this doesn't happen.
This happened in {{PrefetchTest.testPrefetch12}} while running the test suite with {{taskset -c 1-2}}:
{noformat}
21:54:43,304 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Received new topology for cache ___defaultcache, isRebalance = true, isMember = true, topology = CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 1, Test-NodeC-3746: 0]}, pendingCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, unionCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, actualMembers=[Test-NodeA-39104, Test-NodeC-3746], persistentUUIDs=[f58e0a9a-dd4e-429a-8464-da64bf001d4e, 1471096f-c59a-4dc9-8f4d-31fbf399a2aa]}
21:54:43,305 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[StateRequest-___defaultcache]) [ScatteredStateConsumerImpl] Requesting keys for segments {0} from Test-NodeA-39104
21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Topology update processed, stateTransferTopologyId = 9, startRebalance = true, pending CH = PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}
21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateTransferLockImpl] Signalling transaction data received for topology 9
21:54:43,313 TRACE (remote-thread-Test-NodeC-p69905-t2:[]) [TrianglePerCacheInboundInvocationHandler] Calling perform() on StateResponseCommand{cache=___defaultcache, pushTransfer=false, stateChunks=[StateChunk{segmentId=0, cacheEntries=1, isLastChunk=true}], origin=Test-NodeA-39104, topologyId=9, applyState=true}
21:54:43,313 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Applying new state chunk for segment 0 of cache ___defaultcache from node Test-NodeA-39104: received 1 cache entries
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Finished transfer for segment 0 = KEY_TRANSFER -> VALUE_TRANSFER
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746, segment 0 has all keys in, expects value transfer
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Requesting values from segments {0}, for in-memory keys
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Retrieving values, chunk counter is 1
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [JGroupsTransport] Test-NodeC-3746 sending request 11 to Test-NodeA-39104: ClusteredGetAllCommand{keys=[key], flags=[SKIP_OWNERSHIP_CHECK], topologyId=9}
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Invalidating versions on Test-NodeC-3746, chunk counter incremented to 2
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Versions invalidated on Test-NodeC-3746, chunk counter decremented to 1
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Removing inbound transfers from node {0} for segments Test-NodeA-39104
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Inbound transfer removed, chunk counter is 1
21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Latch 0
21:54:43,315 TRACE (jgroups-7,Test-NodeC-3746:[]) [JGroupsTransport] Test-NodeC-3746 received response for request 11 from Test-NodeA-39104: SuccessfulResponse([MetadataImmortalCacheValue {value=v0, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}}])
21:54:43,316 TRACE (jgroups-7,Test-NodeC-3746:[]) [BlockingInterceptor] Command blocking before completion of PutKeyValueCommand{key=key, value=v0, flags=[CACHE_MODE_LOCAL, SKIP_REMOTE_LOOKUP, PUT_FOR_STATE_TRANSFER, SKIP_SHARED_CACHE_STORE, SKIP_OWNERSHIP_CHECK, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], commandInvocationId=CommandInvocation:Test-NodeC-3746:121294, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=InternalMetadataImpl{actual=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}, created=-1, lastUsed=-1}, successful=true, topologyId=-1}
21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] After applying the received state the data container of cache ___defaultcache has 1 keys
21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] Segments not received yet for cache ___defaultcache: {}
21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Finished receiving of segments for cache ___defaultcache for topology 9.
21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746 received values for all segments in topology 9
21:54:43,316 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Stop keeping track of changed keys for state transfer in topology 9
{noformat}
The test then starts a put operation and expects it to prefetch the previous value, but because the segment is {{OWNED}}, the {{RemoteMetadata}} is ignored:
{noformat}
21:54:43,316 TRACE (ForkThread-1,Test:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=key, value=v1, flags=[], commandInvocationId=CommandInvocation:Test-NodeC-3746:121295, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true, topologyId=-1} and InvocationContext [SingleKeyNonTxInvocationContext{isLocked=false, key=null, cacheEntry=null, origin=null, lockOwner=CommandInvocation:Test-NodeC-3746:121295}]
21:54:43,316 TRACE (ForkThread-1,Test:[]) [EntryFactoryImpl] Retrieved from container MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
21:54:43,316 TRACE (ForkThread-1,Test:[]) [ScatteredDistributionInterceptor] Committing entry RepeatableReadEntry(108d175b){key=key, value=v1, isCreated=false, isChanged=true, isRemoved=false, isExpired=false, skipLookup=true, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=9, version=1}}}, replaced MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
21:54:53,316 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12
org.infinispan.test.TestException: java.util.concurrent.TimeoutException
at org.infinispan.util.ControlledRpcManager.uncheckedGet(ControlledRpcManager.java:259) ~[test-classes/:?]
at org.infinispan.util.ControlledRpcManager.expectCommand(ControlledRpcManager.java:124) ~[test-classes/:?]
at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch(PrefetchTest.java:110) ~[test-classes/:?]
at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12(PrefetchTest.java:67) ~[test-classes/:?]
{noformat}
On a related note, {{StateConsumerImpl.applyState(pushTransfer=true)}} initializes a {{CountDownLatch(stateChunks.size())}}, but doesn't actually count down if {{stateChunk.getCacheEntries() == null}}, potentially hanging state transfer until it times out.
> ScatteredStateConsumerImpl sets segment state to OWNED before applying values
> -----------------------------------------------------------------------------
>
> Key: ISPN-10366
> URL: https://issues.jboss.org/browse/ISPN-10366
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 10.0.0.Beta3, 9.4.15.Final
> Reporter: Dan Berindei
> Assignee: Radim Vansa
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4
>
> Attachments: ISPN-10363_LazyInitializingExecutorService_94x_20190627-2010_PrefetchTest-infinispan-core.log.gz
>
>
> {{ScatteredStateConsumerImpl}} uses {{InboundTransferTask}} only to request keys, then after it received all the keys of a segment it changes the segment state to {{VALUE_TRANSFER}} and starts an asynchronous request to fetch the values and replace the {{RemoteMetadata}} entries with real entries.
> {{ScatteredStateConsumerImpl.chunkCounter}} is supposed to delay the state transfer end and the segment state change to {{OWNED}}, but in rare occasions this doesn't happen.
> This happened in {{PrefetchTest.testPrefetch12}} while running the test suite with {{taskset -c 1-2}}:
> {noformat}
> 21:54:43,304 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Received new topology for cache ___defaultcache, isRebalance = true, isMember = true, topology = CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 1, Test-NodeC-3746: 0]}, pendingCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, unionCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, actualMembers=[Test-NodeA-39104, Test-NodeC-3746], persistentUUIDs=[f58e0a9a-dd4e-429a-8464-da64bf001d4e, 1471096f-c59a-4dc9-8f4d-31fbf399a2aa]}
> 21:54:43,305 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[StateRequest-___defaultcache]) [ScatteredStateConsumerImpl] Requesting keys for segments {0} from Test-NodeA-39104
> 21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Topology update processed, stateTransferTopologyId = 9, startRebalance = true, pending CH = PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}
> 21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateTransferLockImpl] Signalling transaction data received for topology 9
> 21:54:43,313 TRACE (remote-thread-Test-NodeC-p69905-t2:[]) [TrianglePerCacheInboundInvocationHandler] Calling perform() on StateResponseCommand{cache=___defaultcache, pushTransfer=false, stateChunks=[StateChunk{segmentId=0, cacheEntries=1, isLastChunk=true}], origin=Test-NodeA-39104, topologyId=9, applyState=true}
> 21:54:43,313 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Applying new state chunk for segment 0 of cache ___defaultcache from node Test-NodeA-39104: received 1 cache entries
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Finished transfer for segment 0 = KEY_TRANSFER -> VALUE_TRANSFER
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746, segment 0 has all keys in, expects value transfer
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Requesting values from segments {0}, for in-memory keys
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Retrieving values, chunk counter is 1
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [JGroupsTransport] Test-NodeC-3746 sending request 11 to Test-NodeA-39104: ClusteredGetAllCommand{keys=[key], flags=[SKIP_OWNERSHIP_CHECK], topologyId=9}
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Invalidating versions on Test-NodeC-3746, chunk counter incremented to 2
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Versions invalidated on Test-NodeC-3746, chunk counter decremented to 1
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Removing inbound transfers from node {0} for segments Test-NodeA-39104
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Inbound transfer removed, chunk counter is 1
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Latch 0
> 21:54:43,315 TRACE (jgroups-7,Test-NodeC-3746:[]) [JGroupsTransport] Test-NodeC-3746 received response for request 11 from Test-NodeA-39104: SuccessfulResponse([MetadataImmortalCacheValue {value=v0, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}}])
> 21:54:43,316 TRACE (jgroups-7,Test-NodeC-3746:[]) [BlockingInterceptor] Command blocking before completion of PutKeyValueCommand{key=key, value=v0, flags=[CACHE_MODE_LOCAL, SKIP_REMOTE_LOOKUP, PUT_FOR_STATE_TRANSFER, SKIP_SHARED_CACHE_STORE, SKIP_OWNERSHIP_CHECK, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], commandInvocationId=CommandInvocation:Test-NodeC-3746:121294, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=InternalMetadataImpl{actual=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}, created=-1, lastUsed=-1}, successful=true, topologyId=-1}
> 21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] After applying the received state the data container of cache ___defaultcache has 1 keys
> 21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] Segments not received yet for cache ___defaultcache: {}
> 21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Finished receiving of segments for cache ___defaultcache for topology 9.
> 21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746 received values for all segments in topology 9
> 21:54:43,316 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Stop keeping track of changed keys for state transfer in topology 9
> {noformat}
> The test then starts a put operation and expects it to prefetch the previous value, but because the segment is {{OWNED}}, the {{RemoteMetadata}} is ignored:
> {noformat}
> 21:54:43,316 TRACE (ForkThread-1,Test:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=key, value=v1, flags=[], commandInvocationId=CommandInvocation:Test-NodeC-3746:121295, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true, topologyId=-1} and InvocationContext [SingleKeyNonTxInvocationContext{isLocked=false, key=null, cacheEntry=null, origin=null, lockOwner=CommandInvocation:Test-NodeC-3746:121295}]
> 21:54:43,316 TRACE (ForkThread-1,Test:[]) [EntryFactoryImpl] Retrieved from container MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
> 21:54:43,316 TRACE (ForkThread-1,Test:[]) [ScatteredDistributionInterceptor] Committing entry RepeatableReadEntry(108d175b){key=key, value=v1, isCreated=false, isChanged=true, isRemoved=false, isExpired=false, skipLookup=true, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=9, version=1}}}, replaced MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
> 21:54:53,316 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12
> org.infinispan.test.TestException: java.util.concurrent.TimeoutException
> at org.infinispan.util.ControlledRpcManager.uncheckedGet(ControlledRpcManager.java:259) ~[test-classes/:?]
> at org.infinispan.util.ControlledRpcManager.expectCommand(ControlledRpcManager.java:124) ~[test-classes/:?]
> at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch(PrefetchTest.java:110) ~[test-classes/:?]
> at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12(PrefetchTest.java:67) ~[test-classes/:?]
> {noformat}
> On a related note, {{StateConsumerImpl.applyState(pushTransfer=true)}} initializes a {{CountDownLatch(stateChunks.size())}}, but doesn't actually count down if {{stateChunk.getCacheEntries() == null}}, potentially hanging state transfer until it times out.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
5 years, 6 months
[JBoss JIRA] (ISPN-10366) ScatteredStateConsumerImpl sets segment state to OWNED before applying values
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-10366?page=com.atlassian.jira.plugin... ]
Dan Berindei commented on ISPN-10366:
-------------------------------------
I think the problem is that {{StateConsumerImpl.onTopologyUpdate}} doesn't know about {{chunkCounter}} and calls {{notifyEndOfStateTransferIfNeeded}}, which calls {{stopApplyingState()}} and changes the state of all the segments to {{OWNED}}. It helped that the {{StateResponseCommand}} with the keys had already arrived, and was just waiting for {{stateTransferLock.transactionDataReceived()}} call to run and remove the inbound transfer.
> ScatteredStateConsumerImpl sets segment state to OWNED before applying values
> -----------------------------------------------------------------------------
>
> Key: ISPN-10366
> URL: https://issues.jboss.org/browse/ISPN-10366
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 10.0.0.Beta3, 9.4.15.Final
> Reporter: Dan Berindei
> Assignee: Radim Vansa
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4
>
> Attachments: ISPN-10363_LazyInitializingExecutorService_94x_20190627-2010_PrefetchTest-infinispan-core.log.gz
>
>
> {{ScatteredStateConsumerImpl}} uses {{InboundTransferTask}}s only to request keys, then after it received all the keys of a segment it changes the segment state to {{VALUE_TRANSFER}} and starts an asynchronous request to fetch the values and replace the {{RemoteMetadata}} entries with real entries.
> {{ScatteredStateConsumerImpl.chunkCounter}} is supposed to delay the state transfer end and the segment state change to {{OWNED}}, but in rare occasions this doesn't happen.
> This happened in {{PrefetchTest.testPrefetch12}} while running the test suite with {{taskset -c 1-2}}:
> {noformat}
> 21:54:43,304 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Received new topology for cache ___defaultcache, isRebalance = true, isMember = true, topology = CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 1, Test-NodeC-3746: 0]}, pendingCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, unionCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, actualMembers=[Test-NodeA-39104, Test-NodeC-3746], persistentUUIDs=[f58e0a9a-dd4e-429a-8464-da64bf001d4e, 1471096f-c59a-4dc9-8f4d-31fbf399a2aa]}
> 21:54:43,305 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[StateRequest-___defaultcache]) [ScatteredStateConsumerImpl] Requesting keys for segments {0} from Test-NodeA-39104
> 21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Topology update processed, stateTransferTopologyId = 9, startRebalance = true, pending CH = PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}
> 21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateTransferLockImpl] Signalling transaction data received for topology 9
> 21:54:43,313 TRACE (remote-thread-Test-NodeC-p69905-t2:[]) [TrianglePerCacheInboundInvocationHandler] Calling perform() on StateResponseCommand{cache=___defaultcache, pushTransfer=false, stateChunks=[StateChunk{segmentId=0, cacheEntries=1, isLastChunk=true}], origin=Test-NodeA-39104, topologyId=9, applyState=true}
> 21:54:43,313 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Applying new state chunk for segment 0 of cache ___defaultcache from node Test-NodeA-39104: received 1 cache entries
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Finished transfer for segment 0 = KEY_TRANSFER -> VALUE_TRANSFER
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746, segment 0 has all keys in, expects value transfer
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Requesting values from segments {0}, for in-memory keys
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Retrieving values, chunk counter is 1
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [JGroupsTransport] Test-NodeC-3746 sending request 11 to Test-NodeA-39104: ClusteredGetAllCommand{keys=[key], flags=[SKIP_OWNERSHIP_CHECK], topologyId=9}
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Invalidating versions on Test-NodeC-3746, chunk counter incremented to 2
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Versions invalidated on Test-NodeC-3746, chunk counter decremented to 1
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Removing inbound transfers from node {0} for segments Test-NodeA-39104
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Inbound transfer removed, chunk counter is 1
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Latch 0
> 21:54:43,315 TRACE (jgroups-7,Test-NodeC-3746:[]) [JGroupsTransport] Test-NodeC-3746 received response for request 11 from Test-NodeA-39104: SuccessfulResponse([MetadataImmortalCacheValue {value=v0, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}}])
> 21:54:43,316 TRACE (jgroups-7,Test-NodeC-3746:[]) [BlockingInterceptor] Command blocking before completion of PutKeyValueCommand{key=key, value=v0, flags=[CACHE_MODE_LOCAL, SKIP_REMOTE_LOOKUP, PUT_FOR_STATE_TRANSFER, SKIP_SHARED_CACHE_STORE, SKIP_OWNERSHIP_CHECK, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], commandInvocationId=CommandInvocation:Test-NodeC-3746:121294, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=InternalMetadataImpl{actual=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}, created=-1, lastUsed=-1}, successful=true, topologyId=-1}
> 21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] After applying the received state the data container of cache ___defaultcache has 1 keys
> 21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] Segments not received yet for cache ___defaultcache: {}
> 21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Finished receiving of segments for cache ___defaultcache for topology 9.
> 21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746 received values for all segments in topology 9
> 21:54:43,316 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Stop keeping track of changed keys for state transfer in topology 9
> {noformat}
> The test then starts a put operation and expects it to prefetch the previous value, but because the segment is {{OWNED}}, the {{RemoteMetadata}} is ignored:
> {noformat}
> 21:54:43,316 TRACE (ForkThread-1,Test:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=key, value=v1, flags=[], commandInvocationId=CommandInvocation:Test-NodeC-3746:121295, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true, topologyId=-1} and InvocationContext [SingleKeyNonTxInvocationContext{isLocked=false, key=null, cacheEntry=null, origin=null, lockOwner=CommandInvocation:Test-NodeC-3746:121295}]
> 21:54:43,316 TRACE (ForkThread-1,Test:[]) [EntryFactoryImpl] Retrieved from container MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
> 21:54:43,316 TRACE (ForkThread-1,Test:[]) [ScatteredDistributionInterceptor] Committing entry RepeatableReadEntry(108d175b){key=key, value=v1, isCreated=false, isChanged=true, isRemoved=false, isExpired=false, skipLookup=true, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=9, version=1}}}, replaced MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
> 21:54:53,316 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12
> org.infinispan.test.TestException: java.util.concurrent.TimeoutException
> at org.infinispan.util.ControlledRpcManager.uncheckedGet(ControlledRpcManager.java:259) ~[test-classes/:?]
> at org.infinispan.util.ControlledRpcManager.expectCommand(ControlledRpcManager.java:124) ~[test-classes/:?]
> at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch(PrefetchTest.java:110) ~[test-classes/:?]
> at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12(PrefetchTest.java:67) ~[test-classes/:?]
> {noformat}
> On a related note, {{StateConsumerImpl.applyState(pushTransfer=true)}} initializes a {{CountDownLatch(stateChunks.size())}}, but doesn't actually count down if {{stateChunk.getCacheEntries() == null}}, potentially hanging state transfer until it times out.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
5 years, 6 months
[JBoss JIRA] (ISPN-10366) ScatteredStateConsumerImpl sets segment state to OWNED before applying values
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-10366?page=com.atlassian.jira.plugin... ]
Dan Berindei updated ISPN-10366:
--------------------------------
Attachment: ISPN-10363_LazyInitializingExecutorService_94x_20190627-2010_PrefetchTest-infinispan-core.log.gz
> ScatteredStateConsumerImpl sets segment state to OWNED before applying values
> -----------------------------------------------------------------------------
>
> Key: ISPN-10366
> URL: https://issues.jboss.org/browse/ISPN-10366
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 10.0.0.Beta3, 9.4.15.Final
> Reporter: Dan Berindei
> Assignee: Radim Vansa
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4
>
> Attachments: ISPN-10363_LazyInitializingExecutorService_94x_20190627-2010_PrefetchTest-infinispan-core.log.gz
>
>
> {{ScatteredStateConsumerImpl}} uses {{InboundTransferTask}}s only to request keys, then after it received all the keys of a segment it changes the segment state to {{VALUE_TRANSFER}} and starts an asynchronous request to fetch the values and replace the {{RemoteMetadata}} entries with real entries.
> {{ScatteredStateConsumerImpl.chunkCounter}} is supposed to delay the state transfer end and the segment state change to {{OWNED}}, but in rare occasions this doesn't happen.
> This happened in {{PrefetchTest.testPrefetch12}} while running the test suite with {{taskset -c 1-2}}:
> {noformat}
> 21:54:43,304 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Received new topology for cache ___defaultcache, isRebalance = true, isMember = true, topology = CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 1, Test-NodeC-3746: 0]}, pendingCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, unionCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, actualMembers=[Test-NodeA-39104, Test-NodeC-3746], persistentUUIDs=[f58e0a9a-dd4e-429a-8464-da64bf001d4e, 1471096f-c59a-4dc9-8f4d-31fbf399a2aa]}
> 21:54:43,305 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[StateRequest-___defaultcache]) [ScatteredStateConsumerImpl] Requesting keys for segments {0} from Test-NodeA-39104
> 21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Topology update processed, stateTransferTopologyId = 9, startRebalance = true, pending CH = PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}
> 21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateTransferLockImpl] Signalling transaction data received for topology 9
> 21:54:43,313 TRACE (remote-thread-Test-NodeC-p69905-t2:[]) [TrianglePerCacheInboundInvocationHandler] Calling perform() on StateResponseCommand{cache=___defaultcache, pushTransfer=false, stateChunks=[StateChunk{segmentId=0, cacheEntries=1, isLastChunk=true}], origin=Test-NodeA-39104, topologyId=9, applyState=true}
> 21:54:43,313 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Applying new state chunk for segment 0 of cache ___defaultcache from node Test-NodeA-39104: received 1 cache entries
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Finished transfer for segment 0 = KEY_TRANSFER -> VALUE_TRANSFER
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746, segment 0 has all keys in, expects value transfer
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Requesting values from segments {0}, for in-memory keys
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Retrieving values, chunk counter is 1
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [JGroupsTransport] Test-NodeC-3746 sending request 11 to Test-NodeA-39104: ClusteredGetAllCommand{keys=[key], flags=[SKIP_OWNERSHIP_CHECK], topologyId=9}
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Invalidating versions on Test-NodeC-3746, chunk counter incremented to 2
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Versions invalidated on Test-NodeC-3746, chunk counter decremented to 1
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Removing inbound transfers from node {0} for segments Test-NodeA-39104
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Inbound transfer removed, chunk counter is 1
> 21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Latch 0
> 21:54:43,315 TRACE (jgroups-7,Test-NodeC-3746:[]) [JGroupsTransport] Test-NodeC-3746 received response for request 11 from Test-NodeA-39104: SuccessfulResponse([MetadataImmortalCacheValue {value=v0, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}}])
> 21:54:43,316 TRACE (jgroups-7,Test-NodeC-3746:[]) [BlockingInterceptor] Command blocking before completion of PutKeyValueCommand{key=key, value=v0, flags=[CACHE_MODE_LOCAL, SKIP_REMOTE_LOOKUP, PUT_FOR_STATE_TRANSFER, SKIP_SHARED_CACHE_STORE, SKIP_OWNERSHIP_CHECK, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], commandInvocationId=CommandInvocation:Test-NodeC-3746:121294, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=InternalMetadataImpl{actual=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}, created=-1, lastUsed=-1}, successful=true, topologyId=-1}
> 21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] After applying the received state the data container of cache ___defaultcache has 1 keys
> 21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] Segments not received yet for cache ___defaultcache: {}
> 21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Finished receiving of segments for cache ___defaultcache for topology 9.
> 21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746 received values for all segments in topology 9
> 21:54:43,316 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Stop keeping track of changed keys for state transfer in topology 9
> {noformat}
> The test then starts a put operation and expects it to prefetch the previous value, but because the segment is {{OWNED}}, the {{RemoteMetadata}} is ignored:
> {noformat}
> 21:54:43,316 TRACE (ForkThread-1,Test:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=key, value=v1, flags=[], commandInvocationId=CommandInvocation:Test-NodeC-3746:121295, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true, topologyId=-1} and InvocationContext [SingleKeyNonTxInvocationContext{isLocked=false, key=null, cacheEntry=null, origin=null, lockOwner=CommandInvocation:Test-NodeC-3746:121295}]
> 21:54:43,316 TRACE (ForkThread-1,Test:[]) [EntryFactoryImpl] Retrieved from container MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
> 21:54:43,316 TRACE (ForkThread-1,Test:[]) [ScatteredDistributionInterceptor] Committing entry RepeatableReadEntry(108d175b){key=key, value=v1, isCreated=false, isChanged=true, isRemoved=false, isExpired=false, skipLookup=true, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=9, version=1}}}, replaced MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
> 21:54:53,316 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12
> org.infinispan.test.TestException: java.util.concurrent.TimeoutException
> at org.infinispan.util.ControlledRpcManager.uncheckedGet(ControlledRpcManager.java:259) ~[test-classes/:?]
> at org.infinispan.util.ControlledRpcManager.expectCommand(ControlledRpcManager.java:124) ~[test-classes/:?]
> at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch(PrefetchTest.java:110) ~[test-classes/:?]
> at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12(PrefetchTest.java:67) ~[test-classes/:?]
> {noformat}
> On a related note, {{StateConsumerImpl.applyState(pushTransfer=true)}} initializes a {{CountDownLatch(stateChunks.size())}}, but doesn't actually count down if {{stateChunk.getCacheEntries() == null}}, potentially hanging state transfer until it times out.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
5 years, 6 months