[infinispan-issues] [JBoss JIRA] (ISPN-9517) State transfer times out if initiated with yet to be verified suspected member and reincarnated member

Tristan Tarrant (JIRA) issues at jboss.org
Thu Sep 27 09:14:00 EDT 2018


     [ https://issues.jboss.org/browse/ISPN-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tristan Tarrant updated ISPN-9517:
----------------------------------
    Fix Version/s: 9.4.0.Final


> State transfer times out if initiated with yet to be verified suspected member and reincarnated member
> ------------------------------------------------------------------------------------------------------
>
>                 Key: ISPN-9517
>                 URL: https://issues.jboss.org/browse/ISPN-9517
>             Project: Infinispan
>          Issue Type: Bug
>          Components: State Transfer
>    Affects Versions: 9.3.3.Final
>            Reporter: Paul Ferraro
>            Assignee: Dan Berindei
>             Fix For: 9.4.0.Final
>
>         Attachments: Test.java, node-1.zip, node-2.zip
>
>
> Here's the scenario:
> 1. Cluster contains caches on 2 members, node-1 and node-2
> 2. node-2 is killed
> 3. node-2 is restarted (using same physical address)
> 4. State transfer initiates, view contains node-1, suspected node-2, and reincarnated node-2
> 5. State transfer times out
> Log of node-1 includes:
> {noformat}
> 12:09:51,882 WARN  [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p14-t4) ISPN000197: Error updating cluster member list: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3 from node-2
> 	at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> 	at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> 	at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_181]
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_181]
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_181]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_181]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_181]
> 	at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_181]
> 	Suppressed: org.infinispan.util.logging.TraceException
> 		at org.infinispan.remoting.transport.Transport.invokeRemotely(Transport.java:75)
> 		at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:525)
> 		at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:508)
> 		at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:321)
> 		at org.infinispan.topology.ClusterTopologyManagerImpl.access$500(ClusterTopologyManagerImpl.java:87)
> 		at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$0(ClusterTopologyManagerImpl.java:731)
> 		at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> 		at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> 		at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> 		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_181]
> 		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_181]
> 		at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
> 		... 1 more
> {noformat}
> I've attached trace logs from node-1 and node-2.
> Changing ClusterTopologyManagerImpl.confirmMembersAvailable() to use ResponseMode.SYNCHRONOUS_IGNORE_LEAVERS instead of ResponseMode.SYNCHRONOUS allows state transfer to complete successfully.



--
This message was sent by Atlassian JIRA
(v7.5.0#75005)


More information about the infinispan-issues mailing list