[infinispan-issues] [JBoss JIRA] (ISPN-11000) LocalTopologyManager should not wait for view if the local node is not a member

Dan Berindei (Jira) issues at jboss.org
Thu Nov 28 05:46:00 EST 2019


Dan Berindei created ISPN-11000:
-----------------------------------

             Summary: LocalTopologyManager should not wait for view if the local node is not a member
                 Key: ISPN-11000
                 URL: https://issues.jboss.org/browse/ISPN-11000
             Project: Infinispan
          Issue Type: Bug
          Components: Core
    Affects Versions: 10.1.0.Beta1, 9.4.16.Final
            Reporter: Dan Berindei
            Assignee: Dan Berindei
             Fix For: 10.1.0.CR1


Sometimes a node is excluded from the cluster view but it can still receive multicast messages like FD_ALL heartbeats and topology updates from the coordinator.

Because it is still receiving heartbeats, the excluded node does not become coordinator itself and install a new view. If MERGE3 doesn't merge the partitions, the node could keep the outdated view for a long time, and {{LocalTopologyManagerImpl}} will block many transport threads that wait for the right view before processing the topology updates that keep coming from the coordinator:

{noformat}
11:31:01,052 INFO  [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1) CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640, edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975], clusterSize=8}
11:31:05,281 WARN  [org.jgroups.protocols.pbcast.GMS] (jgroups-79,edg-perf03-47882) edg-perf03-47882: not member of view [edg-perf01-21541|6]; discarding it
11:31:11,041 INFO  [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1) CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640, edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975], clusterSize=8}
11:31:16,267 WARN  [org.jgroups.protocols.pbcast.GMS] (jgroups-80,edg-perf03-47882) edg-perf03-47882: failed to create view from delta-view; dropping view: java.lang.IllegalStateException: the view-id of the delta view ([edg-perf01-21541|6]) doesn't match the current view-id ([edg-perf01-21541|5]); discarding delta view [edg-perf01-21541|7], ref-view=[edg-perf01-21541|6], left=[edg-perf06-47720]
11:31:16,274 WARN  [org.jgroups.protocols.pbcast.GMS] (jgroups-80,edg-perf03-47882) edg-perf03-47882: not member of view [edg-perf01-21541|7]; discarding it
11:31:21,035 INFO  [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1) CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640, edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975], clusterSize=8}
11:31:31,040 INFO  [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1) CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640, edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975], clusterSize=8}
11:31:41,047 INFO  [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1) CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640, edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975], clusterSize=8}
11:31:51,033 INFO  [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1) CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640, edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975], clusterSize=8}
11:32:01,035 INFO  [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1) CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640, edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975], clusterSize=8}
11:32:03,051 WARN  [org.jgroups.protocols.pbcast.GMS] (jgroups-80,edg-perf03-47882) edg-perf03-47882: failed to create view from delta-view; dropping view: java.lang.IllegalStateException: the view-id of the delta view ([edg-perf01-21541|7]) doesn't match the current view-id ([edg-perf01-21541|5]); discarding delta view [edg-perf01-21541|8], ref-view=[edg-perf01-21541|7], left=[edg-perf04-19840]
11:32:03,063 WARN  [org.jgroups.protocols.pbcast.GMS] (jgroups-80,edg-perf03-47882) edg-perf03-47882: not member of view [edg-perf01-21541|8]; discarding it
11:32:05,321 ERROR [org.infinispan.topology.LocalTopologyManagerImpl] (transport-thread--p5-t5) ISPN000452: Failed to update topology for cache memcachedCache: org.infinispan.util.concurrent.TimeoutException: ISPN000451: Timed out waiting for view 6, current view is 5
	at org.infinispan.topology.LocalTopologyManagerImpl.waitForView(LocalTopologyManagerImpl.java:571)
	at org.infinispan.topology.LocalTopologyManagerImpl.doHandleTopologyUpdate(LocalTopologyManagerImpl.java:302)
	at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleTopologyUpdate$1(LocalTopologyManagerImpl.java:286)
	at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
	at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
	at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
{noformat}

{{LocalTopologyManagerImpl.doHandleTopologyUpdate()}} could first check whether the local node is a member of the new topology. If it is not, the method could skip waiting for the view, which avoids both blocking a transport thread and logging an error message.
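
The proposed ordering of checks can be sketched in isolation. This is only an illustration of the idea, not the actual Infinispan code: the class {{TopologyUpdateSketch}}, the {{Action}} enum, and the {{decide}} method are hypothetical names, node addresses are modelled as plain strings, and the member list in {{main}} is made up for the example (the node names are borrowed from the log above).

```java
import java.util.List;

// Hypothetical sketch: none of these names exist in Infinispan.
public class TopologyUpdateSketch {

    // Possible outcomes when a topology update arrives.
    enum Action { IGNORE_NOT_MEMBER, WAIT_FOR_VIEW, APPLY }

    // Decide what to do with a topology update that requires a given view id.
    // The membership check runs first, so a node that was excluded from the
    // cluster never blocks waiting for a view it will not install.
    static Action decide(String localNode, List<String> topologyMembers,
                         int requiredViewId, int currentViewId) {
        if (!topologyMembers.contains(localNode)) {
            return Action.IGNORE_NOT_MEMBER;
        }
        if (currentViewId < requiredViewId) {
            return Action.WAIT_FOR_VIEW;
        }
        return Action.APPLY;
    }

    public static void main(String[] args) {
        // Illustrative member list for an update that requires view 6.
        List<String> members = List.of("edg-perf01-21541", "edg-perf02-54831");

        // Excluded node still on view 5: skip the update instead of waiting.
        System.out.println(decide("edg-perf03-47882", members, 6, 5));
        // Member that has not installed view 6 yet: waiting is still correct.
        System.out.println(decide("edg-perf02-54831", members, 6, 5));
        // Member already on view 6: apply the update immediately.
        System.out.println(decide("edg-perf02-54831", members, 6, 6));
    }
}
```

With this ordering, the wait (and the ISPN000451 timeout it can produce) is only reachable for nodes that are members of the new topology.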



--
This message was sent by Atlassian Jira
(v7.13.8#713008)