Dan Berindei created ISPN-11000:
-----------------------------------
Summary: LocalTopologyManager should not wait for view if the local node is not a member
Key: ISPN-11000
URL: https://issues.jboss.org/browse/ISPN-11000
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 10.1.0.Beta1, 9.4.16.Final
Reporter: Dan Berindei
Assignee: Dan Berindei
Fix For: 10.1.0.CR1
Sometimes a node is excluded from the cluster view but can still receive multicast
messages such as FD_ALL heartbeats and topology updates from the coordinator.
Because it still receives heartbeats, the excluded node does not become coordinator itself
and does not install a new view. If MERGE3 doesn't merge the partitions, the node can keep
the outdated view for a long time, and {{LocalTopologyManagerImpl}} will block many transport
threads waiting for the right view to process the topology updates that keep coming from
the coordinator:
{noformat}
11:31:01,052 INFO [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1)
CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640,
edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975],
clusterSize=8}
11:31:05,281 WARN [org.jgroups.protocols.pbcast.GMS] (jgroups-79,edg-perf03-47882)
edg-perf03-47882: not member of view [edg-perf01-21541|6]; discarding it
11:31:11,041 INFO [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1)
CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640,
edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975],
clusterSize=8}
11:31:16,267 WARN [org.jgroups.protocols.pbcast.GMS] (jgroups-80,edg-perf03-47882)
edg-perf03-47882: failed to create view from delta-view; dropping view:
java.lang.IllegalStateException: the view-id of the delta view ([edg-perf01-21541|6])
doesn't match the current view-id ([edg-perf01-21541|5]); discarding delta view
[edg-perf01-21541|7], ref-view=[edg-perf01-21541|6], left=[edg-perf06-47720]
11:31:16,274 WARN [org.jgroups.protocols.pbcast.GMS] (jgroups-80,edg-perf03-47882)
edg-perf03-47882: not member of view [edg-perf01-21541|7]; discarding it
11:31:21,035 INFO [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1)
CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640,
edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975],
clusterSize=8}
11:31:31,040 INFO [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1)
CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640,
edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975],
clusterSize=8}
11:31:41,047 INFO [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1)
CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640,
edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975],
clusterSize=8}
11:31:51,033 INFO [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1)
CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640,
edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975],
clusterSize=8}
11:32:01,035 INFO [org.radargun.service.InfinispanRestAPI] (pool-2-thread-1)
CacheManagerInfo{clusterMembers=[edg-perf01-21541, edg-perf02-54831, edg-perf05-28640,
edg-perf03-47882, edg-perf06-47720, edg-perf04-19840, edg-perf07-34498, edg-perf08-52975],
clusterSize=8}
11:32:03,051 WARN [org.jgroups.protocols.pbcast.GMS] (jgroups-80,edg-perf03-47882)
edg-perf03-47882: failed to create view from delta-view; dropping view:
java.lang.IllegalStateException: the view-id of the delta view ([edg-perf01-21541|7])
doesn't match the current view-id ([edg-perf01-21541|5]); discarding delta view
[edg-perf01-21541|8], ref-view=[edg-perf01-21541|7], left=[edg-perf04-19840]
11:32:03,063 WARN [org.jgroups.protocols.pbcast.GMS] (jgroups-80,edg-perf03-47882)
edg-perf03-47882: not member of view [edg-perf01-21541|8]; discarding it
11:32:05,321 ERROR [org.infinispan.topology.LocalTopologyManagerImpl]
(transport-thread--p5-t5) ISPN000452: Failed to update topology for cache memcachedCache:
org.infinispan.util.concurrent.TimeoutException: ISPN000451: Timed out waiting for view 6,
current view is 5
at org.infinispan.topology.LocalTopologyManagerImpl.waitForView(LocalTopologyManagerImpl.java:571)
at org.infinispan.topology.LocalTopologyManagerImpl.doHandleTopologyUpdate(LocalTopologyManagerImpl.java:302)
at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleTopologyUpdate$1(LocalTopologyManagerImpl.java:286)
at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
{noformat}
{{LocalTopologyManagerImpl.doHandleTopologyUpdate()}} could first check whether the local
node is a member of the new topology. If it is not, the method can skip the wait for the
view, which avoids blocking a transport thread and avoids logging an error message.
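
To illustrate the proposed ordering, here is a minimal, self-contained sketch. It is not the actual Infinispan code: the class {{TopologyUpdateSketch}}, the {{Outcome}} enum, the method signature, and the node names are all hypothetical, and the view wait is modelled with a plain {{CompletableFuture}}. It only shows the idea of checking membership before waiting for the view.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; not the real LocalTopologyManagerImpl.
class TopologyUpdateSketch {

    // Illustrative outcomes of handling one topology update.
    enum Outcome { APPLIED, IGNORED_NOT_MEMBER, TIMED_OUT }

    static Outcome handleTopologyUpdate(String localNode,
                                        List<String> topologyMembers,
                                        CompletableFuture<Void> viewInstalled,
                                        long timeoutMillis) {
        // Membership check first: a non-member has nothing to apply, so it
        // returns immediately without blocking a thread or logging an error.
        if (!topologyMembers.contains(localNode)) {
            return Outcome.IGNORED_NOT_MEMBER;
        }
        try {
            // Only members wait for the matching view to be installed.
            viewInstalled.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Outcome.APPLIED;
        } catch (TimeoutException e) {
            return Outcome.TIMED_OUT;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for view", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("View installation failed", e);
        }
    }

    public static void main(String[] args) {
        // The excluded node never sees the new view, so this future never completes.
        CompletableFuture<Void> neverInstalled = new CompletableFuture<>();
        Outcome outcome = handleTopologyUpdate(
                "node-c", List.of("node-a", "node-b"), neverInstalled, 1000);
        System.out.println(outcome);
    }
}
```

In this sketch the excluded node returns right away even though the view future never completes, whereas a node that is a member still waits up to the timeout.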