[JBoss JIRA] (WFLY-10736) Server in cluster hangs during start after previous kill
by Miroslav Novak (JIRA)
[ https://issues.jboss.org/browse/WFLY-10736?page=com.atlassian.jira.plugin... ]
Miroslav Novak commented on WFLY-10736:
---------------------------------------
[~pferraro] I've attached zip with server logs from node-1 and node-2. Unzip logs.zip file in node-1-log and node-2-log. There are 2 files - server.log (INFO level) and server-trace.log (with TRACE for JGroups and Infinispan) in it.
> Server in cluster hangs during start after previous kill
> --------------------------------------------------------
>
> Key: WFLY-10736
> URL: https://issues.jboss.org/browse/WFLY-10736
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Reporter: Miroslav Novak
> Assignee: Paul Ferraro
> Priority: Blocker
> Labels: blocker-WF14
> Attachments: clusterKilTest.zip, node-1-thread-dump-before-kill-shutdown-sequence.txt, standalone-full-ha-1.xml, standalone-full-ha-2.xml
>
>
> There is regression in JGroups or Infinispan in one of our tests for fault tolerance of JMS bridges. However work on JMS bridge appears to be unrelated. Issue was hit in WF weekly run.
> Test Scenario:
> * There are two servers. InQueue is deployed on Node 1,
> * OutQueue is deployed on Node 2. Both servers are started.
> * Large byte messages are sent to InQueue deployed on Node 1. Bridge between servers/queues transfers messages from node 1 to node 2.
> * Node 1 is killed and started again.
> * All messages are received from OutQueue deployed on Node 2.
> Result:
> Node 1 does not start after kill and hangs. There is following exception logged in node 2:
> {code}
> :26:17,894 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100000: Node node-1 joined the cluster
> 09:26:18,520 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,521 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,521 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,523 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,868 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) ISPN000310: Starting cluster-wide rebalance for cache default, topology CacheTopology{id=17, phase=READ_OLD_WRITE_ALL, rebalanceId=6, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[node-2: 122, node-1: 134]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (3)[node-2: 84, node-1: 90, node-1: 82]}, unionCH=null, actualMembers=[node-2, node-1, node-1], persistentUUIDs=[12443bfb-e88a-46f3-919e-9213bf38ce19, 2873237f-d881-463f-8a5a-940bf1d764e5, a05ea8af-a83b-42a9-b937-dc2da1cae6d1]}
> 09:26:18,869 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) [Context=default][Scope=node-2]ISPN100002: Started rebalance with topology id 17
> 09:26:18,870 INFO [org.infinispan.CLUSTER] (transport-thread--p14-t5) [Context=default][Scope=node-2]ISPN100003: Node node-2 finished rebalance phase with topology id 17
> 09:26:18,981 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) [Context=default][Scope=node-1]ISPN100003: Node node-1 finished rebalance phase with topology id 17
> 09:27:18,530 WARN [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p15-t4) ISPN000197: Error updating cluster member list: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]
> Suppressed: java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) [rt.jar:1.8.0_131]
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) [rt.jar:1.8.0_131]
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:82)
> at org.infinispan.remoting.transport.Transport.invokeRemotely(Transport.java:71)
> at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:540)
> at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:523)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:334)
> at org.infinispan.topology.ClusterTopologyManagerImpl.access$500(ClusterTopologyManagerImpl.java:85)
> at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$0(ClusterTopologyManagerImpl.java:745)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:144)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:33)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:174)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
> ... 1 more
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> ... 1 more
> [CIRCULAR REFERENCE:java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1]
> 09:27:18,530 WARN [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p16-t4) ISPN000197: Error updating cluster member list: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]
> Suppressed: java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) [rt.jar:1.8.0_131]
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) [rt.jar:1.8.0_131]
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:82)
> at org.infinispan.remoting.transport.Transport.invokeRemotely(Transport.java:71)
> at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:540)
> at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:523)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:334)
> at org.infinispan.topology.ClusterTopologyManagerImpl.access$500(ClusterTopologyManagerImpl.java:85)
> at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$0(ClusterTopologyManagerImpl.java:745)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:144)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:33)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:174)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
> ... 1 more
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> ... 1 more
> [CIRCULAR REFERENCE:java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1]
> {code}
> There is default JGroups udp stack configured which is used by Infinispan. Both of the servers (jgroups udp) are bound to 127.0.0.1. Node 2 has port offset 1000.
> Attaching thread dump from node 1 when it hangs during start.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 9 months
[JBoss JIRA] (WFLY-10736) Server in cluster hangs during start after previous kill
by Miroslav Novak (JIRA)
[ https://issues.jboss.org/browse/WFLY-10736?page=com.atlassian.jira.plugin... ]
Miroslav Novak updated WFLY-10736:
----------------------------------
Attachment: clusterKilTest.zip
> Server in cluster hangs during start after previous kill
> --------------------------------------------------------
>
> Key: WFLY-10736
> URL: https://issues.jboss.org/browse/WFLY-10736
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Reporter: Miroslav Novak
> Assignee: Paul Ferraro
> Priority: Blocker
> Labels: blocker-WF14
> Attachments: clusterKilTest.zip, node-1-thread-dump-before-kill-shutdown-sequence.txt, standalone-full-ha-1.xml, standalone-full-ha-2.xml
>
>
> There is regression in JGroups or Infinispan in one of our tests for fault tolerance of JMS bridges. However work on JMS bridge appears to be unrelated. Issue was hit in WF weekly run.
> Test Scenario:
> * There are two servers. InQueue is deployed on Node 1,
> * OutQueue is deployed on Node 2. Both servers are started.
> * Large byte messages are sent to InQueue deployed on Node 1. Bridge between servers/queues transfers messages from node 1 to node 2.
> * Node 1 is killed and started again.
> * All messages are received from OutQueue deployed on Node 2.
> Result:
> Node 1 does not start after kill and hangs. There is following exception logged in node 2:
> {code}
> :26:17,894 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100000: Node node-1 joined the cluster
> 09:26:18,520 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,521 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,521 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,523 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,868 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) ISPN000310: Starting cluster-wide rebalance for cache default, topology CacheTopology{id=17, phase=READ_OLD_WRITE_ALL, rebalanceId=6, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[node-2: 122, node-1: 134]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (3)[node-2: 84, node-1: 90, node-1: 82]}, unionCH=null, actualMembers=[node-2, node-1, node-1], persistentUUIDs=[12443bfb-e88a-46f3-919e-9213bf38ce19, 2873237f-d881-463f-8a5a-940bf1d764e5, a05ea8af-a83b-42a9-b937-dc2da1cae6d1]}
> 09:26:18,869 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) [Context=default][Scope=node-2]ISPN100002: Started rebalance with topology id 17
> 09:26:18,870 INFO [org.infinispan.CLUSTER] (transport-thread--p14-t5) [Context=default][Scope=node-2]ISPN100003: Node node-2 finished rebalance phase with topology id 17
> 09:26:18,981 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) [Context=default][Scope=node-1]ISPN100003: Node node-1 finished rebalance phase with topology id 17
> 09:27:18,530 WARN [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p15-t4) ISPN000197: Error updating cluster member list: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]
> Suppressed: java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) [rt.jar:1.8.0_131]
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) [rt.jar:1.8.0_131]
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:82)
> at org.infinispan.remoting.transport.Transport.invokeRemotely(Transport.java:71)
> at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:540)
> at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:523)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:334)
> at org.infinispan.topology.ClusterTopologyManagerImpl.access$500(ClusterTopologyManagerImpl.java:85)
> at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$0(ClusterTopologyManagerImpl.java:745)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:144)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:33)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:174)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
> ... 1 more
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> ... 1 more
> [CIRCULAR REFERENCE:java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1]
> 09:27:18,530 WARN [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p16-t4) ISPN000197: Error updating cluster member list: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]
> Suppressed: java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) [rt.jar:1.8.0_131]
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) [rt.jar:1.8.0_131]
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:82)
> at org.infinispan.remoting.transport.Transport.invokeRemotely(Transport.java:71)
> at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:540)
> at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:523)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:334)
> at org.infinispan.topology.ClusterTopologyManagerImpl.access$500(ClusterTopologyManagerImpl.java:85)
> at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$0(ClusterTopologyManagerImpl.java:745)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:144)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:33)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:174)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
> ... 1 more
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> ... 1 more
> [CIRCULAR REFERENCE:java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1]
> {code}
> There is default JGroups udp stack configured which is used by Infinispan. Both of the servers (jgroups udp) are bound to 127.0.0.1. Node 2 has port offset 1000.
> Attaching thread dump from node 1 when it hangs during start.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 9 months
[JBoss JIRA] (WFLY-10736) Server in cluster hangs during start after previous kill
by Miroslav Novak (JIRA)
[ https://issues.jboss.org/browse/WFLY-10736?page=com.atlassian.jira.plugin... ]
Miroslav Novak updated WFLY-10736:
----------------------------------
Attachment: (was: logs.zip)
> Server in cluster hangs during start after previous kill
> --------------------------------------------------------
>
> Key: WFLY-10736
> URL: https://issues.jboss.org/browse/WFLY-10736
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Reporter: Miroslav Novak
> Assignee: Paul Ferraro
> Priority: Blocker
> Labels: blocker-WF14
> Attachments: node-1-thread-dump-before-kill-shutdown-sequence.txt, standalone-full-ha-1.xml, standalone-full-ha-2.xml
>
>
> There is regression in JGroups or Infinispan in one of our tests for fault tolerance of JMS bridges. However work on JMS bridge appears to be unrelated. Issue was hit in WF weekly run.
> Test Scenario:
> * There are two servers. InQueue is deployed on Node 1,
> * OutQueue is deployed on Node 2. Both servers are started.
> * Large byte messages are sent to InQueue deployed on Node 1. Bridge between servers/queues transfers messages from node 1 to node 2.
> * Node 1 is killed and started again.
> * All messages are received from OutQueue deployed on Node 2.
> Result:
> Node 1 does not start after kill and hangs. There is following exception logged in node 2:
> {code}
> :26:17,894 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100000: Node node-1 joined the cluster
> 09:26:18,520 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,521 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,521 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,523 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,868 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) ISPN000310: Starting cluster-wide rebalance for cache default, topology CacheTopology{id=17, phase=READ_OLD_WRITE_ALL, rebalanceId=6, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[node-2: 122, node-1: 134]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (3)[node-2: 84, node-1: 90, node-1: 82]}, unionCH=null, actualMembers=[node-2, node-1, node-1], persistentUUIDs=[12443bfb-e88a-46f3-919e-9213bf38ce19, 2873237f-d881-463f-8a5a-940bf1d764e5, a05ea8af-a83b-42a9-b937-dc2da1cae6d1]}
> 09:26:18,869 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) [Context=default][Scope=node-2]ISPN100002: Started rebalance with topology id 17
> 09:26:18,870 INFO [org.infinispan.CLUSTER] (transport-thread--p14-t5) [Context=default][Scope=node-2]ISPN100003: Node node-2 finished rebalance phase with topology id 17
> 09:26:18,981 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) [Context=default][Scope=node-1]ISPN100003: Node node-1 finished rebalance phase with topology id 17
> 09:27:18,530 WARN [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p15-t4) ISPN000197: Error updating cluster member list: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]
> Suppressed: java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) [rt.jar:1.8.0_131]
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) [rt.jar:1.8.0_131]
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:82)
> at org.infinispan.remoting.transport.Transport.invokeRemotely(Transport.java:71)
> at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:540)
> at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:523)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:334)
> at org.infinispan.topology.ClusterTopologyManagerImpl.access$500(ClusterTopologyManagerImpl.java:85)
> at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$0(ClusterTopologyManagerImpl.java:745)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:144)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:33)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:174)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
> ... 1 more
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> ... 1 more
> [CIRCULAR REFERENCE:java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1]
> 09:27:18,530 WARN [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p16-t4) ISPN000197: Error updating cluster member list: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]
> Suppressed: java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) [rt.jar:1.8.0_131]
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) [rt.jar:1.8.0_131]
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:82)
> at org.infinispan.remoting.transport.Transport.invokeRemotely(Transport.java:71)
> at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:540)
> at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:523)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:334)
> at org.infinispan.topology.ClusterTopologyManagerImpl.access$500(ClusterTopologyManagerImpl.java:85)
> at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$0(ClusterTopologyManagerImpl.java:745)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:144)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:33)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:174)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
> ... 1 more
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> ... 1 more
> [CIRCULAR REFERENCE:java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1]
> {code}
> There is default JGroups udp stack configured which is used by Infinispan. Both of the servers (jgroups udp) are bound to 127.0.0.1. Node 2 has port offset 1000.
> Attaching thread dump from node 1 when it hangs during start.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 9 months
[JBoss JIRA] (WFLY-10736) Server in cluster hangs during start after previous kill
by Miroslav Novak (JIRA)
[ https://issues.jboss.org/browse/WFLY-10736?page=com.atlassian.jira.plugin... ]
Miroslav Novak updated WFLY-10736:
----------------------------------
Attachment: logs.zip
> Server in cluster hangs during start after previous kill
> --------------------------------------------------------
>
> Key: WFLY-10736
> URL: https://issues.jboss.org/browse/WFLY-10736
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Reporter: Miroslav Novak
> Assignee: Paul Ferraro
> Priority: Blocker
> Labels: blocker-WF14
> Attachments: node-1-thread-dump-before-kill-shutdown-sequence.txt, standalone-full-ha-1.xml, standalone-full-ha-2.xml
>
>
> There is regression in JGroups or Infinispan in one of our tests for fault tolerance of JMS bridges. However work on JMS bridge appears to be unrelated. Issue was hit in WF weekly run.
> Test Scenario:
> * There are two servers. InQueue is deployed on Node 1,
> * OutQueue is deployed on Node 2. Both servers are started.
> * Large byte messages are sent to InQueue deployed on Node 1. Bridge between servers/queues transfers messages from node 1 to node 2.
> * Node 1 is killed and started again.
> * All messages are received from OutQueue deployed on Node 2.
> Result:
> Node 1 does not start after kill and hangs. There is following exception logged in node 2:
> {code}
> :26:17,894 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100000: Node node-1 joined the cluster
> 09:26:18,520 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,521 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,521 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,522 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN000094: Received new cluster view for channel ejb: [node-2|7] (2) [node-2, node-1]
> 09:26:18,523 INFO [org.infinispan.CLUSTER] (thread-12,ejb,node-2) ISPN100001: Node node-1 left the cluster
> 09:26:18,868 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) ISPN000310: Starting cluster-wide rebalance for cache default, topology CacheTopology{id=17, phase=READ_OLD_WRITE_ALL, rebalanceId=6, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[node-2: 122, node-1: 134]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (3)[node-2: 84, node-1: 90, node-1: 82]}, unionCH=null, actualMembers=[node-2, node-1, node-1], persistentUUIDs=[12443bfb-e88a-46f3-919e-9213bf38ce19, 2873237f-d881-463f-8a5a-940bf1d764e5, a05ea8af-a83b-42a9-b937-dc2da1cae6d1]}
> 09:26:18,869 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) [Context=default][Scope=node-2]ISPN100002: Started rebalance with topology id 17
> 09:26:18,870 INFO [org.infinispan.CLUSTER] (transport-thread--p14-t5) [Context=default][Scope=node-2]ISPN100003: Node node-2 finished rebalance phase with topology id 17
> 09:26:18,981 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t2) [Context=default][Scope=node-1]ISPN100003: Node node-1 finished rebalance phase with topology id 17
> 09:27:18,530 WARN [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p15-t4) ISPN000197: Error updating cluster member list: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]
> Suppressed: java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) [rt.jar:1.8.0_131]
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) [rt.jar:1.8.0_131]
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:82)
> at org.infinispan.remoting.transport.Transport.invokeRemotely(Transport.java:71)
> at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:540)
> at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:523)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:334)
> at org.infinispan.topology.ClusterTopologyManagerImpl.access$500(ClusterTopologyManagerImpl.java:85)
> at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$0(ClusterTopologyManagerImpl.java:745)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:144)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:33)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:174)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
> ... 1 more
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> ... 1 more
> [CIRCULAR REFERENCE:java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1]
> 09:27:18,530 WARN [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p16-t4) ISPN000197: Error updating cluster member list: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]
> Suppressed: java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) [rt.jar:1.8.0_131]
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) [rt.jar:1.8.0_131]
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:82)
> at org.infinispan.remoting.transport.Transport.invokeRemotely(Transport.java:71)
> at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:540)
> at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:523)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:334)
> at org.infinispan.topology.ClusterTopologyManagerImpl.access$500(ClusterTopologyManagerImpl.java:85)
> at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$0(ClusterTopologyManagerImpl.java:745)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:144)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:33)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:174)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
> ... 1 more
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_131]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
> ... 1 more
> [CIRCULAR REFERENCE:java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 4 from node-1]
> {code}
> There is default JGroups udp stack configured which is used by Infinispan. Both of the servers (jgroups udp) are bound to 127.0.0.1. Node 2 has port offset 1000.
> Attaching thread dump from node 1 when it hangs during start.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 9 months
[JBoss JIRA] (WFLY-10091) GC logging is not working on Java 9
by Radovan Stancel (JIRA)
[ https://issues.jboss.org/browse/WFLY-10091?page=com.atlassian.jira.plugin... ]
Radovan Stancel reassigned WFLY-10091:
--------------------------------------
Assignee: Radovan Stancel
> GC logging is not working on Java 9
> -----------------------------------
>
> Key: WFLY-10091
> URL: https://issues.jboss.org/browse/WFLY-10091
> Project: WildFly
> Issue Type: Bug
> Components: Scripts
> Reporter: Rostislav Svoboda
> Assignee: Radovan Stancel
> Priority: Blocker
> Labels: blocker-WF14
>
> GC logging is not working on Java 9. This is related to the fact G1 GC is default and there are changes in configuration option.
> {code}
> GC_LOG="true" ./standalone.sh
> {code}
> Java 8:
> {code}
> JAVA_OPTS: -server -verbose:gc -Xloggc:"/Users/rsvoboda/TESTING/7.2.0.CD12.CR1/jboss-eap-7.2/standalone/log/gc.log"
> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5
> -XX:GCLogFileSize=3M -XX:-TraceClassUnloading -Xms1303m -Xmx1303m -XX:MetaspaceSize=96M
> -XX:MaxMetaspaceSize=256m -Djava.net.preferIPv4Stack=true -Djboss.modules.system.pkgs=org.jboss.byteman -Djava.awt.headless=true
> {code}
> Java 9:
> {code}
> JAVA_OPTS: -server -Xms1303m -Xmx1303m -XX:MetaspaceSize=96M -XX:MaxMetaspaceSize=256m
> -Djava.net.preferIPv4Stack=true -Djboss.modules.system.pkgs=org.jboss.byteman -Djava.awt.headless=true
> {code}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 9 months
[JBoss JIRA] (WFCORE-3980) Changing multicast address in socket binding does not trigger server to set reload-required
by Paul Ferraro (JIRA)
Paul Ferraro created WFCORE-3980:
------------------------------------
Summary: Changing multicast address in socket binding does not trigger server to set reload-required
Key: WFCORE-3980
URL: https://issues.jboss.org/browse/WFCORE-3980
Project: WildFly Core
Issue Type: Bug
Components: Management
Affects Versions: 6.0.0.Alpha3
Reporter: Paul Ferraro
Assignee: Jeff Mesnil
Found this by accident when debugging a clustering test case.
If I start a server with profile standalone-full-ha and then modify the multicast address of a clustering related socket-binding, in this case, jgroups-mping, the server is not put into state reload-required.
This causes a lot of problems in the testsuite which depends on configuring servers after startup and then restarting if necessary to pull in the changes.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 9 months
[JBoss JIRA] (WFLY-10748) Changing multicast address in socket binding does not trigger server to set reload-required
by Paul Ferraro (JIRA)
[ https://issues.jboss.org/browse/WFLY-10748?page=com.atlassian.jira.plugin... ]
Paul Ferraro moved WFCORE-3980 to WFLY-10748:
---------------------------------------------
Project: WildFly (was: WildFly Core)
Key: WFLY-10748 (was: WFCORE-3980)
Component/s: Clustering
(was: Management)
Affects Version/s: 13.0.0.Final
(was: 6.0.0.Alpha3)
> Changing multicast address in socket binding does not trigger server to set reload-required
> -------------------------------------------------------------------------------------------
>
> Key: WFLY-10748
> URL: https://issues.jboss.org/browse/WFLY-10748
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 13.0.0.Final
> Reporter: Paul Ferraro
> Assignee: Jeff Mesnil
>
> Found this by accident when debugging a clustering test case.
> If I start a server with profile standalone-full-ha and then modify the multicast address of a clustering related socket-binding, in this case, jgroups-mping, the server is not put into state reload-required.
> This causes a lot of problems in the testsuite which depends on configuring servers after startup and then restarting if necessary to pull in the changes.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 9 months
[JBoss JIRA] (WFLY-10748) Changing multicast address in JGroups socket-binding does not trigger server to set reload-required
by Paul Ferraro (JIRA)
[ https://issues.jboss.org/browse/WFLY-10748?page=com.atlassian.jira.plugin... ]
Paul Ferraro updated WFLY-10748:
--------------------------------
Summary: Changing multicast address in JGroups socket-binding does not trigger server to set reload-required (was: Changing multicast address in socket binding does not trigger server to set reload-required)
> Changing multicast address in JGroups socket-binding does not trigger server to set reload-required
> ---------------------------------------------------------------------------------------------------
>
> Key: WFLY-10748
> URL: https://issues.jboss.org/browse/WFLY-10748
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 13.0.0.Final
> Reporter: Paul Ferraro
> Assignee: Jeff Mesnil
>
> Found this by accident when debugging a clustering test case.
> If I start a server with profile standalone-full-ha and then modify the multicast address of a clustering related socket-binding, in this case, jgroups-mping, the server is not put into state reload-required.
> This causes a lot of problems in the testsuite which depends on configuring servers after startup and then restarting if necessary to pull in the changes.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 9 months
[JBoss JIRA] (WFCORE-3977) Changing multicast address in socket binding does not trigger server to set reload-required
by Paul Ferraro (JIRA)
[ https://issues.jboss.org/browse/WFCORE-3977?page=com.atlassian.jira.plugi... ]
Paul Ferraro commented on WFCORE-3977:
--------------------------------------
[~brian.stansberry] No, the SocketBinding will be UP when the TransportConfiguration is UP. That's not the issue. The issue is that our protocol configuration objects extract the address/port from the socket binding. When sockets are eventually created, they are done so without any knowledge of the socket-binding from which the address/port was extracted.
So really, the fundamental issue is that our JGroups SocketFactory creates sockets via the "unnamed" variants of the SocketBindingManager. Since SocketBinding.isBound() relies on being able to lookup a socket binding in the manager by name, it will always return false for sockets created by JGroups.
It makes me wonder how many subsystem are actually doing this correctly. How many subsystems are actually creating sockets using the SocketBinding.createSocket(...) methods or using the named variants of the SocketBindingManager?
That said, JGroups SocketFactory _does_ use the concept of named sockets - however, its uses its own naming convention for the socket names. Thus we need to construct the SocketFactory with a Map<String, String> of JGroups socket names -> socket-binding names. That way, the corresponding SocketBindingManager methods are called with the correct socket-binding names.
So, in the end, this turns out to be a long standing JGroups subsystem bug. :)
> Changing multicast address in socket binding does not trigger server to set reload-required
> -------------------------------------------------------------------------------------------
>
> Key: WFCORE-3977
> URL: https://issues.jboss.org/browse/WFCORE-3977
> Project: WildFly Core
> Issue Type: Bug
> Components: Management
> Affects Versions: 6.0.0.Alpha3
> Reporter: Richard Achmatowicz
> Assignee: Jeff Mesnil
>
> Found this by accident when debugging a clustering test case.
> If I start a server with profile standalone-full-ha and then modify the multicast address of a clustering related socket-binding, in this case, jgroups-mping, the server is not put into state reload-required.
> This causes a lot of problems in the testsuite which depends on configuring servers after startup and then restarting if necessary to pull in the changes.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 9 months
[JBoss JIRA] (WFLY-10735) Unable to disable iiop port (non-ssl)
by Tomasz Adamski (JIRA)
[ https://issues.jboss.org/browse/WFLY-10735?page=com.atlassian.jira.plugin... ]
Tomasz Adamski closed WFLY-10735.
---------------------------------
Fix Version/s: 14.0.0.CR1
Resolution: Done
This issue was fixed and is working correctly in WF14.
> Unable to disable iiop port (non-ssl)
> -------------------------------------
>
> Key: WFLY-10735
> URL: https://issues.jboss.org/browse/WFLY-10735
> Project: WildFly
> Issue Type: Bug
> Affects Versions: 13.0.0.Final
> Reporter: Derek Horton
> Assignee: Tomasz Adamski
> Fix For: 14.0.0.CR1
>
>
> Wildfly 13.0.0.Final
> Removing the "iiop" socket-binding from the orb configuration:
> <subsystem xmlns="urn:jboss:domain:iiop-openjdk:2.0">
> <orb ssl-socket-binding="iiop-ssl"/>
> <initializers security="identity" transactions="spec"/>
> <security server-requires-ssl="true" client-requires-ssl="true" support-ssl="true" security-domain="jsse"/>
> </subsystem>
> and commenting out the iiop socket-binding leads to the following error:
> 2018-07-24 14:47:29,315 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0013: Operation ("add") failed - address: ([("subsystem" => "iiop-openjdk")]) - failure description: {
> "WFLYCTL0412: Required services that are not installed:" => ["jboss.binding.iiop"],
> "WFLYCTL0180: Services with missing/unavailable dependencies" => ["jboss.iiop-openjdk.orb-service is missing [jboss.binding.iiop]"]
> }
> 2018-07-24 14:47:29,351 INFO [org.jboss.as.controller] (Controller Boot Thread) WFLYCTL0183: Service status report
> WFLYCTL0184: New missing/unsatisfied dependencies:
> service jboss.binding.iiop (missing) dependents: [service jboss.iiop-openjdk.orb-service]
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 9 months