[JBoss JIRA] (ISPN-6599) PutAll operation in the Hot Rod client only partially completed during topology changes

Thursday, 12 May 2016



    [ https://issues.jboss.org/browse/ISPN-6599?page=com.atlassian.jira.plugin.... ]

Gustavo Fernandes edited comment on ISPN-6599 at 5/12/16 12:00 PM:
-------------------------------------------------------------------

I tried running the reproducer using the linked PR, and I am having issues starting the
servers: very often they fail to start and hang with errors:
{noformat}
2016-05-12 14:33:14,598 ERROR [org.jgroups.protocols.UNICAST3] (OOB-20,server2)
JGRP000039: server2: failed to deliver OOB message [dst: server2, src: server0 (4
headers), size=151 bytes, flags=OOB|DONT_BUNDLE|NO_TOTAL_ORDER]:
java.lang.NullPointerException
...
2016-05-12 14:33:14,598 ERROR [org.jgroups.protocols.UNICAST3] (OOB-18,server2)
JGRP000039: server2: failed to deliver OOB message [dst: server2, src: server1 (4
headers), size=151 bytes, flags=OOB|DONT_BUNDLE|NO_TOTAL_ORDER]:
java.lang.NullPointerException
...
2016-05-12 14:33:20,598 ERROR [org.infinispan.CLUSTER] (transport-thread--p4-t1)
ISPN000196: Failed to recover cluster state after the current node became the coordinator
(or after merge): java.util.concurrent.ExecutionException:
org.infinispan.util.concurrent.TimeoutException: Replication timeout for server0
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1907)
        at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:75)
        at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:578)
        at org.infinispan.topology.ClusterTopologyManagerImpl.recoverClusterStatus(ClusterTopologyManagerImpl.java:448)
        at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:365)
        at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$328(ClusterTopologyManagerImpl.java:717)
        at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener$$Lambda$41/1877974992.call(Unknown Source)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.infinispan.executors.SemaphoreCompletionService$QueueingTask.runInternal(SemaphoreCompletionService.java:172)
        at org.infinispan.executors.SemaphoreCompletionService$QueueingTask.run(SemaphoreCompletionService.java:151)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
2016-05-12 14:33:20,599 FATAL [org.infinispan.CLUSTER] (transport-thread--p4-t1)
ISPN100004: After merge (or coordinator change), the coordinator failed to recover
cluster. Cluster members are [server2, server1, server0].
{noformat}

Attached are the trace logs for all servers, plus a thread dump:
[^start.zip]



...
 PutAll operation in the Hot Rod client only partially completed
during topology changes
 ---------------------------------------------------------------------------------------

                 Key: ISPN-6599
                 URL: https://issues.jboss.org/browse/ISPN-6599
             Project: Infinispan
          Issue Type: Bug
          Components: Server
    Affects Versions: 9.0.0.Alpha1
            Reporter: Gustavo Fernandes
            Assignee: Dan Berindei
         Attachments: reproducer.zip, start.zip, trace.zip

 
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
