[infinispan-issues] [JBoss JIRA] Commented: (ISPN-765) Concurrent startup leading to rehashing issues

Tuesday, 30 November 2010

    [
https://jira.jboss.org/browse/ISPN-765?page=com.atlassian.jira.plugin.sys...
] 

Mircea Markus commented on ISPN-765:
------------------------------------

Looking at the attached logs, there is a cache named "volatileCache" in there
which is not defined in infinispan.xml. I don't think the supplied configuration file
(infinispan.xml) was used to produce these logs. Also the jgroups file was couldn't
work with 4.2.CR3 (see my last note). I'm confused.

...
 Concurrent startup leading to rehashing issues
 ----------------------------------------------

                 Key: ISPN-765
                 URL: https://jira.jboss.org/browse/ISPN-765
             Project: Infinispan
          Issue Type: Bug
          Components: Distributed Cache
    Affects Versions: 4.2.0.BETA1
            Reporter: Galder Zamarreño
            Assignee: Mircea Markus
            Priority: Critical
             Fix For: 4.2.0.CR3, 4.2.0.Final

         Attachments: coordinator.log, infinispan-jgroups.xml, infinispan.xml,
infinispan_test.log, slave1.log, slave2.log

 From ISPN-762, there seems to be a rehashing problem when several distributed caches are
started concurrently:
 Logs can be found in:
https://jira.jboss.org/secure/attachment/12338527/topologycache.diff
 "Even though all the caches are defined in the right place, there is still a problem
during starting 3 nodes at a time. Look at the errors coming from the log files: 
 * wp-60814 log: 
 2010-11-08 14:29:26,687 INFO [org.infinispan.distribution.DistributionManagerImpl]
(Incoming-2,wp-60814) This is a JOIN event! Wait for notification from new joiner wp-59660

 2010-11-08 14:31:26,668 ERROR [org.infinispan.distribution.JoinTask] (Rehasher-wp-60814)
Caught exception! 
 org.infinispan.CacheException: org.infinispan.util.concurrent.TimeoutException: Timed out
after 120 seconds waiting for a response from wp-34493 
         at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommands(CommandAwareRpcDispatcher.java:122)

         at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:403)

         at
org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:101) 
         at
org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:125) 
         at org.infinispan.distribution.JoinTask.retrieveOldCH(JoinTask.java:192) 
         at org.infinispan.distribution.JoinTask.performRehash(JoinTask.java:87) 
         at org.infinispan.distribution.RehashTask.call(RehashTask.java:53) 
         at org.infinispan.distribution.RehashTask.call(RehashTask.java:33) 
         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) 
         at java.util.concurrent.FutureTask.run(FutureTask.java:138) 
         at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) 
         at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) 
         at java.lang.Thread.run(Thread.java:619) 
 Caused by: org.infinispan.util.concurrent.TimeoutException: Timed out after 120 seconds
waiting for a response from wp-34493 
         at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$ReplicationTask.call(CommandAwareRpcDispatcher.java:304)

         at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommands(CommandAwareRpcDispatcher.java:120)

         ... 12 more 
 * wp-34493 log: 
 2010-11-08 14:29:26,654 INFO [org.infinispan.distribution.DistributionManagerImpl]
(Incoming-2,wp-34493) This is a JOIN event! Wait for notification from 
 new joiner wp-59660 
 2010-11-08 14:29:36,759 ERROR [org.infinispan.interceptors.InvocationContextInterceptor]
(InfinispanServer-Main) Execution error: 
 org.infinispan.util.concurrent.TimeoutException: Timed out waiting for valid responses! 
         at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:421)

         at
org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:101) 
 ... 
         at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:273)

         at org.infinispan.CacheDelegate.putIfAbsent(CacheDelegate.java:453) 
         at org.infinispan.CacheSupport.putIfAbsent(CacheSupport.java:40) 
         at
org.infinispan.server.hotrod.HotRodServer$$anonfun$1.apply(HotRodServer.scala:85) 
         at
org.infinispan.server.hotrod.HotRodServer$$anonfun$1.apply(HotRodServer.scala:75) 
         at
org.infinispan.server.hotrod.HotRodServer.isViewUpdated(HotRodServer.scala:102) 
         at
org.infinispan.server.hotrod.HotRodServer.org$infinispan$server$hotrod$HotRodServer$$updateTopologyView(HotRodServer.scala:97)

         at
org.infinispan.server.hotrod.HotRodServer.addSelfToTopologyView(HotRodServer.scala:75) 
         at
org.infinispan.server.hotrod.HotRodServer.startTransport(HotRodServer.scala:63) 
         at
org.infinispan.server.core.AbstractProtocolServer.start(AbstractProtocolServer.scala:70) 
         at org.infinispan.server.hotrod.HotRodServer.start(HotRodServer.scala:44) 
 ... 
 * wp-59660 log: 
 2010-11-08 14:29:26,699 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport]
(InfinispanServer-Main) Received new cluster view: [wp-34493|2] [wp-34493, wp-60814,
wp-59660] 
 2010-11-08 14:29:26,756 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport]
(InfinispanServer-Main) Cache local address is wp-59660, physical addresses are
[10.0.36.136:34302] 
 2010-11-08 14:29:26,851 ERROR [org.infinispan.remoting.InboundInvocationHandlerImpl]
(OOB-2,wp-59660) Defined caches: [___hotRodTopologyCache] 
 2010-11-08 14:29:26,851 INFO [org.infinispan.remoting.InboundInvocationHandlerImpl]
(OOB-2,wp-59660) Will try and wait for the cache to start 
 2010-11-08 14:29:56,864 INFO [org.infinispan.remoting.InboundInvocationHandlerImpl]
(OOB-2,wp-59660) Cache named ___hotRodTopologyCache does not exist on this cache manager!

 2010-11-08 14:49:26,897 ERROR [org.infinispan.distribution.JoinTask] (Rehasher-wp-59660)
Caught exception! 
 org.infinispan.CacheException: Unable to retrieve old consistent hash from coordinator
even after several attempts at sleeping and retrying! 
         at org.infinispan.distribution.JoinTask.retrieveOldCH(JoinTask.java:218) 
         at org.infinispan.distribution.JoinTask.performRehash(JoinTask.java:87) 
         at org.infinispan.distribution.RehashTask.call(RehashTask.java:53) 
         at org.infinispan.distribution.RehashTask.call(RehashTask.java:33) 
         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) 
         at java.util.concurrent.FutureTask.run(FutureTask.java:138) 
         at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) 
         at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) 
         at java.lang.Thread.run(Thread.java:619) 
 The first node reaches rehashRpcTimeout=120000 while waiting on the old hash. The second
one reaches 10s timeout during adding himself to the topology view. And the third one
fails on retrieving old hash, but after much longer time = 20min. 
 It looks as though we had a kind of deadlock." 
-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

[infinispan-issues] [JBoss JIRA] Commented: (ISPN-765) Concurrent startup leading to rehashing issues