[infinispan-issues] [JBoss JIRA] (ISPN-2373) State transfer does not end because some segments are erroneously reported as unreceived

Wednesday, 24 October 2012

    [ https://issues.jboss.org/browse/ISPN-2373?page=com.atlassian.jira.plugin.... ]

Adrian Nistor commented on ISPN-2373:
-------------------------------------

It appears that ClusterTopologyManager.triggerRebalance submits an async task that in turn
submits a second async task and then waits for it to complete. If the second task is
queued because the first one took the last available thread in the pool, and all other
threads in the pool are blocked for the same reason, we get a deadlock. The deadlock
clears only when one of the tasks is aborted with a TimeoutException (usually after 4
minutes).
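
The starvation pattern can be reproduced outside Infinispan with a plain fixed-size pool. The
sketch below is an illustration only, not the ClusterTopologyManager code: the class and
method names are hypothetical, and the timeout is far shorter than the 4 minutes mentioned
above. Every pool thread runs an outer task that submits an inner task to the same pool and
waits for it, so the inner tasks stay queued until an outer task gives up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class, not part of Infinispan.
class NestedSubmitDeadlock {

    // Returns true if at least one outer task timed out waiting for its inner task.
    static boolean innerTaskStarves(int poolSize, long timeoutMillis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        // Ensures every pool thread is occupied by an outer task before any inner task is submitted.
        CountDownLatch allOutersRunning = new CountDownLatch(poolSize);
        try {
            List<Future<Boolean>> outers = new ArrayList<>();
            for (int i = 0; i < poolSize; i++) {
                outers.add(pool.submit(() -> {
                    allOutersRunning.countDown();
                    allOutersRunning.await();
                    // All threads are busy now, so this task can only be queued.
                    Future<String> inner = pool.submit(() -> "done");
                    try {
                        inner.get(timeoutMillis, TimeUnit.MILLISECONDS);
                        return false;
                    } catch (TimeoutException e) {
                        // Only the timeout releases the thread the inner task needs.
                        return true;
                    }
                }));
            }
            boolean starved = false;
            for (Future<Boolean> outer : outers) {
                starved |= outer.get();
            }
            return starved;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("inner task starved: " + innerTaskStarves(2, 500));
    }
}
```

No inner task can start until some outer task returns, and an outer task returns only after
its wait times out, which matches the behaviour described in the comment.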

To fix this we need a separate executor service, one that does not have a queue, for all
rebalance-related async tasks. Its RejectedExecutionHandler should be
CallerRunsPolicy.
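
A minimal sketch of such an executor, assuming the standard java.util.concurrent classes are
used: a SynchronousQueue gives the pool no queueing capacity, and CallerRunsPolicy makes a
rejected task run on the submitting thread. The class name, method names, and pool sizes are
hypothetical; this is not the actual Infinispan fix.

```java
import java.util.concurrent.Future;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Infinispan.
class RebalanceExecutorSketch {

    // Builds an executor that never queues: a task is handed to an idle thread,
    // started on a new thread (up to maxThreads), or run by the caller.
    static ThreadPoolExecutor newQueuelessExecutor(int maxThreads) {
        return new ThreadPoolExecutor(
                0, maxThreads,
                60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Runs an outer task that submits an inner task to the same executor and waits for it.
    static String runNested(int maxThreads, long timeoutMillis) throws Exception {
        ThreadPoolExecutor executor = newQueuelessExecutor(maxThreads);
        try {
            Future<String> outer = executor.submit(() -> {
                // If no thread is free, the inner task runs right here on the
                // outer task's thread, so the wait below cannot starve.
                Future<String> inner = executor.submit(() -> "inner done");
                return inner.get(timeoutMillis, TimeUnit.MILLISECONDS);
            });
            return outer.get(2 * timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Even with a single thread the nested submit completes.
        System.out.println(runNested(1, 1000));
    }
}
```

The trade-off is that a saturated pool pushes work back onto the submitting thread, which
slows the submitter down but never leaves a task waiting behind the task that depends on it.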

...
 State transfer does not end because some segments are erroneously
reported as unreceived
 ----------------------------------------------------------------------------------------

                 Key: ISPN-2373
                 URL: https://issues.jboss.org/browse/ISPN-2373
             Project: Infinispan
          Issue Type: Feature Request
          Components: State transfer
    Affects Versions: 5.2.0.Beta1
            Reporter: Adrian Nistor
            Assignee: Adrian Nistor
            Priority: Critical
             Fix For: 5.2.0.CR1

 Hard to reproduce. I lost the last log where this was visible but still have a stack
trace:
 org.infinispan.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl
         at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:205)
         at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:879)
         at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:650)
         at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:639)
         at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:542)
         at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:197)
         at org.infinispan.CacheImpl.start(CacheImpl.java:517)
         at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:689)
         at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:652)
         at org.infinispan.manager.DefaultCacheManager.access$100(DefaultCacheManager.java:126)
         at org.infinispan.manager.DefaultCacheManager$1.run(DefaultCacheManager.java:574)
 Caused by: org.infinispan.CacheException: Initial state transfer timed out for cache LuceneIndexesMetadata on PersistentStateTransferQueryDistributedIndexTest-NodeC-6067
         at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:199)
         at sun.reflect.GeneratedMethodAccessor139.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203)
         ... 10 more
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
