[
https://issues.jboss.org/browse/ISPN-2373?page=com.atlassian.jira.plugin....
]
Adrian Nistor commented on ISPN-2373:
-------------------------------------
It appears that ClusterTopologyManager.triggerRebalance submits an async task that in turn
submits a second async task and then waits for it to complete. If the second task is
queued because the first one took the last available thread in the pool, and all other
threads in the pool are blocked for the same reason, we get a deadlock. It clears only
when one of the tasks is aborted with a TimeoutException (usually after 4
minutes).
To fix this we need a separate executor service, without a queue, for all
rebalance-related async tasks. Its RejectedExecutionHandler should be
CallerRunsPolicy.
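A minimal sketch of the kind of executor proposed above, not the actual Infinispan change. The class name, the factory method newRebalanceExecutor, the helper runNested and the pool size of 2 are all hypothetical; only ThreadPoolExecutor, SynchronousQueue and CallerRunsPolicy are real java.util.concurrent classes. The sketch uses lambdas, so it assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RebalanceExecutorSketch {

    // Hypothetical factory: a bounded pool with no task queue.
    // SynchronousQueue holds no elements, so a task is either handed to a
    // thread immediately or rejected; CallerRunsPolicy then runs the
    // rejected task in the submitting thread instead of parking it.
    static ThreadPoolExecutor newRebalanceExecutor(int maxThreads) {
        return new ThreadPoolExecutor(
                0, maxThreads,
                60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Mimics the problematic pattern: each outer task submits an inner
    // task to the same executor and blocks until it completes.
    static int runNested(ExecutorService executor, int outerTasks) throws Exception {
        List<Future<Integer>> outer = new ArrayList<>();
        for (int i = 0; i < outerTasks; i++) {
            final int id = i;
            outer.add(executor.submit(() -> {
                Future<Integer> inner = executor.submit(() -> id * 2);
                return inner.get(10, TimeUnit.SECONDS);
            }));
        }
        int sum = 0;
        for (Future<Integer> f : outer) {
            sum += f.get(10, TimeUnit.SECONDS);
        }
        return sum;
    }

    public static void main(String[] args) throws Exception {
        // More outer tasks than threads: with a queueing pool this shape
        // can deadlock, here the excess work runs in the caller.
        ThreadPoolExecutor executor = newRebalanceExecutor(2);
        try {
            System.out.println(runNested(executor, 8)); // prints 56
        } finally {
            executor.shutdown();
        }
    }
}
```

Because nothing is ever queued, an inner task that cannot get a pool thread is executed by the thread that submitted it, so the outer task's wait returns as soon as that call finishes. The trade-off is that the submitting thread does the work itself, which limits parallelism when the pool is saturated.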
State transfer does not end because some segments are erroneously reported as unreceived
----------------------------------------------------------------------------------------
Key: ISPN-2373
URL: https://issues.jboss.org/browse/ISPN-2373
Project: Infinispan
Issue Type: Feature Request
Components: State transfer
Affects Versions: 5.2.0.Beta1
Reporter: Adrian Nistor
Assignee: Adrian Nistor
Priority: Critical
Fix For: 5.2.0.CR1
Hard to reproduce. I lost the last log where this was visible but still have a stack
trace:
org.infinispan.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl
    at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:205)
    at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:879)
    at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:650)
    at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:639)
    at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:542)
    at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:197)
    at org.infinispan.CacheImpl.start(CacheImpl.java:517)
    at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:689)
    at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:652)
    at org.infinispan.manager.DefaultCacheManager.access$100(DefaultCacheManager.java:126)
    at org.infinispan.manager.DefaultCacheManager$1.run(DefaultCacheManager.java:574)
Caused by: org.infinispan.CacheException: Initial state transfer timed out for cache LuceneIndexesMetadata on PersistentStateTransferQueryDistributedIndexTest-NodeC-6067
    at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:199)
    at sun.reflect.GeneratedMethodAccessor139.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203)
    ... 10 more