[infinispan-issues] [JBoss JIRA] (ISPN-2373) State transfer does not end because some segments are erroneously reported as unreceived

Adrian Nistor (JIRA) jira-events at lists.jboss.org
Tue Oct 23 17:18:01 EDT 2012


    [ https://issues.jboss.org/browse/ISPN-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728720#comment-12728720 ] 

Adrian Nistor commented on ISPN-2373:
-------------------------------------

After adding more logging to the code and analyzing the logs extensively, it seems that initial state transfer indeed does not happen for about 4 minutes, and this causes the tests to fail. This explains the InterruptedException thrown from StateTransferManagerImpl.waitForInitialStateTransferToComplete(). However, the root cause is not that segments were not received or not acknowledged as received, as was initially thought and described in the title. Instead, I found that rebalance does not even start during those first 4 minutes. The rebalance message is sent by a task submitted to the async executor service, and during tests this pool happens to be configured with a maximum of 4 threads. Such a small thread pool often leads to tasks being discarded. Unfortunately, the exception thrown in this case was not logged, so the problem stayed hidden until now. To fix this, I added logging that highlights the issue and increased the pool to 6 threads. With this change the suite always runs successfully; before it, the suite usually failed randomly with this exception after just 2-3 runs.
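
The failure mode described above can be illustrated with a minimal, self-contained sketch using a plain java.util.concurrent.ThreadPoolExecutor. This is not the Infinispan code: the class name RejectedTaskDemo, the SynchronousQueue hand-off, and the default AbortPolicy are assumptions chosen so that the rejection is deterministic. The actual queue and rejection policy of the async executor in the test configuration are not stated in this comment.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Infinispan.
final class RejectedTaskDemo {

    // Submits `tasks` blocking tasks to a pool capped at `maxThreads`
    // and returns how many submissions the pool rejected.
    static int countRejected(int maxThreads, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        // Assumption: no task queue (SynchronousQueue) and the default
        // AbortPolicy, so a saturated pool throws instead of queueing.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    // Each task holds its thread until the latch is released.
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    // Without logging here, the discarded task goes unnoticed.
                    rejected++;
                    System.err.println("Task " + i + " rejected: " + e.getMessage());
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
            try {
                pool.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("max 4 threads, 6 tasks: " + countRejected(4, 6) + " rejected");
        System.out.println("max 6 threads, 6 tasks: " + countRejected(6, 6) + " rejected");
    }
}
```

Under these assumptions, submitting 6 long-running tasks to a pool capped at 4 threads rejects 2 of them, while a pool of 6 threads accepts all of them. If the caller swallows the RejectedExecutionException, the discarded task (here, the one that would send the rebalance message) disappears without a trace.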
                
> State transfer does not end because some segments are erroneously reported as unreceived
> ----------------------------------------------------------------------------------------
>
>                 Key: ISPN-2373
>                 URL: https://issues.jboss.org/browse/ISPN-2373
>             Project: Infinispan
>          Issue Type: Feature Request
>          Components: State transfer
>    Affects Versions: 5.2.0.Beta1
>            Reporter: Adrian Nistor
>            Assignee: Adrian Nistor
>            Priority: Critical
>             Fix For: 5.2.0.CR1
>
>
> Hard to reproduce. I lost the last log where this was visible but still have a stack trace:
> org.infinispan.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl
>         at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:205)
>         at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:879)
>         at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:650)
>         at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:639)
>         at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:542)
>         at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:197)
>         at org.infinispan.CacheImpl.start(CacheImpl.java:517)
>         at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:689)
>         at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:652)
>         at org.infinispan.manager.DefaultCacheManager.access$100(DefaultCacheManager.java:126)
>         at org.infinispan.manager.DefaultCacheManager$1.run(DefaultCacheManager.java:574)
> Caused by: org.infinispan.CacheException: Initial state transfer timed out for cache LuceneIndexesMetadata on PersistentStateTransferQueryDistributedIndexTest-NodeC-6067
>         at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:199)
>         at sun.reflect.GeneratedMethodAccessor139.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203)
>         ... 10 more



