[infinispan-issues] [JBoss JIRA] (ISPN-2892) View installation loop when restarting cache on multiple nodes

Dennis Reed (JIRA) jira-events at lists.jboss.org
Wed Mar 6 00:39:56 EST 2013


    [ https://issues.jboss.org/browse/ISPN-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758817#comment-12758817 ] 

Dennis Reed commented on ISPN-2892:
-----------------------------------

The problem:

CacheViewsManagerImpl#clusterInstallView calls clusterPrepareView, which invokes handlePrepareView on all nodes.
Its finally block calls either commit or rollback, unless !isRunning(), in which case it calls neither.

CacheViewsManagerImpl#handlePrepareView on the remote node throws "Received cache view prepare request after the local node has already shut down", but only after it has called cacheViewInfo.prepareView to save the new view.

clusterInstallView catches the exception but, because !isRunning(), does not call rollback.

So the remote node never gets that view rolled back, and all subsequent PrepareView calls fail with "Cannot prepare new view y on cache foo, we are currently preparing view x".
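
The sequence above can be modeled with a small stand-alone sketch. This is not the Infinispan source: RemoteNode, Installer, and reproduce() are hypothetical stand-ins, and only the method names clusterInstallView and handlePrepareView and the two error messages are taken from the description. It assumes a single remote node and does not model what later rollback calls do on that node. It only shows how saving the view before the shutdown check, combined with skipping rollback when !isRunning(), leaves a pending view behind.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the reported flow; hypothetical, NOT the Infinispan code. */
class ViewRollbackSketch {

    /** Hypothetical stand-in for the view state kept on the remote node. */
    static class RemoteNode {
        Integer pendingViewId = null; // view currently being prepared, if any
        boolean running = true;

        void handlePrepareView(int viewId) {
            if (pendingViewId != null) {
                throw new IllegalStateException("Cannot prepare new view " + viewId
                        + ", we are currently preparing view " + pendingViewId);
            }
            pendingViewId = viewId; // the new view is saved first ...
            if (!running) {         // ... and the shutdown check only happens afterwards
                throw new IllegalStateException(
                        "Received cache view prepare request after the local node has already shut down");
            }
        }

        void handleCommitView() { pendingViewId = null; }

        void handleRollbackView() { pendingViewId = null; }
    }

    /** Hypothetical stand-in for the node that drives the view installation. */
    static class Installer {
        boolean running = true;
        final List<String> errors = new ArrayList<>();

        void clusterInstallView(RemoteNode remote, int viewId) {
            boolean success = false;
            try {
                remote.handlePrepareView(viewId);
                success = true;
            } catch (IllegalStateException e) {
                errors.add(e.getMessage());
            } finally {
                // Commit or rollback is sent only while running.
                if (running) {
                    if (success) {
                        remote.handleCommitView();
                    } else {
                        remote.handleRollbackView();
                    }
                }
            }
        }
    }

    /** Replays the scenario and returns the error messages in order. */
    static List<String> reproduce() {
        RemoteNode remote = new RemoteNode();
        Installer installer = new Installer();
        remote.running = false;    // both sides are stopping at the same time
        installer.running = false;

        installer.clusterInstallView(remote, 18); // prepare fails, no rollback is sent

        remote.running = true;     // the remote node comes back up
        try {
            remote.handlePrepareView(21); // view 18 is still pending
        } catch (IllegalStateException e) {
            installer.errors.add(e.getMessage());
        }
        return installer.errors;
    }

    public static void main(String[] args) {
        reproduce().forEach(System.out::println);
    }
}
```

In this model the first prepare fails with the "already shut down" message and the second with "currently preparing view 18", matching the order of the two errors in the report.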

                
> View installation loop when restarting cache on multiple nodes
> --------------------------------------------------------------
>
>                 Key: ISPN-2892
>                 URL: https://issues.jboss.org/browse/ISPN-2892
>             Project: Infinispan
>          Issue Type: Bug
>    Affects Versions: 5.1.7.Final
>            Reporter: Dennis Reed
>            Assignee: Mircea Markus
>
> Restarting a cache on multiple nodes at the same time can cause the following error:
> ERROR [org.infinispan.cacheviews.CacheViewsManagerImpl] (CacheViewInstaller-19,node1/web) ISPN000172: Failed to prepare view CacheView{viewId=18, members=[node2/web]} for cache default-host/test, rolling back to view CacheView{viewId=17, members=[node1/web, node2/web]}: java.util.concurrent.ExecutionException: org.infinispan.CacheException: java.lang.IllegalStateException: default-host/test: Received cache view prepare request after the local node has already shut down
> After the initial error, the following error began repeating every second for a few minutes until BaseStateTransferManagerImpl.waitForJoinToComplete() timed out and the cache failed to start:
> ERROR [org.infinispan.cacheviews.CacheViewsManagerImpl] (CacheViewInstaller-19,node1/web) ISPN000172: Failed to prepare view CacheView{viewId=21, members=[node2/web]} for cache default-host/test, rolling back to view CacheView{viewId=20, members=[]}: java.util.concurrent.ExecutionException: org.infinispan.CacheException: java.lang.IllegalStateException: Cannot prepare new view CacheView{viewId=21, members=[node2/web]} on cache default-host/test, we are currently preparing view CacheView{viewId=18, members=[node2/web]}


