[infinispan-issues] [JBoss JIRA] (ISPN-2892) View installation loop when restarting cache on multiple nodes
Dennis Reed (JIRA)
jira-events at lists.jboss.org
Wed Mar 6 12:14:57 EST 2013
[ https://issues.jboss.org/browse/ISPN-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759121#comment-12759121 ]
Dennis Reed commented on ISPN-2892:
-----------------------------------
I would consider this critical for EAP 6.0.1, as it appears to be triggered by the very common use case of redeploying in EAP domain mode.
I believe it is just an individual cache being restarted. There is one CacheManager for the web subsystem, with one cache per application.
One application is being restarted across the domain, while others are still deployed, so the CacheManager itself stays running.
> It's also odd that the remote node would throw a "Received cache view prepare request after the local node has already shut down" exception while it's joining
The node must be receiving the new view (which is [node2], because the coordinator node1 is leaving) between the end of stop() and the beginning of start(), since that is the only time the listener is null.
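To make that window concrete, here is a minimal toy model of it. This is only a sketch, not Infinispan's actual code: ViewRaceSketch, ToyCache, ViewListener and handlePrepare are hypothetical names, and the only thing assumed from the analysis above is that the listener is unset between stop() and start().

```java
import java.util.concurrent.atomic.AtomicReference;

/** Toy model of the restart window; not Infinispan's actual implementation. */
public class ViewRaceSketch {

    /** Hypothetical stand-in for the per-cache view listener. */
    interface ViewListener {
        void prepareView(int viewId);
    }

    /** Hypothetical stand-in for a cache whose listener is unset while it is stopped. */
    static class ToyCache {
        private final AtomicReference<ViewListener> listener = new AtomicReference<>();
        private int lastPreparedViewId = -1;

        void start() {
            // Registering the listener is what makes prepare requests acceptable again.
            listener.set(viewId -> lastPreparedViewId = viewId);
        }

        void stop() {
            listener.set(null);
        }

        /** Models a prepare request arriving from the coordinator. */
        void handlePrepare(int viewId) {
            ViewListener current = listener.get();
            if (current == null) {
                throw new IllegalStateException(
                    "Received cache view prepare request after the local node has already shut down");
            }
            current.prepareView(viewId);
        }

        int lastPreparedViewId() {
            return lastPreparedViewId;
        }
    }

    /** Returns true if a prepare delivered between stop() and start() is rejected. */
    static boolean prepareRejectedDuringRestart() {
        ToyCache cache = new ToyCache();
        cache.start();
        cache.stop();
        try {
            cache.handlePrepare(18); // arrives inside the restart window
            return false;
        } catch (IllegalStateException expected) {
            return true;
        } finally {
            cache.start(); // the restart completes after the request was already rejected
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected during restart: " + prepareRejectedDuringRestart());
    }
}
```

In this model the same prepare request succeeds if it arrives before stop() or after start(), which is why the failure would only show up when the restart overlaps with the view change.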
> but if it's a race condition I believe waiting for a few seconds between stop and start should work around the problem.
Based on my analysis, I also believe this workaround would work.
Unfortunately, that is not simple to do in this particular use case, although I believe it *is* possible to use a deployment plan to avoid restarting in parallel across the nodes.
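For code that controls the cache lifecycle directly (which is not the case for an EAP redeploy), the wait-between-stop-and-start workaround could look roughly like the sketch below. DelayedRestartSketch and restartWithPause are hypothetical names, and the pause length is an assumption; nothing here has been verified against the actual race.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Sketch of the "pause between stop and start" workaround; names are hypothetical. */
public class DelayedRestartSketch {

    /**
     * Runs the stop action, waits for pauseMillis, then runs the start action.
     * If the wait is interrupted, the interrupt flag is restored and the start
     * action still runs, so the component is not left stopped.
     */
    static void restartWithPause(Runnable stop, Runnable start, long pauseMillis) {
        stop.run();
        try {
            TimeUnit.MILLISECONDS.sleep(pauseMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        start.run();
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        // Stand-ins for the real stop/start calls of the cache being restarted.
        restartWithPause(() -> events.add("stop"), () -> events.add("start"), 50);
        System.out.println(events);
    }
}
```

With a real cache, the two Runnable arguments would be the cache's own stop and start calls, and the pause would need to be long enough for the view change caused by the leaving coordinator to be installed.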
> View installation loop when restarting cache on multiple nodes
> --------------------------------------------------------------
>
> Key: ISPN-2892
> URL: https://issues.jboss.org/browse/ISPN-2892
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.1.7.Final
> Reporter: Dennis Reed
> Assignee: Mircea Markus
>
> Restarting a cache on multiple nodes at the same time can cause the following error:
> ERROR [org.infinispan.cacheviews.CacheViewsManagerImpl] (CacheViewInstaller-19,node1/web) ISPN000172: Failed to prepare view CacheView{viewId=18, members=[node2/web]} for cache default-host/test, rolling back to view CacheView{viewId=17, members=[node1/web, node2/web]}: java.util.concurrent.ExecutionException: org.infinispan.CacheException: java.lang.IllegalStateException: default-host/test: Received cache view prepare request after the local node has already shut down
> After the initial error, the following error began repeating every second for a few minutes until BaseStateTransferManagerImpl.waitForJoinToComplete() timed out and the cache failed to start:
> ERROR [org.infinispan.cacheviews.CacheViewsManagerImpl] (CacheViewInstaller-19,node1/web) ISPN000172: Failed to prepare view CacheView{viewId=21, members=[node2/web]} for cache default-host/test, rolling back to view CacheView{viewId=20, members=[]}: java.util.concurrent.ExecutionException: org.infinispan.CacheException: java.lang.IllegalStateException: Cannot prepare new view CacheView{viewId=21, members=[node2/web]} on cache default-host/test, we are currently preparing view CacheView{viewId=18, members=[node2/web]}