[infinispan-issues] [JBoss JIRA] Assigned: (ISPN-1182) Failure after TimeoutException during the restart of HotRod Server

Tue Jun 14 10:32:29 EDT 2011

     [ https://issues.jboss.org/browse/ISPN-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manik Surtani reassigned ISPN-1182:
-----------------------------------

    Assignee: Galder Zamarreño  (was: Manik Surtani)


> Failure after TimeoutException during the restart of HotRod Server
> ------------------------------------------------------------------
>
>                 Key: ISPN-1182
>                 URL: https://issues.jboss.org/browse/ISPN-1182
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Cache Server
>    Affects Versions: 5.0.0.CR4
>            Reporter: Jacek Gerbszt
>            Assignee: Galder Zamarreño
>             Fix For: 5.0.0.CR6
>
>         Attachments: hotrodexception.txt
>
>
> Sometimes, when restarting 3 or more HotRod nodes in a 25-node cluster, I receive a replication timeout exception, after which the node is unusable.
> The timeout comes from replacing the view in HotrodServer.addSelfToTopologyView. If 3 nodes try to replace the same cache entry at the same time, it is no surprise that they end up in some kind of deadlock, which is correctly detected and broken after the timeout. Unfortunately, the resulting exception is not handled and aborts the HotRodServer start procedure. I suggest catching it in addSelfToTopologyView like this:
>             var updated = false
>             try {
>                 updated = topologyCache.replace("view", currentView, newView)
>             } catch {
>                 case e: TimeoutException => logUnableToReplaceView
>             }
> This way the exception no longer propagates out of the containing closure, and the updateTopologyView method gets a chance to try replacing the view again.
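
The catch-and-retry idea in the report can be sketched as a small self-contained Scala program. This is only an illustration under stated assumptions, not the actual HotRodServer code: ViewCache, FlakyCache, addSelfToView and maxAttempts are hypothetical names, the cache is an in-memory map, and java.util.concurrent.TimeoutException stands in for the Infinispan timeout exception.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.TimeoutException

object TopologyViewRetry {
  // Hypothetical stand-in for the topology cache: only the atomic
  // three-argument replace used in the report is modelled.
  trait ViewCache {
    def get(key: String): Set[String]
    def replace(key: String, oldView: Set[String], newView: Set[String]): Boolean
  }

  // In-memory cache whose first `failures` replace calls time out,
  // imitating the replication timeout described in the report.
  final class FlakyCache(initial: Set[String], failures: Int) extends ViewCache {
    private val store = new ConcurrentHashMap[String, Set[String]]()
    store.put("view", initial)
    private var remainingFailures = failures

    def get(key: String): Set[String] = store.get(key)

    def replace(key: String, oldView: Set[String], newView: Set[String]): Boolean = {
      if (remainingFailures > 0) {
        remainingFailures -= 1
        throw new TimeoutException("simulated replication timeout")
      }
      // Succeeds only if the stored view still equals oldView.
      store.replace(key, oldView, newView)
    }
  }

  // Re-reads the view and retries until the replace succeeds or the
  // attempt budget is used up. A timeout counts as a failed attempt.
  def addSelfToView(cache: ViewCache, self: String, maxAttempts: Int): Boolean = {
    var updated = false
    var attempts = 0
    while (!updated && attempts < maxAttempts) {
      attempts += 1
      val currentView = cache.get("view")
      val newView = currentView + self
      try {
        updated = cache.replace("view", currentView, newView)
      } catch {
        case _: TimeoutException =>
          // Swallow the timeout so the loop can try again.
          updated = false
      }
    }
    updated
  }
}
```

With a cache built as new TopologyViewRetry.FlakyCache(Set("nodeA"), 2), a call to addSelfToView(cache, "nodeB", 5) absorbs the two simulated timeouts and succeeds on the third attempt. Bounding the number of attempts is a design choice of this sketch, so that a node does not spin forever if the timeouts persist.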

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira