[infinispan-issues] [JBoss JIRA] (ISPN-1581) Improve resiliency of retrying commits on state transfer
Erik Salter (Commented) (JIRA)
jira-events at lists.jboss.org
Wed Nov 30 18:11:40 EST 2011
[ https://issues.jboss.org/browse/ISPN-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646945#comment-12646945 ]
Erik Salter commented on ISPN-1581:
-----------------------------------
Here's a test we ran where we killed a node with kill -9, then immediately restarted it. This produced many stale locks.
The node in question is dht10. Here are the full trace logs.
http://dl.dropbox.com/u/50401510/5.1.0.CR1/StaleLock/dht10/dht10.rar
http://dl.dropbox.com/u/50401510/5.1.0.CR1/StaleLock/dht11/dht11.rar
http://dl.dropbox.com/u/50401510/5.1.0.CR1/StaleLock/dht12/dht12.rar
http://dl.dropbox.com/u/50401510/5.1.0.CR1/StaleLock/dht13/dht13.rar
http://dl.dropbox.com/u/50401510/5.1.0.CR1/StaleLock/dht14/dht14.rar
http://dl.dropbox.com/u/50401510/5.1.0.CR1/StaleLock/dht15/dht15.rar
> Improve resiliency of retrying commits on state transfer
> ---------------------------------------------------------
>
> Key: ISPN-1581
> URL: https://issues.jboss.org/browse/ISPN-1581
> Project: Infinispan
> Issue Type: Enhancement
> Affects Versions: 5.1.0.CR1
> Reporter: Erik Salter
> Assignee: Manik Surtani
>
> The current implementation of ISPN-1484 will retry a commit on a remote node up to 3 times. This is resilient to 3 state transfers happening in rapid succession. However, if the cluster loses more than 3 nodes, there can still be stale locks.
> This is evident when testing with the TopologyAwareConsistentHash. I lost a "site" consisting of 4 nodes, and I was able to get stale locks.
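
To illustrate why a fixed retry count falls short here, the following is a minimal sketch of a bounded commit retry. It is not Infinispan code. The names `commitWithRetries` and `interruptedBy` are hypothetical, and it assumes that each lost node triggers its own state transfer, that each state transfer interrupts exactly one commit attempt, and that "up to 3 times" means 3 retries after the initial attempt.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class CommitRetrySketch {

    // Hypothetical bounded retry: one initial attempt plus up to maxRetries retries.
    // Returns false when the retries are exhausted, which is the case where the
    // lock held on the remote node could be left stale.
    static boolean commitWithRetries(BooleanSupplier commitAttempt, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (commitAttempt.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    // Simulated commit attempt (an assumption for illustration): it fails once
    // per node loss, then succeeds on every later attempt.
    static BooleanSupplier interruptedBy(int nodeLosses) {
        AtomicInteger remaining = new AtomicInteger(nodeLosses);
        return () -> remaining.getAndDecrement() <= 0;
    }

    public static void main(String[] args) {
        // 3 node losses: the third retry gets through.
        System.out.println(commitWithRetries(interruptedBy(3), 3)); // prints "true"
        // 4 node losses (the lost "site"): all attempts are interrupted.
        System.out.println(commitWithRetries(interruptedBy(4), 3)); // prints "false"
    }
}
```

Under these assumptions, any fixed limit can be exceeded by losing enough nodes at once, so a more resilient approach would presumably tie retries to topology changes rather than to a constant.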