[infinispan-issues] [JBoss JIRA] (ISPN-1519) Stale locks on adding a new node to a cluster
Dan Berindei (Commented) (JIRA)
jira-events at lists.jboss.org
Sun Nov 20 05:35:40 EST 2011
[ https://issues.jboss.org/browse/ISPN-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644191#comment-12644191 ]
Dan Berindei commented on ISPN-1519:
------------------------------------
It looks like my fix for ISPN-1484 doesn't cover the case where the commit is async. As a workaround you can change the configuration so that syncCommitPhase = true.
A proper fix for the async commit scenario would involve retrying the commit on the remote node. In the sync commit scenario we must keep the retry logic on the originator, otherwise the commit command on the originator blocks the state transfer from finishing.
Doing both is tricky at the moment: even with syncCommitPhase = false, the commit RPC can still be sent synchronously (if the owners changed between prepare and commit), and the remote nodes don't know whether the RPC is synchronous or not.
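To illustrate why a remote node cannot infer the RPC mode from the configuration alone, here is a minimal sketch of how an originator might pick the commit mode. It assumes the commit is forced synchronous whenever a commit recipient was not among the nodes that received the prepare. The class, the method, and the node name data-grid-6 are hypothetical and are not Infinispan API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CommitModeSketch {

    /**
     * Hypothetical helper: decides whether the commit RPC goes out synchronously.
     *
     * @param syncCommitPhase  the configured setting
     * @param preparedOn       nodes that received the prepare
     * @param commitRecipients current owners that must receive the commit
     */
    static boolean sendCommitSynchronously(boolean syncCommitPhase,
                                           Set<String> preparedOn,
                                           Set<String> commitRecipients) {
        if (syncCommitPhase) {
            return true; // configured sync: always synchronous
        }
        // Assumption: if any recipient never saw the prepare (the owners
        // changed between prepare and commit), force a synchronous commit.
        for (String node : commitRecipients) {
            if (!preparedOn.contains(node)) {
                return true;
            }
        }
        return false; // same owners, async configured: send asynchronously
    }

    public static void main(String[] args) {
        Set<String> preparedOn =
                new HashSet<>(Arrays.asList("data-grid-4", "data-grid-5"));
        // data-grid-6 stands in for a node that joined after the prepare.
        Set<String> afterJoin =
                new HashSet<>(Arrays.asList("data-grid-4", "data-grid-6"));

        System.out.println(sendCommitSynchronously(false, preparedOn, preparedOn));
        System.out.println(sendCommitSynchronously(false, preparedOn, afterJoin));
    }
}
```

In this sketch the configured flag and the mode actually used can differ, and only the originator has the information needed to tell them apart. A receiver-side retry would therefore need the mode carried in the command itself or signalled some other way.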
> Stale locks on adding a new node to a cluster
> ---------------------------------------------
>
> Key: ISPN-1519
> URL: https://issues.jboss.org/browse/ISPN-1519
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency
> Affects Versions: 5.0.1.FINAL
> Reporter: Erik Salter
> Assignee: Dan Berindei
>
> There is an issue with stale locks when a new node joins a running cluster.
> There are three caches all using eagerLockSingleNode=true with the transaction running on the primary data owner. The local locks for the affected keys are acquired, and the prepare is sent to the backup owner. During this time, the new node is detected and joins the cluster. The transaction times out waiting for the transaction lock, and a rollback is attempted. The local keys are unlocked, but the remotely acquired locks are never released.
> I have full trace log files at:
> http://dl.dropbox.com/u/10929737/5.0.1-stale-lock/data-grid-4/server.log.2011-11-09.log.31.gz
> http://dl.dropbox.com/u/10929737/5.0.1-stale-lock/data-grid-4/server.log.2011-11-09.log.32.gz
> http://dl.dropbox.com/u/10929737/5.0.1-stale-lock/data-grid-5/server.log.2011-11-09.log.rar
> The transaction in question is GlobalTransaction:<data-grid-4-61247>:169901, found on data-grid-4. I have verified that the primary owner of the keys in question is data-grid-4.