[infinispan-issues] [JBoss JIRA] (ISPN-1602) Single view change causes stale locks

Dan Berindei (Commented) (JIRA) jira-events at lists.jboss.org
Thu Dec 15 16:31:09 EST 2011


    [ https://issues.jboss.org/browse/ISPN-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651417#comment-12651417 ] 

Dan Berindei commented on ISPN-1602:
------------------------------------

I got a similar situation in the test suite, but with 2 view changes:

{noformat}
Initial cluster view is A2 [A, B]
tx_A: sends CommitCommand cc to B
C: sends REQUEST_JOIN
A: sends PREPARE_VIEW(3, [A, B, C]) to B, C
A: blocks write commands for cache view 3
B: blocks write commands for cache view 3
tx_B: rejects cc with a StateTransferInProgressException
tx_A: waits for state transfer 3 to end
A: sends COMMIT_VIEW(3) to B, C
A: unblocks transactions
tx_A: resends cc to B
tx_B: rejects cc again, because state transfer hasn't ended on B yet
tx_A: waits for state transfer 3 to end
      but since state transfer 3 has already ended on A, it doesn't actually wait at all
tx_A: resends cc to B
tx_B: rejects cc the 3rd time
B: finally receives the COMMIT_VIEW(3) command and unblocks transactions, but it's too late for cc
{noformat}

My pull request for ISPN-1581 will fix this because it retries commands for much longer (indefinitely for commit commands). The only side effect should be some extra StateTransferInProgressExceptions in the log.
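
To make the race in the trace above concrete, here is a minimal, hypothetical model of it in Java. It is not Infinispan code: the class and method names (CommitRetrySketch, RemoteNode, sendCommit) are invented for illustration, the exception is a local stand-in, and the limit of 3 attempts is an assumption read off the trace, where cc is rejected three times before being given up. The sketch only shows why a bounded retry loses the commit when the sender's wait returns immediately, and why an unbounded retry for commit commands eventually gets through.

```java
/** Hypothetical model of the retry race; none of these names are Infinispan APIs. */
public class CommitRetrySketch {

    /** Local stand-in for the rejection B sends while it still blocks writes. */
    static class StateTransferInProgressException extends RuntimeException {}

    /** Models node B, which rejects commits until it has processed COMMIT_VIEW. */
    static class RemoteNode {
        final int rejectionsBeforeCommitView; // commits that arrive before COMMIT_VIEW
        int received = 0;

        RemoteNode(int rejectionsBeforeCommitView) {
            this.rejectionsBeforeCommitView = rejectionsBeforeCommitView;
        }

        void handleCommit() {
            received++;
            if (received <= rejectionsBeforeCommitView) {
                // B has not seen COMMIT_VIEW yet, so writes are still blocked.
                throw new StateTransferInProgressException();
            }
            // COMMIT_VIEW has been processed: the commit is applied, locks are released.
        }
    }

    /**
     * Models tx_A sending cc to B.
     * maxAttempts < 0 means "retry indefinitely" (the proposed behaviour for commits).
     * Returns true if B applied the commit, false if A gave up (stale lock on B).
     */
    static boolean sendCommit(RemoteNode b, int maxAttempts) {
        for (int attempt = 1; maxAttempts < 0 || attempt <= maxAttempts; attempt++) {
            try {
                b.handleCommit();
                return true;
            } catch (StateTransferInProgressException e) {
                // tx_A waits for state transfer to end *locally*. In the trace it has
                // already ended on A, so the wait returns at once and the retry
                // is sent while B is still blocked.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Bounded retries: B rejects all 3 attempts, the commit is lost.
        System.out.println(sendCommit(new RemoteNode(3), 3));  // prints "false"
        // Unbounded retries: the 4th attempt arrives after COMMIT_VIEW and succeeds.
        System.out.println(sendCommit(new RemoteNode(3), -1)); // prints "true"
    }
}
```

In this model the unbounded loop terminates only because B eventually processes COMMIT_VIEW; each rejected attempt corresponds to one of the extra exceptions that would show up in the log.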
                
> Single view change causes stale locks
> -------------------------------------
>
>                 Key: ISPN-1602
>                 URL: https://issues.jboss.org/browse/ISPN-1602
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Core API
>    Affects Versions: 5.1.0.CR1
>            Reporter: Erik Salter
>            Assignee: Dan Berindei
>            Priority: Critical
>             Fix For: 5.1.0.CR2
>
>         Attachments: erm_tcp.xml, session_udp.xml
>
>
> During load testing of 5.1.0.CR1, we're encountering JGroups 3.x dropping views.  We know from ISPN-1581 that if there are more than 3 view changes, there could be a stale lock on a failed commit.  However, we're seeing stale locks occur on a single view change.
> In the following logs, the affected cluster is the erm-cluster-xxxx
> (We also don't know why JGroups 3.x is unstable.  We suspected FLUSH and incorrect FD settings, but we removed them, and we're still getting dropped messages)
> The trace logs (It isn't long at all before the issue occurs) are at:
> http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht10/server.log.gz
> http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht11/server.log.gz
> http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht12/server.log.gz
> http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht13/server.log.gz
> http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht14/server.log.gz
> http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht15/server.log.gz
