[infinispan-issues] [JBoss JIRA] (ISPN-2316) Distributed deadlock in StateTransferInterceptor

Radim Vansa (JIRA) jira-events at lists.jboss.org
Wed Sep 19 05:10:35 EDT 2012


    [ https://issues.jboss.org/browse/ISPN-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719765#comment-12719765 ] 

Radim Vansa commented on ISPN-2316:
-----------------------------------

I have updated the description, as I have finally found the lock (adding a thread that periodically writes all threads' stacks to the trace can be pretty useful).
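
A minimal sketch of such a diagnostic thread, for illustration only. It is not the code used for this issue; the class name PeriodicStackDumper, the thread name and the choice of System.err as the output are assumptions. It relies only on the standard Thread.getAllStackTraces() call.

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: periodically writes the stacks of all live threads.
class PeriodicStackDumper {

    /** Formats the name, state and stack of every live thread into one string. */
    static String dumpAllStacks() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /** Starts a daemon thread that writes a dump to System.err every periodMillis. */
    static ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "stack-dumper");
            t.setDaemon(true); // do not keep the JVM alive just for diagnostics
            return t;
        });
        ses.scheduleAtFixedRate(() -> System.err.println(dumpAllStacks()),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return ses;
    }
}
```

Calling PeriodicStackDumper.start(5000) early in a test run would leave a dump every five seconds; in a real setup the output would more likely go to the trace logger than to System.err.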
                
> Distributed deadlock in StateTransferInterceptor
> ------------------------------------------------
>
>                 Key: ISPN-2316
>                 URL: https://issues.jboss.org/browse/ISPN-2316
>             Project: Infinispan
>          Issue Type: Feature Request
>          Components: State transfer, Transactions
>    Affects Versions: 5.2.0.Alpha3
>            Reporter: Radim Vansa
>            Assignee: Mircea Markus
>
> When using transactions, a distributed deadlock may occur when a node is joining under these circumstances:
> 1) the new node requests transactions using GET_TRANSACTIONS
> 2) the old node tries to commit a transaction, broadcasting PrepareCommand - in StateTransferInterceptor it acquires the transactionLock in shared mode
> 3) the GET_TRANSACTIONS request arrives at the old node, where it waits for the transactionLock (which it requires exclusively)
> 4) the transaction commit on the new node waits for the commandsLock (which it requires in shared mode), but that lock is held exclusively by onTopologyUpdate - addTransfer - requestTransactions (= synchronous GET_TRANSACTIONS).
> Also found in some traces, but not required for the deadlock:
> After the transaction commit times out on the old node and releases the lock, the GET_TRANSACTIONS request may continue, but the state transfer itself can also time out if its timeout is not set sufficiently longer.
> The transaction commit continues on the new node after the state transfer times out, until the transaction is found to be invalid (rolled back).
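
The wait cycle in the description can be sketched in a single JVM. This is a simplified illustration, not Infinispan code: the two nodes are collapsed into two threads, the remote calls are replaced by direct lock attempts, and the class and method names (LockCycleSketch, holdThenTry, cycleBlocksBothSides) are hypothetical. Only the lock names and their shared/exclusive modes come from the description. Timed tryLock calls stand in for the RPC timeouts so that the sketch terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical single-JVM model of the cross-node lock cycle.
class LockCycleSketch {

    /** Holds `first`, then tries `second` with a timeout; reports whether `second` was acquired. */
    static boolean holdThenTry(Lock first, Lock second, CountDownLatch bothHold,
                               CountDownLatch bothTried, long timeoutMs) {
        first.lock();
        try {
            bothHold.countDown();
            bothHold.await();               // both sides now hold their first lock
            boolean acquired = second.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
            if (acquired) second.unlock();
            bothTried.countDown();
            bothTried.await();              // keep `first` until both attempts are over
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            first.unlock();
        }
    }

    /** Returns true if neither side could acquire its second lock within the timeout. */
    static boolean cycleBlocksBothSides(long timeoutMs) {
        ReentrantReadWriteLock transactionLock = new ReentrantReadWriteLock(); // on the old node
        ReentrantReadWriteLock commandsLock = new ReentrantReadWriteLock();    // on the new node
        CountDownLatch bothHold = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] acquired = new boolean[2];

        // Commit path: transactionLock (shared), then the prepare needs commandsLock (shared).
        Thread commit = new Thread(() -> acquired[0] = holdThenTry(
                transactionLock.readLock(), commandsLock.readLock(),
                bothHold, bothTried, timeoutMs), "commit");
        // State transfer path: commandsLock (exclusive), then GET_TRANSACTIONS
        // needs transactionLock (exclusive).
        Thread stateTransfer = new Thread(() -> acquired[1] = holdThenTry(
                commandsLock.writeLock(), transactionLock.writeLock(),
                bothHold, bothTried, timeoutMs), "state-transfer");

        commit.start();
        stateTransfer.start();
        try {
            commit.join();
            stateTransfer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !acquired[0] && !acquired[1];
    }
}
```

In this model cycleBlocksBothSides(100) returns true: each thread keeps its first lock until both attempts have finished, so neither second lock can be granted. Without the timeouts the two threads would wait on each other indefinitely, which corresponds to steps 2) to 4) above.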


