[infinispan-issues] [JBoss JIRA] (ISPN-2713) REBALANCE_START and REBALANCE_CONFIRM commands deadlock when RSVP.ack_on_delivery=true

Dan Berindei (JIRA) jira-events at lists.jboss.org
Wed Mar 6 11:35:56 EST 2013


    [ https://issues.jboss.org/browse/ISPN-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759090#comment-12759090 ] 

Dan Berindei commented on ISPN-2713:
------------------------------------

With the fix for ISPN-2825, the thread sending the REBALANCE_START command no longer holds the lock on the ClusterCacheStatus while it waits for responses, so the REBALANCE_CONFIRM command can acquire that lock and proceed.
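
To illustrate the idea of not holding the lock while sending, here is a minimal sketch. It is not the actual ISPN-2825 change: the class UnlockedSendSketch, its methods, and the member names "B" and "C" are hypothetical, and a plain ReentrantLock stands in for the lock on the ClusterCacheStatus. The state is updated under the lock, the lock is released, and only then is the (possibly blocking) send performed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

public class UnlockedSendSketch {
    // Hypothetical stand-in for the lock guarding the ClusterCacheStatus.
    private final ReentrantLock statusLock = new ReentrantLock();
    private final Set<String> pendingConfirmations = new HashSet<>();

    public void startRebalance(List<String> members, Consumer<String> sendRebalanceStart) {
        statusLock.lock();
        try {
            // Only the state update happens under the lock.
            pendingConfirmations.addAll(members);
        } finally {
            statusLock.unlock();
        }
        // The lock is released before sending, so a confirmation handler
        // running on another thread can acquire it while the send blocks.
        for (String member : members) {
            sendRebalanceStart.accept(member);
        }
    }

    public void confirmRebalance(String member) {
        statusLock.lock();
        try {
            pendingConfirmations.remove(member);
        } finally {
            statusLock.unlock();
        }
    }

    public boolean isRebalanceComplete() {
        statusLock.lock();
        try {
            return pendingConfirmations.isEmpty();
        } finally {
            statusLock.unlock();
        }
    }

    // Simulates a send that does not return until the member's confirmation
    // has been processed on a different thread (a "synchronous" confirm).
    public static boolean demo() {
        UnlockedSendSketch status = new UnlockedSendSketch();
        status.startRebalance(Arrays.asList("B", "C"), member ->
                CompletableFuture.runAsync(() -> status.confirmRebalance(member)).join());
        return status.isRebalanceComplete();
    }
}
```

In this sketch, demo() finishes even though each send waits for its confirmation, because the confirming thread never has to wait for a lock held by the sender.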
                
> REBALANCE_START and REBALANCE_CONFIRM commands deadlock when RSVP.ack_on_delivery=true
> --------------------------------------------------------------------------------------
>
>                 Key: ISPN-2713
>                 URL: https://issues.jboss.org/browse/ISPN-2713
>             Project: Infinispan
>          Issue Type: Bug
>          Components: State transfer
>    Affects Versions: 5.2.0.CR1
>            Reporter: Dan Berindei
>            Assignee: Dan Berindei
>             Fix For: 5.3.0.Final
>
>
> When the coordinator sends a REBALANCE_START command, it holds a lock on the ClusterCacheStatus until it receives the responses from all the other members.
> If a node doesn't need to request any new state, it sends the rebalance confirmation to the coordinator on the same thread that received the REBALANCE_START command. The REBALANCE_CONFIRM command also needs to acquire the lock on the ClusterCacheStatus on the coordinator, but because the REBALANCE_CONFIRM command is sent asynchronously, it doesn't deadlock with the coordinator thread that is waiting for the REBALANCE_START responses.
> At least, that's what happens when {{RSVP.ack_on_delivery=false}} (the Infinispan default). When {{RSVP.ack_on_delivery=true}} (the JGroups default), the "asynchronous" REBALANCE_CONFIRM command becomes synchronous: the node blocks until the coordinator has processed the confirmation, and the two commands deadlock. The rebalance then fails after the RSVP timeout expires (10 seconds by default).
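
The interaction described above can be reproduced in miniature with plain java.util.concurrent primitives. This is only a sketch under stated assumptions, not Infinispan or JGroups code: the class RebalanceDeadlockSketch and its parameters are hypothetical, a ReentrantLock stands in for the lock on the ClusterCacheStatus, two single-thread executors stand in for the node's and the coordinator's message-handling threads, and timeoutMillis plays the role of the RSVP timeout.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

public class RebalanceDeadlockSketch {

    /** Returns true if the REBALANCE_START response arrived before the timeout. */
    public static boolean rebalance(boolean ackOnDelivery, long timeoutMillis) {
        ReentrantLock statusLock = new ReentrantLock();
        ExecutorService memberThread = Executors.newSingleThreadExecutor();
        ExecutorService coordinatorHandler = Executors.newSingleThreadExecutor();

        // The coordinator holds the lock for as long as it waits for responses.
        statusLock.lock();
        try {
            Future<?> startResponse = memberThread.submit(() -> {
                // The node has no state to request, so it confirms right away,
                // on the thread that is handling REBALANCE_START.
                Future<?> confirmDelivered = coordinatorHandler.submit(() -> {
                    // REBALANCE_CONFIRM handler: needs the same lock.
                    statusLock.lock();
                    statusLock.unlock();
                });
                if (ackOnDelivery) {
                    // The "asynchronous" confirm now blocks until it has been
                    // processed, so the REBALANCE_START response is delayed.
                    confirmDelivered.get();
                }
                return null;
            });
            startResponse.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the rebalance fails once the timeout expires
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            statusLock.unlock();
            shutdown(memberThread);
            shutdown(coordinatorHandler);
        }
    }

    private static void shutdown(ExecutorService pool) {
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With these assumptions, rebalance(false, 500) is expected to return true, because the node responds without waiting for its confirmation to be processed, while rebalance(true, 200) returns false: the confirm handler cannot take the lock until the coordinator gives up waiting, which mirrors the timeout-induced failure in the report.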
