[infinispan-issues] [JBoss JIRA] (ISPN-2713) REBALANCE_START and REBALANCE_CONFIRM commands deadlock when RSVP.ack_on_delivery=true

Dan Berindei (JIRA) jira-events at lists.jboss.org
Tue Jan 15 05:45:22 EST 2013


Dan Berindei created ISPN-2713:
----------------------------------

             Summary: REBALANCE_START and REBALANCE_CONFIRM commands deadlock when RSVP.ack_on_delivery=true
                 Key: ISPN-2713
                 URL: https://issues.jboss.org/browse/ISPN-2713
             Project: Infinispan
          Issue Type: Bug
          Components: State transfer
    Affects Versions: 5.2.0.CR1
            Reporter: Dan Berindei
            Assignee: Dan Berindei
             Fix For: 5.3.0.Final


When the coordinator sends a REBALANCE_START command, it holds a lock on the ClusterCacheStatus until it receives the responses from all the other members.

If a node doesn't need to request any new state, it sends the rebalance confirmation to the coordinator on the same thread that received the REBALANCE_START command. The REBALANCE_CONFIRM command also wants to acquire a lock on the ClusterCacheStatus on the coordinator, but because the REBALANCE_CONFIRM command is sent asynchronously, it doesn't deadlock with the thread waiting for REBALANCE_START responses on the coordinator.

At least, that's what happens when {{RSVP.ack_on_delivery=false}} (the Infinispan default). When {{RSVP.ack_on_delivery=true}} (the JGroups default), the "asynchronous" REBALANCE_CONFIRM command becomes synchronous, and it generates a deadlock. The rebalance then fails after the RSVP timeout expires (10 seconds by default).


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


More information about the infinispan-issues mailing list