[infinispan-issues] [JBoss JIRA] (ISPN-4484) Outbound transfers can be cancelled by old CANCEL_STATE_TRANSFER command

Dan Berindei (JIRA) issues at jboss.org
Fri Jul 4 10:43:26 EDT 2014


Dan Berindei created ISPN-4484:
----------------------------------

             Summary: Outbound transfers can be cancelled by old CANCEL_STATE_TRANSFER command
                 Key: ISPN-4484
                 URL: https://issues.jboss.org/browse/ISPN-4484
             Project: Infinispan
          Issue Type: Bug
      Security Level: Public (Everyone can see)
          Components: Core, State Transfer
    Affects Versions: 6.0.2.Final
            Reporter: Dan Berindei
            Assignee: Dan Berindei
            Priority: Critical
             Fix For: 7.0.0.Alpha5


This appeared during the 32-node elasticity test in the Hyperion environment.

Just as apex947 left, it started a rebalance, which apex948 dutifully cancelled as soon as it became the new coordinator. apex949 had already requested segments from apex959, so it sent a StateRequestCommand(CANCEL_STATE_TRANSFER) asynchronously to apex959. Then apex948 started a new rebalance, and apex949 asked apex959 for the same segments again. When apex959 finally received the now-stale cancel request, it did not check the topology id and incorrectly cancelled the new outbound transfer to apex949.

The solution would be to verify the topology id in the CANCEL_STATE_TRANSFER command before cancelling the transfer. I also think we can avoid sending the cancel command altogether in this case, and send it only when we are about to stop.
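
A minimal sketch of the topology id check, for illustration only. It does not use Infinispan's actual classes: OutboundTransferTracker, startTransfer, cancelTransfer and hasTransfer are hypothetical names, and it assumes a cancel request should be honoured only when its topology id equals the id under which the current outbound transfer to that node was started.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; not the real Infinispan state transfer classes. */
final class OutboundTransferTracker {
    /** Destination node -> topology id under which its outbound transfer was started. */
    private final Map<String, Integer> transfers = new ConcurrentHashMap<>();

    /** Registers (or replaces) the outbound transfer to a destination. */
    void startTransfer(String destination, int topologyId) {
        transfers.put(destination, topologyId);
    }

    /**
     * Handles a cancel request. The transfer is removed only if the topology id
     * carried by the request matches the one the transfer was started with, so a
     * late cancel from an earlier rebalance is ignored.
     *
     * @return true if a transfer was actually cancelled
     */
    boolean cancelTransfer(String destination, int cancelTopologyId) {
        // Map.remove(key, value) removes the entry only if it currently maps to value.
        return transfers.remove(destination, cancelTopologyId);
    }

    boolean hasTransfer(String destination) {
        return transfers.containsKey(destination);
    }
}
```

In the scenario above, apex959 would register the transfer to apex949 under the first rebalance's topology id, replace it when apex949 re-requests the segments under the new topology id, and then ignore the delayed cancel because it still carries the old id.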



