RH Bugzilla Integration updated ISPN-4490:
------------------------------------------
Bugzilla Update: Perform
Bugzilla References:
Members can miss the rebalance cancellation on coordinator change
-----------------------------------------------------------------
Key: ISPN-4490
URL: https://issues.jboss.org/browse/ISPN-4490
Project: Infinispan
Issue Type: Bug
Security Level: Public (Everyone can see)
Components: Core, State Transfer
Affects Versions: 7.0.0.Alpha4
Reporter: Dan Berindei
Assignee: Dan Berindei
Fix For: 7.0.0.Alpha5
The new coordinator first sends a CH_UPDATE command to cancel the existing rebalance, and
then a REBALANCE_START command to start a new rebalance. But the CH_UPDATE command is sent
asynchronously, so some members may receive it after the
REBALANCE_START command.
If that happens, the affected node will assume that it will still receive the segments it
requested for the previous rebalance. But with the ISPN-4484 fix, the provider node cancels
its outbound transfer tasks when it receives a CH_UPDATE without a pendingCH, so the state
requestor will never receive its segments.
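The ordering problem described above can be illustrated with a small, single-threaded model. This is only a sketch under simplifying assumptions, not Infinispan code: the class, its fields, and its methods are all hypothetical, the provider is assumed to see the cancellation before the requestor handles either command, and the requestor is assumed to skip re-requesting any segment it believes is already in flight.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of one provider and one requestor; not Infinispan's classes.
public class RebalanceOrderingSketch {
    enum Command { CH_UPDATE, REBALANCE_START }

    // Segments the provider currently has outbound transfer tasks for.
    private final Set<Integer> providerTasks = new TreeSet<>();
    // Segments the requestor believes it has already requested.
    private final Set<Integer> requested = new TreeSet<>();
    // Segments the requestor needs in both the old and the new rebalance.
    private final Set<Integer> needed;

    RebalanceOrderingSketch(Set<Integer> needed) {
        this.needed = new TreeSet<>(needed);
        // Start with the previous rebalance in progress: everything is in flight.
        requested.addAll(needed);
        providerTasks.addAll(needed);
    }

    // Assumption: a CH_UPDATE without a pendingCH cancels all outbound tasks.
    void providerReceivesCancellation() {
        providerTasks.clear();
    }

    void requestorReceives(Command command) {
        switch (command) {
            case CH_UPDATE:
                // Cancellation: forget the requests made for the old rebalance.
                requested.clear();
                break;
            case REBALANCE_START:
                // Only request segments that are not believed to be in flight.
                for (int segment : needed) {
                    if (requested.add(segment)) {
                        providerTasks.add(segment);
                    }
                }
                break;
        }
    }

    // Segments the requestor needs but the provider will never send.
    Set<Integer> stalledSegments() {
        Set<Integer> stalled = new TreeSet<>(needed);
        stalled.removeAll(providerTasks);
        return stalled;
    }

    static Set<Integer> simulate(List<Command> requestorOrder) {
        RebalanceOrderingSketch model =
                new RebalanceOrderingSketch(new TreeSet<>(Arrays.asList(1, 2)));
        model.providerReceivesCancellation();
        for (Command command : requestorOrder) {
            model.requestorReceives(command);
        }
        return model.stalledSegments();
    }

    public static void main(String[] args) {
        System.out.println("in order:     stalled=" + simulate(
                Arrays.asList(Command.CH_UPDATE, Command.REBALANCE_START)));
        System.out.println("out of order: stalled=" + simulate(
                Arrays.asList(Command.REBALANCE_START, Command.CH_UPDATE)));
    }
}
```

In this model, the in-order delivery leaves no stalled segments, while the reordered delivery leaves both segments stalled because the requestor never issues a fresh request after the provider has cancelled its tasks.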
Even without the ISPN-4484 fix this is a problem, although less obvious. Between the
provider node receiving the CH_UPDATE and the REBALANCE_START commands, it won't have
the requestor in its write CH, so the requestor can miss transactions.