[ https://issues.jboss.org/browse/ISPN-5021?page=com.atlassian.jira.plugin.... ]
Radim Vansa updated ISPN-5021:
------------------------------
Description:
Copied from [ISPN-4444|https://issues.jboss.org/browse/ISPN-4444?focusedCommentId=1302...]
If the CH_UPDATE command is delayed on the old owner, the new owners might update the key
without the old owner knowing, and a locality check on the old owner won't help.
One thing that struck me when reading about the Raft algorithm was that it installs
configuration changes symmetrically, in 3 phases. We might need to do the same for our
rebalance:
1. T0: read_ch=old, write_ch=old
2. start a rebalance
3. T1: read_ch=old, write_ch=old+new
4. new owners have all the data
5. T2: read_ch=new, write_ch=old+new
6. remove old cache entries and ignore further writes
7. T3: read_ch=new, write_ch=new
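
The sequence above can be modelled in a few lines. This is only a sketch, not Infinispan code: `RebalanceSketch`, `Topology`, `phases` and `readersAlwaysSeeWrites` are hypothetical names, and it assumes a node lags at most one topology behind its peers. The check is that, for any two adjacent topologies, every node that may serve a read in either of them also receives writes in both.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the topologies T0..T3; hypothetical, not Infinispan API. */
final class RebalanceSketch {

    /** One installed topology: the owners used for reads and for writes. */
    static final class Topology {
        final String name;
        final Set<String> readOwners;
        final Set<String> writeOwners;

        Topology(String name, Set<String> readOwners, Set<String> writeOwners) {
            this.name = name;
            this.readOwners = readOwners;
            this.writeOwners = writeOwners;
        }
    }

    static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> u = new LinkedHashSet<>(a);
        u.addAll(b);
        return u;
    }

    /** Builds T0..T3 exactly as listed in the description. */
    static List<Topology> phases(Set<String> oldOwners, Set<String> newOwners) {
        Set<String> both = union(oldOwners, newOwners);
        return List.of(
            new Topology("T0", oldOwners, oldOwners),   // before the rebalance
            new Topology("T1", oldOwners, both),        // rebalance started
            new Topology("T2", newOwners, both),        // new owners have all the data
            new Topology("T3", newOwners, newOwners));  // old entries removed
    }

    /**
     * Assumption: nodes are at most one topology apart. Under that assumption,
     * a read is safe if every possible reader in two adjacent topologies is a
     * write owner in both of them.
     */
    static boolean readersAlwaysSeeWrites(List<Topology> sequence) {
        for (int i = 0; i + 1 < sequence.size(); i++) {
            Topology a = sequence.get(i);
            Topology b = sequence.get(i + 1);
            Set<String> readers = union(a.readOwners, b.readOwners);
            if (!a.writeOwners.containsAll(readers)
                    || !b.writeOwners.containsAll(readers)) {
                return false;
            }
        }
        return true;
    }
}
```

With old owners {A, B} and new owners {B, C}, the check holds for T0, T1, T2, T3. If T2 is skipped (T1 followed directly by T3), it fails: A can still serve reads under T1 while writes under T3 no longer reach it, which is the delayed CH_UPDATE case described above.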
was:
Copied from [ISPN-4444|https://issues.jboss.org/browse/ISPN-4444?focusedCommentId=1302...]
If the CH_UPDATE command is delayed on the old owner, the new owners might update the key
without the old owner knowing, and a locality check on the old owner won't help.
One thing that struck me when reading about the Raft algorithm was that it installs
configuration changes symmetrically, in 3 phases. We might need to do the same for our
rebalance: start a rebalance with read_ch=old, write_ch=old+new, when the new owners have
all the data install read_ch=new, write_ch=old+new, and finally read_ch=new, write_ch=new.
Old cache entries are removed during the 2nd topology update, and further writes should be
ignored, in order for this to work.
Nodes that finish the rebalance later can see outdated values
-------------------------------------------------------------
Key: ISPN-5021
URL: https://issues.jboss.org/browse/ISPN-5021
Project: Infinispan
Issue Type: Bug
Components: Core, State Transfer
Affects Versions: 7.0.2.Final
Reporter: Dan Berindei
Assignee: Pedro Ruivo
Priority: Critical