[infinispan-issues] [JBoss JIRA] (ISPN-4949) Split brain: inconsistent data after merge

Sat Nov 15 02:49:29 EST 2014

    [ https://issues.jboss.org/browse/ISPN-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020304#comment-13020304 ] 

Bela Ban commented on ISPN-4949:
--------------------------------

Here's a possible algorithm similar to the one Dan proposed above that might work for what you want to do in Infinispan:
* View is V5=A,B,C,D
* C and D partition away (or crash)
* A gets view V6=A,B,C
* A creates an AckCollection(A,B,C) for V6 and sends a PREPARE_REBALANCE(V6) to A,B,C
* A receives ACK(V6) from A and B and removes them from the AckCollection
* Since the AckCollection is not empty (C is still in there):
** A doesn't do anything
* When ACK(V6) from C is received: send COMMIT_REBALANCE(V6) to A,B,C
* When a new view V7=A,B is received:
** A sends a PREPARE_REBALANCE(V7) to A,B
** This causes all previously pending PREPARE_REBALANCE requests to be removed
** Continue as above
* While waiting for all ACKs for a given PREPARE_REBALANCE, the RPC timeouts (if
  sync) are reduced, so RPCs return sooner and don't block on crashed members or
  members on the other side of a partition
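
To make the steps above concrete, here is a rough sketch of the coordinator side in Java. All names (RebalanceCoordinator, onView, onAck, the timeout constants and their values) are made up for illustration and are not Infinispan or JGroups APIs; messages are only recorded in a list rather than sent over the network, and a plain Set stands in for the AckCollection.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical coordinator-side sketch; none of these names are real Infinispan/JGroups APIs. */
class RebalanceCoordinator {
    // Assumed timeout values, chosen arbitrarily for the example.
    static final long NORMAL_TIMEOUT_MS = 15_000;
    static final long REDUCED_TIMEOUT_MS = 1_000;

    /** Messages the coordinator would send; recorded here instead of going over the wire. */
    final List<String> sent = new ArrayList<>();

    private int pendingViewId = -1;                          // view of the pending PREPARE, -1 if none
    private List<String> pendingMembers = List.of();         // members of that view
    private final Set<String> missingAcks = new HashSet<>(); // plays the role of the AckCollection

    /** New view received: drop any pending PREPARE_REBALANCE and start a new round. */
    void onView(int viewId, List<String> members) {
        missingAcks.clear();
        pendingViewId = viewId;
        pendingMembers = List.copyOf(members);
        missingAcks.addAll(members);
        sent.add("PREPARE_REBALANCE(V" + viewId + ") -> " + pendingMembers);
    }

    /** ACK received: commit only once every member of the pending view has acked. */
    void onAck(int viewId, String sender) {
        if (viewId != pendingViewId) return;     // stale ACK for a superseded view
        if (!missingAcks.remove(sender)) return; // duplicate ACK or unknown sender
        if (missingAcks.isEmpty()) {
            sent.add("COMMIT_REBALANCE(V" + viewId + ") -> " + pendingMembers);
            pendingViewId = -1;
        }
        // Otherwise do nothing and keep waiting for the remaining ACKs.
    }

    /** True while a PREPARE_REBALANCE is still waiting for ACKs. */
    boolean isWaiting() {
        return pendingViewId != -1;
    }

    /** Sync RPCs use the shorter timeout while ACKs are still outstanding. */
    long rpcTimeoutMillis() {
        return isWaiting() ? REDUCED_TIMEOUT_MS : NORMAL_TIMEOUT_MS;
    }
}
```

Replaying the scenario: after onView(6, [A, B, C]) and ACKs from A and B, the coordinator is still waiting on C and uses the reduced timeout. onView(7, [A, B]) discards the V6 round, a late ACK(V6) from C is ignored, and ACKs from A and B for V7 then produce COMMIT_REBALANCE(V7).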


> Split brain: inconsistent data after merge
> ------------------------------------------
>
>                 Key: ISPN-4949
>                 URL: https://issues.jboss.org/browse/ISPN-4949
>             Project: Infinispan
>          Issue Type: Bug
>          Components: State Transfer
>    Affects Versions: 7.0.0.Final
>            Reporter: Radim Vansa
>            Priority: Critical
>
> 1) cluster A, B, C, D splits into 2 parts:
> A, B (coord A) finds this out immediately and enters degraded mode with CH [A, B, C, D]
> C, D (coord D) first detects that B is lost, gets view A, C, D and starts rebalance with CH [A, C, D]. Segment X is primary owned by C (it had backup on B but this got lost)
> 2) D detects that A was lost as well, therefore enters degraded mode with CH [A, C, D]
> 3) C inserts an entry into X: all owners (only C) are present, therefore the modification is allowed
> 4) cluster is merged and coordinator finds out that the max stable topology has CH [A, B, C, D] (it is the older of the two partitions' topologies, got from A, B) - logs 'No active or unavailable partitions, so all the partitions must be in degraded mode' (yes, all partitions are in degraded mode, but write has happened in the meantime)
> 5) The old CH is broadcast in newest topology, no rebalance happens
> 6) Inconsistency: read in X may miss the update

