[infinispan-issues] [JBoss JIRA] (ISPN-4949) Split brain: inconsistent data after merge

Mon Nov 10 07:49:30 EST 2014

    [ https://issues.jboss.org/browse/ISPN-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018601#comment-13018601 ] 

Bela Ban commented on ISPN-4949:
--------------------------------

Re consensus-based view installation: JGroups doesn't use consensus for view installation because it's simply faster without it; consensus adds round trips, each costing a network latency and bounded by a timeout. For large clusters, this won't scale. Imagine you need to install a new view on 500 nodes. This would mean invoking ~500 RPCs, waiting for all responses (or a timeout), and then doing a second RPC committing the proposed view. Additional logic would be needed to handle dangling prepares and commits, i.e. when the leader crashes after the PREPARE or COMMIT phase, plus possible vote collection when a new coord takes over.
I'll look into this though, so I created [1] as a result.

[1] JGRP-1901
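
To make the cost argument above concrete, here is a minimal sketch of a two-phase (PREPARE/COMMIT) view installation. It is not JGroups code: the Member, Result and install names are hypothetical, a missing response stands in for a timeout, and it assumes the commit is sent to each member individually. Leader-crash recovery (dangling PREPARE/COMMIT) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseViewSketch {
    /** Hypothetical cluster member; not a JGroups class. */
    static final class Member {
        final String name;
        final boolean alive;
        String proposedView;
        String installedView;

        Member(String name, boolean alive) {
            this.name = name;
            this.alive = alive;
        }

        /** PREPARE: remember the proposal and ack. A dead member never acks. */
        boolean prepare(String view) {
            if (!alive) return false;
            proposedView = view;
            return true;
        }

        /** COMMIT: install the view only if it matches the prepared proposal. */
        void commit(String view) {
            if (alive && view.equals(proposedView)) installedView = view;
        }
    }

    static final class Result {
        final int rpcs;
        final boolean committed;

        Result(int rpcs, boolean committed) {
            this.rpcs = rpcs;
            this.committed = committed;
        }
    }

    /** Runs both phases and counts the RPCs the coordinator has to send. */
    static Result install(List<Member> members, String view) {
        int rpcs = 0;
        boolean allAcked = true;
        for (Member m : members) {          // phase 1: one PREPARE per member
            rpcs++;
            if (!m.prepare(view)) allAcked = false;  // stands in for a timeout
        }
        if (!allAcked) return new Result(rpcs, false);
        for (Member m : members) {          // phase 2: one COMMIT per member (assumption)
            rpcs++;
            m.commit(view);
        }
        return new Result(rpcs, true);
    }

    public static void main(String[] args) {
        List<Member> members = new ArrayList<>();
        for (int i = 0; i < 500; i++) members.add(new Member("n" + i, true));
        Result r = install(members, "view-2");
        System.out.println(r.rpcs + " RPCs, committed=" + r.committed);
    }
}
```

In this toy model a single unresponsive member blocks the commit, which is where the timeout cost in the comment comes from.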

> Split brain: inconsistent data after merge
> ------------------------------------------
>
>                 Key: ISPN-4949
>                 URL: https://issues.jboss.org/browse/ISPN-4949
>             Project: Infinispan
>          Issue Type: Bug
>          Components: State Transfer
>    Affects Versions: 7.0.0.Final
>            Reporter: Radim Vansa
>            Priority: Critical
>
> 1) cluster A, B, C, D splits into 2 parts:
> A, B (coord A) finds this out immediately and enters degraded mode with CH [A, B, C, D]
> C, D (coord D) first detects that B is lost, gets view A, C, D and starts rebalance with CH [A, C, D]. Segment X is primary owned by C (it had backup on B but this got lost)
> 2) D detects that A was lost as well, therefore enters degraded mode with CH [A, C, D]
> 3) C inserts entry into X: all owners (only C) are present, therefore the modification is allowed
> 4) cluster is merged and coordinator finds out that the max stable topology has CH [A, B, C, D] (it is the older of the two partitions' topologies, got from A, B) - logs 'No active or unavailable partitions, so all the partitions must be in degraded mode' (yes, all partitions are in degraded mode, but write has happened in the meantime)
> 5) The old CH is broadcast in the newest topology, no rebalance happens
> 6) Inconsistency: read in X may miss the update
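
The sequence in steps 1-6 can be replayed with a toy model. This is only a sketch, not Infinispan code: the class and method names are hypothetical, the owner lists for segment X are taken from the report (C with a backup on B under CH [A, B, C, D]; only C under CH [A, C, D]), and it assumes a read may be served locally by any owner.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SplitBrainSketch {
    // Owners of segment X under each consistent hash, as described in the report.
    static final List<String> OWNERS_OLD_CH = Arrays.asList("C", "B"); // CH [A, B, C, D]
    static final List<String> OWNERS_NEW_CH = Arrays.asList("C");      // CH [A, C, D]

    /** Degraded-mode rule from step 3: a write is allowed only if all owners are present. */
    static boolean writeAllowed(List<String> owners, Set<String> partition) {
        return partition.containsAll(owners);
    }

    /** Replays the scenario and returns what each old-CH owner of X holds after the merge. */
    static Map<String, String> simulate() {
        Map<String, Map<String, String>> stores = new HashMap<>();
        for (String node : Arrays.asList("A", "B", "C", "D")) {
            stores.put(node, new HashMap<>());
        }
        Set<String> partitionCD = new HashSet<>(Arrays.asList("C", "D"));

        // Step 3: partition {C, D} runs with CH [A, C, D], so the write goes through.
        if (writeAllowed(OWNERS_NEW_CH, partitionCD)) {
            for (String owner : OWNERS_NEW_CH) {
                stores.get(owner).put("k", "v1");
            }
        }

        // Steps 4-5: the merge reinstates CH [A, B, C, D] without a rebalance,
        // so no state is transferred to B.
        Map<String, String> readsAfterMerge = new LinkedHashMap<>();
        for (String owner : OWNERS_OLD_CH) {
            readsAfterMerge.put(owner, stores.get(owner).get("k"));
        }
        return readsAfterMerge;
    }

    public static void main(String[] args) {
        Set<String> partitionAB = new HashSet<>(Arrays.asList("A", "B"));
        System.out.println("write allowed in {A, B}: " + writeAllowed(OWNERS_OLD_CH, partitionAB));
        // Step 6: a read served by B does not see the value written on C.
        System.out.println("reads after merge: " + simulate());
    }
}
```

In this model the write is rejected in partition {A, B} but accepted in {C, D}, and after the merge B is again an owner of X without ever having received the entry.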
