[infinispan-issues] [JBoss JIRA] (ISPN-4949) Split brain: inconsistent data after merge

Tuesday, 9 December 2014

    [ https://issues.jboss.org/browse/ISPN-4949?page=com.atlassian.jira.plugin.... ]

Bela Ban commented on ISPN-4949:
--------------------------------

Do you have a timeout when waiting for all ACKs?
E.g., what do you do in the following case?
* View ABCD, and D crashes
* You send a prepare(ABC) to ABC
* C enters heavy GC activity and isn't able to reply before the timeout elapses
** However, C is *not* suspected (the GC cycle is too short for FD, but too long for the prepare RPC)!
* So A gets ACKs from itself and B, but not from C
* A doesn't install the view (at the ISPN level)
* There is no subsequent view from JGroups; the view remains ABC

How is this case handled?
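
To make the timing in this scenario concrete, here is a minimal simulation. It is not Infinispan or JGroups code: the helper `collect_acks`, the `PrepareResult` type and all three timing values are hypothetical, chosen only so that C's pause is longer than the prepare RPC timeout but shorter than the failure-detection timeout.

```python
from dataclasses import dataclass


@dataclass
class PrepareResult:
    acked: set
    missing: set
    installed: bool


def collect_acks(members, reply_delay_ms, rpc_timeout_ms):
    """Simulate a coordinator waiting for prepare ACKs (hypothetical helper)."""
    # A member ACKs in time only if its reply delay fits within the RPC timeout.
    acked = {m for m in members if reply_delay_ms.get(m, 0) <= rpc_timeout_ms}
    missing = set(members) - acked
    # The view is installed at the ISPN level only if every member ACKed.
    return PrepareResult(acked=acked, missing=missing, installed=not missing)


# View ABCD, D crashed, so the prepare goes to A, B and C.
members = ["A", "B", "C"]

# Assumed numbers, for illustration only: C is paused by GC for 12 s,
# the prepare RPC times out after 10 s, FD suspects a member after 20 s.
reply_delay_ms = {"A": 1, "B": 5, "C": 12_000}
rpc_timeout_ms = 10_000
fd_timeout_ms = 20_000

result = collect_acks(members, reply_delay_ms, rpc_timeout_ms)
suspected = {m for m in members if reply_delay_ms[m] > fd_timeout_ms}

print("acked:", sorted(result.acked))
print("missing:", sorted(result.missing))
print("installed:", result.installed)
print("suspected:", sorted(suspected))
```

With these numbers the prepare is not installed, yet nobody is suspected, so no new JGroups view arrives that could trigger another attempt.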

...
 Split brain: inconsistent data after merge
 ------------------------------------------

                 Key: ISPN-4949
                 URL: https://issues.jboss.org/browse/ISPN-4949
             Project: Infinispan
          Issue Type: Bug
          Components: State Transfer
    Affects Versions: 7.0.0.Final
            Reporter: Radim Vansa
            Assignee: Dan Berindei
            Priority: Critical
             Fix For: 7.1.0.Alpha1

 1) Cluster A, B, C, D splits into 2 parts:
    A, B (coord A) finds this out immediately and enters degraded mode with CH [A, B, C, D].
    C, D (coord D) first detects that B is lost, gets view A, C, D and starts a rebalance
    with CH [A, C, D]. Segment X is primary-owned by C (it had a backup on B, but that was lost).
 2) D detects that A was lost as well, and therefore enters degraded mode with CH [A, C, D].
 3) C inserts an entry into X: all owners (only C) are present, therefore the modification
    is allowed.
 4) The cluster is merged and the coordinator finds that the max stable topology has
    CH [A, B, C, D] (it is the older of the two partitions' topologies, obtained from A, B).
    It logs 'No active or unavailable partitions, so all the partitions must be in degraded
    mode' (yes, all partitions are in degraded mode, but a write has happened in the meantime).
 5) The old CH is broadcast in the newest topology; no rebalance happens.
 6) Inconsistency: a read in X may miss the update.
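
The sequence above can be replayed with a small model. This is only a sketch of the reported behaviour, not Infinispan code: the dictionaries, the `write` helper and the key/value names are hypothetical, and the only rule modelled is the one stated in step 3 (a write is allowed when all owners of the segment are present in the partition).

```python
# Owners of segment X under each consistent hash (CH), primary first.
ch_abcd = {"X": ["C", "B"]}  # CH [A, B, C, D]: C primary, backup on B
ch_acd = {"X": ["C"]}        # CH [A, C, D]: the backup on B was lost

# Per-node local data (hypothetical stand-in for each node's data container).
stores = {node: {} for node in "ABCD"}


def write(ch, partition, segment, key, value):
    """Apply a write only if every owner of the segment is in the partition."""
    owners = ch[segment]
    if not all(owner in partition for owner in owners):
        return False  # degraded mode rejects the write
    for owner in owners:
        stores[owner][key] = value
    return True


# Steps 1-2: both partitions are degraded, each with its own CH.
ok_in_ab = write(ch_abcd, {"A", "B"}, "X", "k", "v1")
# Step 3: in partition C, D the only owner of X is C, so the write is allowed.
ok_in_cd = write(ch_acd, {"C", "D"}, "X", "k", "v1")

# Steps 4-5: after the merge the older CH is installed and no rebalance runs,
# so nothing copies the entry from C to B.
merged_ch = ch_abcd

# Step 6: compare what each owner of X under the merged CH would return.
reads = {owner: stores[owner].get("k") for owner in merged_ch["X"]}

print("write in A, B accepted:", ok_in_ab)
print("write in C, D accepted:", ok_in_cd)
print("reads per owner:", reads)
```

In this model C returns the value while B returns nothing, which is the inconsistency described in step 6.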

--
This message was sent by Atlassian JIRA
(v6.3.8#6338)
