[infinispan-issues] [JBoss JIRA] (ISPN-4949) Split brain: inconsistent data after merge
Dan Berindei (JIRA)
issues at jboss.org
Wed Dec 10 06:54:39 EST 2014
[ https://issues.jboss.org/browse/ISPN-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026627#comment-13026627 ]
Dan Berindei commented on ISPN-4949:
------------------------------------
To clarify, the timeout is just for the prepare RPC. If a node fails to reply in 4 minutes, we keep using the old cache topology as long as nothing else happens (e.g. a JGroups view change, or a node joining/leaving at the cache level).
I wouldn't expect any user to wait more than 1 minute for a node to be suspected, because all read/write operations involving that node are blocked during that time. So 4 minutes should be enough to make sure we always get either a new view or an ACK.
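
As a rough illustration of the behaviour described above, the sketch below bounds only the wait for the prepare ACK and falls back to the old topology when no reply arrives in time. It is a simplified model, not Infinispan's code: the class `PrepareTimeoutSketch`, the method `topologyAfterPrepare` and the integer topology ids are all hypothetical names chosen for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical model of the prepare timeout; not Infinispan's actual classes.
final class PrepareTimeoutSketch {

    /** Returns the topology id that stays in use after waiting for the prepare ACK. */
    static int topologyAfterPrepare(CompletableFuture<Void> ack,
                                    long timeout, TimeUnit unit,
                                    int oldTopologyId, int newTopologyId) {
        try {
            // Only the prepare RPC is bounded by the timeout.
            ack.get(timeout, unit);
            return newTopologyId;
        } catch (TimeoutException e) {
            // No reply in time: keep the old topology until something else
            // happens (a view change, or a node joining/leaving the cache).
            return oldTopologyId;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return oldTopologyId;
        } catch (ExecutionException e) {
            // Assumption for this sketch: a failed RPC is treated like a missing ACK.
            return oldTopologyId;
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Void> silent = new CompletableFuture<>(); // never completes
        CompletableFuture<Void> acked = CompletableFuture.completedFuture(null);
        // A short timeout stands in for the 4 minutes mentioned above.
        System.out.println(topologyAfterPrepare(silent, 50, TimeUnit.MILLISECONDS, 7, 8));
        System.out.println(topologyAfterPrepare(acked, 50, TimeUnit.MILLISECONDS, 7, 8));
    }
}
```

With a node that never replies, the call returns the old id (7 here); with an ACK it returns the new one (8).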
> Split brain: inconsistent data after merge
> ------------------------------------------
>
> Key: ISPN-4949
> URL: https://issues.jboss.org/browse/ISPN-4949
> Project: Infinispan
> Issue Type: Bug
> Components: State Transfer
> Affects Versions: 7.0.0.Final
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.1.0.Alpha1
>
>
> 1) Cluster A, B, C, D splits into 2 parts:
> A, B (coord A) detects this immediately and enters degraded mode with CH [A, B, C, D]
> C, D (coord D) first detects that B is lost, gets view A, C, D and starts a rebalance with CH [A, C, D]. Segment X is primary-owned by C (it had a backup on B, but that was lost)
> 2) D detects that A was lost as well, and therefore enters degraded mode with CH [A, C, D]
> 3) C inserts an entry into X: all owners (only C) are present, therefore the modification is allowed
> 4) The cluster is merged and the coordinator finds that the max stable topology has CH [A, B, C, D] (it is the older of the two partitions' topologies, obtained from A, B). It logs 'No active or unavailable partitions, so all the partitions must be in degraded mode' (yes, all partitions are in degraded mode, but a write has happened in the meantime)
> 5) The old CH is broadcast in the newest topology; no rebalance happens
> 6) Inconsistency: a read in X may miss the update
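
The steps in the description can be replayed with a toy model. The sketch below is not Infinispan code: `SplitBrainMergeSketch`, its per-node maps and the key `k` are hypothetical, and it assumes that under the old CH segment X is owned by C (primary) and B (backup), as the description suggests. It only applies the degraded-mode rule from step 3 (a write is allowed when every owner is in the partition) and then reads from each old owner without any state transfer, as in step 5.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of the scenario; not Infinispan's actual classes.
final class SplitBrainMergeSketch {
    // Local data container of each node: node name -> (key -> value).
    private final Map<String, Map<String, String>> stores = new HashMap<>();

    SplitBrainMergeSketch(List<String> nodes) {
        for (String node : nodes) {
            stores.put(node, new HashMap<>());
        }
    }

    /** Degraded-mode rule: the write is applied only if all owners are in the partition. */
    boolean write(List<String> owners, Set<String> partition, String key, String value) {
        if (!partition.containsAll(owners)) {
            return false; // some owner is unreachable, so the write is rejected
        }
        for (String owner : owners) {
            stores.get(owner).put(key, value);
        }
        return true;
    }

    /** Read served from one node's local data, with no state transfer in between. */
    String readOn(String node, String key) {
        return stores.get(node).get(key);
    }

    public static void main(String[] args) {
        SplitBrainMergeSketch cluster =
                new SplitBrainMergeSketch(Arrays.asList("A", "B", "C", "D"));
        List<String> ownersOldCh = Arrays.asList("C", "B"); // assumed owners of X in CH [A, B, C, D]
        List<String> ownersSplitCh = Arrays.asList("C");    // owners of X in CH [A, C, D]
        Set<String> partitionCD = new HashSet<>(Arrays.asList("C", "D"));

        // Step 3: only C owns X in the C, D partition, so the write goes through.
        cluster.write(ownersSplitCh, partitionCD, "k", "v1");

        // Steps 4-5: the old CH is restored and nothing is transferred,
        // so each old owner still answers from its own local data.
        for (String owner : ownersOldCh) {
            System.out.println(owner + " -> " + cluster.readOn(owner, "k"));
        }
    }
}
```

In this model C returns the value while B returns null, which is the kind of missed update step 6 refers to.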