[infinispan-dev] ISPN-263 and handling partitions

Bela Ban bban at redhat.com
Sat Apr 6 03:26:29 EDT 2013



On 4/5/13 3:53 PM, Manik Surtani wrote:
> Guys,
>
> So this is what I have in mind for this, looking for opinions.
>
> 1.  We write a SplitBrainListener which is registered when the
> channel connects.  The aim of this listener is to identify when we
> have a partition.  This can be identified when a view change is
> detected, and the new view is significantly smaller than the old
> view.  Easier to detect for large clusters, but smaller clusters will
> be harder - trying to decide between a node leaving vs a partition.
> (Any better ideas here?)
>
> 2.  The SBL flips a switch in an interceptor
> (SplitBrainHandlerInterceptor?) which switches the node to be
> read-only (reject invocations that change the state of the local
> node) if it is in the smaller partition (newView.size < oldView.size
> / 2).  Only works reliably for odd-numbered cluster sizes, and the
> issues with small clusters seen in (1) will affect here as well.
>
> 3.  The SBL can flip the switch in the interceptor back to normal
> operation once a MergeView is detected.
>
> It's nowhere near perfect, but at least it means that we can recommend
> enabling this and setting up an odd number of nodes, with a cluster
> size of at least N if you want to reduce inconsistency in your grid
> during partitions.
>
> Is this even useful?


So I assume this is to shut down (or make read-only) non-primary
partitions. I'd go with an approach similar to [1] section 5.6.2, which
makes a partition read-only once it drops below a certain number of nodes N.
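
To make the threshold idea concrete, here is a minimal sketch in plain
Java. It is not the code from [1]: the class name QuorumGate and its
methods are made up for illustration, and members are plain strings
rather than JGroups addresses.

```java
import java.util.List;

// Hypothetical sketch; none of these names come from Infinispan or JGroups.
final class QuorumGate {
    private final int minMembers;       // the threshold N
    private volatile boolean readOnly;  // false until a view says otherwise

    QuorumGate(int minMembers) {
        if (minMembers < 1) {
            throw new IllegalArgumentException("minMembers must be >= 1");
        }
        this.minMembers = minMembers;
    }

    // Call with the member list of every new view, merge views included.
    void onViewChange(List<String> members) {
        readOnly = members.size() < minMembers;
    }

    boolean isReadOnly() {
        return readOnly;
    }

    // An interceptor would call this before any state-changing invocation.
    void checkWritable() {
        if (readOnly) {
            throw new IllegalStateException(
                "partition below " + minMembers + " members; node is read-only");
        }
    }
}
```

If N is a strict majority of the expected cluster size (say 3 out of 5),
at most one partition can stay writable. Because the gate only looks at
the size of the current view, a merge view that brings the cluster back
to N or more members re-enables writes without any extra logic.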


> Bela, is there a more reliable mechanism to detect a split in (1)?

I'm afraid not. We can never know whether a large number of members being
removed from the view means that they left, or that we have a partition,
e.g. because a switch crashed.

One thing you could do, though, is have members that are about to leave
gracefully broadcast a LEAVE message, so that when the new view is
received, the SBL knows which members left on purpose and might be better
able to determine whether we have a partition or not.
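
A rough sketch of that bookkeeping, again in plain Java with made-up
names (LeaveTracker, onLeaveMessage, unexplainedDepartures) and strings
standing in for member addresses. How the LEAVE message is actually sent
and received is left out and would be up to the SBL:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not an existing Infinispan or JGroups class.
final class LeaveTracker {
    // Members that announced a graceful leave and have not yet left the view.
    private final Set<String> announced = ConcurrentHashMap.newKeySet();

    // Call when a LEAVE broadcast from 'member' is received.
    void onLeaveMessage(String member) {
        announced.add(member);
    }

    // Returns the members that disappeared between the two views
    // without announcing it first.
    Set<String> unexplainedDepartures(List<String> oldView, List<String> newView) {
        Set<String> gone = new HashSet<>(oldView);
        gone.removeAll(newView);

        Set<String> unexplained = new HashSet<>(gone);
        unexplained.removeAll(announced);

        // Forget announcements that this view change has consumed.
        announced.removeAll(gone);
        return unexplained;
    }
}
```

The SBL could then base its decision on the size of the unexplained set
rather than on the raw shrinkage of the view. This is still only a
heuristic: a member that crashes never gets to announce anything, so a
crash and a partition look the same here.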

[1] http://www.jgroups.org/manual-3.x/html/user-advanced.html, section 5.6.2

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)

