[infinispan-dev] ISPN-263 and handling partitions

Manik Surtani msurtani at redhat.com
Tue Apr 16 15:05:05 EDT 2013


Guys - I've started documenting this here [1] and will put together a prototype this week.

One question though, perhaps one for Dan/Adrian - is there any special handling for state transfer if a MergeView is detected?

- M

[1] https://community.jboss.org/wiki/DesignDealingWithNetworkPartitions

On 6 Apr 2013, at 04:26, Bela Ban <bban at redhat.com> wrote:

> 
> 
> On 4/5/13 3:53 PM, Manik Surtani wrote:
>> Guys,
>> 
>> So this is what I have in mind for this, looking for opinions.
>> 
>> 1.  We write a SplitBrainListener which is registered when the
>> channel connects.  The aim of this listener is to identify when we
>> have a partition.  This can be identified when a view change is
>> detected, and the new view is significantly smaller than the old
>> view.  This is easier to detect for large clusters; for smaller
>> clusters it is harder to decide between a node leaving and a
>> partition.  (Any better ideas here?)
>> 
>> 2.  The SBL flips a switch in an interceptor
>> (SplitBrainHandlerInterceptor?) which switches the node to be
>> read-only (reject invocations that change the state of the local
>> node) if it is in the smaller partition (newView.size < oldView.size
>> / 2).  This only works reliably for odd-numbered cluster sizes, and
>> the issues with small clusters seen in (1) apply here as well.
>> 
>> 3.  The SBL can flip the switch in the interceptor back to normal
>> operation once a MergeView is detected.
>> 
>> It's nowhere near perfect, but at least it means that we can
>> recommend enabling this and setting up an odd number of nodes, with
>> a cluster size of at least N, if you want to reduce inconsistency in
>> your grid during partitions.
>> 
>> Is this even useful?
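
To make steps 1-3 of the proposal above concrete, here is a minimal sketch of the switch the SBL would flip. It is plain Java with no JGroups or Infinispan types: the class SplitBrainSwitch and its methods are hypothetical names, and view sizes are passed in as ints on the assumption that the real listener would take them from the old and new views.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the proposed listener/interceptor pair; not an existing API. */
final class SplitBrainSwitch {
    private final AtomicBoolean readOnly = new AtomicBoolean(false);
    private int lastViewSize;

    SplitBrainSwitch(int initialViewSize) {
        this.lastViewSize = initialViewSize;
    }

    /** Ordinary view change: go read-only if fewer than half of the old members remain. */
    synchronized void onViewChange(int newViewSize) {
        // Step 2: newView.size < oldView.size / 2, written as 2 * new < old
        // so that integer division does not round the threshold down.
        if (2 * newViewSize < lastViewSize) {
            readOnly.set(true);
        }
        lastViewSize = newViewSize;
    }

    /** Merge view received: step 3, switch back to normal operation. */
    synchronized void onMergeView(int mergedViewSize) {
        readOnly.set(false);
        lastViewSize = mergedViewSize;
    }

    /** The interceptor would call this before any state-changing invocation. */
    void checkWriteAllowed() {
        if (readOnly.get()) {
            throw new IllegalStateException("Node is in the smaller partition; writes are rejected");
        }
    }

    boolean isReadOnly() {
        return readOnly.get();
    }
}
```

With 5 nodes, a drop to 2 flips the switch (4 < 5), while a drop to 3 does not. With 6 nodes splitting 3/3, neither side goes read-only (6 < 6 is false), which is the even-cluster weakness noted in step 2.
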
> 
> 
> So I assume this is to shut down (or make read-only) non-primary
> partitions. I'd go with an approach similar to [1] section 5.6.2, which 
> makes a partition read-only once it drops below a certain number of nodes N.
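
The threshold rule in the reply above could look roughly like the following. This is only a sketch: MinimumSizePolicy and minNodes are hypothetical names (minNodes plays the role of N), and the majorityOf helper assumes the intended cluster size is known and fixed in advance.

```java
/** Hypothetical helper illustrating a fixed minimum-size rule; not a JGroups class. */
final class MinimumSizePolicy {
    private final int minNodes;

    MinimumSizePolicy(int minNodes) {
        if (minNodes < 1) {
            throw new IllegalArgumentException("minNodes must be >= 1");
        }
        this.minNodes = minNodes;
    }

    /** A partition whose view has fewer than minNodes members should be read-only. */
    boolean shouldBeReadOnly(int viewSize) {
        return viewSize < minNodes;
    }

    /** N = floor(clusterSize / 2) + 1: two disjoint partitions cannot both reach it. */
    static MinimumSizePolicy majorityOf(int clusterSize) {
        return new MinimumSizePolicy(clusterSize / 2 + 1);
    }
}
```

Unlike a comparison against the previous view, a fixed N also handles an even split: with 6 nodes and N = 4, both halves of a 3/3 split become read-only.
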
> 
> 
>> Bela, is there a more reliable mechanism to detect a split in (1)?
> 
> I'm afraid not. We never know whether a large number of members being 
> removed from the view means that they left, or that we have a partition, 
> e.g. because a switch crashed.
> 
> One thing you could do though is for members who are about to leave 
> gracefully to broadcast a LEAVE message, so that when the view is 
> received, the SBL knows about those members and might be better able 
> to determine whether we have a partition or not.
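
A rough sketch of how the SBL might use such LEAVE messages follows. Everything here is an assumption for illustration: LeaveTracker is a hypothetical class, members are identified by plain strings, and the delivery of the LEAVE broadcast itself is left out.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical bookkeeping for announced departures; not part of JGroups. */
final class LeaveTracker {
    private final Set<String> announcedLeavers = new HashSet<>();

    /** Record a member that broadcast LEAVE before going away. */
    synchronized void onLeaveMessage(String member) {
        announcedLeavers.add(member);
    }

    /**
     * Returns the members missing from newView that never announced a LEAVE.
     * A non-empty result means a crash or a partition; it cannot tell which.
     */
    synchronized Set<String> unexpectedDepartures(Set<String> oldView, Set<String> newView) {
        Set<String> missing = new HashSet<>(oldView);
        missing.removeAll(newView);

        Set<String> unexpected = new HashSet<>(missing);
        unexpected.removeAll(announcedLeavers);

        // Forget announcements for members that have now actually gone.
        announcedLeavers.removeAll(missing);
        return unexpected;
    }
}
```

The size checks would then be applied only to the unexpected departures, so that a planned scale-down is not mistaken for a split. A crashed switch and several simultaneous node crashes still look the same from here.
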
> 
> [1] http://www.jgroups.org/manual-3.x/html/user-advanced.html, section 5.6.2
> 
> -- 
> Bela Ban, JGroups lead (http://www.jgroups.org)

--
Manik Surtani
manik at jboss.org
twitter.com/maniksurtani

Platform Architect, JBoss Data Grid
http://red.ht/data-grid



