[infinispan-dev] design for cluster events (wiki page)

Bela Ban bban at redhat.com
Thu Oct 31 09:25:18 EDT 2013



On 10/31/13 1:23 PM, Mircea Markus wrote:
>
>> On 31 Oct 2013, at 07:18, Bela Ban <bban at redhat.com> wrote:
>>
>>
>>
>>> On 10/30/13 8:28 PM, William Burns wrote:
>>> Since it seems I can't comment on the wiki itself, I am just
>>> replying here.
>>>
>>> I wonder if the third option 'Primary partition' is desirable.
>>> I think availability in some cases would be harmed more than we
>>> would like.
>>>
>>> Let's say you have a 5-node cluster where 3 of the nodes are
>>> behind the same router and the remaining 2 are behind a different
>>> one.  If the router for the 3 fails (crash, power loss, etc.) and
>>> they are no longer addressable, you have your 2 partitions
>>> (possibly 1 or even 4).  When this occurs the other 2 nodes would
>>> go into read-only mode since they lost the quorum check.
>>
>> Yes, this is intended. Actually, the minority partition {D,E} might
>> even become totally inaccessible, i.e. rejecting *all* requests
>> (including reads).
>>
>> This is in line with the Primary Partition approach where a
>> majority partition is allowed to make progress, and all minority
>> partitions shut down. In terms of CAP, we're sacrificing
>> availability here in favor of consistency.
>>
>>> But the 3 nodes that are "writable" can't be accessed any longer
>>> and thus no writes can be performed on the cluster.
>>
>> You mean some clients cannot access {A,B,C}? Sure, then so be it,
>> but at least we don't have any inconsistent state. Again, PP is
>> *one* tool we give to the user to handle partitions.
>>
>>> It seems we would still want to allow writes to provide as high
>>> of availability as possible.
>>
>> PP is *not* about availability, it is about consistency.
>
> I think it's about availability as well, as the primary partition is still available.

Note that with a Primary Partition approach, *no* partition might be the
primary partition and thus availability would be impacted.
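
To make that concrete, here is a minimal sketch, assuming the primary
partition criterion is a strict majority of the original cluster size
(the threshold N discussed further down in this thread could just as
well be configured differently). The class and method names are
hypothetical and are not part of JGroups or Infinispan.

```java
import java.util.List;

public class PrimaryPartitionSketch {

    // Assumed rule: a partition is primary if its view holds a strict
    // majority of the original cluster (N = clusterSize / 2 + 1).
    static boolean isPrimary(int viewSize, int clusterSize) {
        return viewSize >= clusterSize / 2 + 1;
    }

    // Returns the index of the primary partition, or -1 if no partition
    // satisfies the criterion.
    static int findPrimary(List<Integer> partitionSizes, int clusterSize) {
        for (int i = 0; i < partitionSizes.size(); i++) {
            if (isPrimary(partitionSizes.get(i), clusterSize)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // {A,B,C} | {D,E}: the first partition is primary.
        System.out.println(findPrimary(List.of(3, 2), 5));
        // Three-way split of the same cluster: nobody is primary.
        System.out.println(findPrimary(List.of(2, 2, 1), 5));
    }
}
```

With a 3/2 split the 3-node side qualifies; with a 2/2/1 split no side
reaches the threshold, so under this rule the whole cluster stops
making progress until a large enough view is installed again.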

> And about consistency: the fact that PP is available
> doesn't mean it contains all the data in the original cluster (unless
> we only allow PP iff the PP holds at least a reference to any piece of
> data in the original cluster).
>
>> Good for some apps, bad for others. If you pick PP, you lose
>> availability.
>>
>>> Also if we did have read only, what criteria would cause those
>>> nodes to be writeable again?
>>
>> Once you become the primary partition, e.g. when a view is
>> received where view.size() >= N, where N is a predefined threshold.
>> The criterion can be different, as long as it is deterministic.
>>
>>> There is no guarantee when the other nodes will ever come back up
>>> or if there will ever be additional ones anytime soon.
>>
>> If a system picks the Primary Partition approach, then it can
>> become completely inaccessible (read-only). In this case, I
>> envisage that a sysadmin will be notified, who can then start
>> additional nodes for the system to acquire primary partition and
>> become accessible again.
>>
>> -- Bela Ban, JGroups lead (http://www.jgroups.org)
>> _______________________________________________ infinispan-dev
>> mailing list infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)

