[infinispan-dev] design for cluster events (wiki page)

Bela Ban bban at redhat.com
Fri Nov 1 03:10:57 EDT 2013



On 10/31/13 4:45 PM, Dennis Reed wrote:
> On 10/31/2013 02:18 AM, Bela Ban wrote:
>>
>>> Also, if we did have read-only mode, what criteria would cause those
>>> nodes to become writable again?
>> Once the nodes become part of the primary partition again, e.g. when
>> a view is received where view.size() >= N, with N a predefined
>> threshold. The exact criterion can differ, as long as it is
>> deterministic.
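
To make the criterion concrete: a minimal sketch against the JGroups
3.x API, where QUORUM is a hypothetical, pre-agreed threshold (e.g. 3
out of a 5-node cluster). Every member evaluates the same view against
the same constant, so all members reach the same verdict:

    import org.jgroups.ReceiverAdapter;
    import org.jgroups.View;

    public class PrimaryPartitionListener extends ReceiverAdapter {
        static final int QUORUM = 3; // assumed majority of a 5-node cluster
        private volatile boolean primary;

        @Override
        public void viewAccepted(View view) {
            // Deterministic: depends only on the view, which JGroups
            // delivers identically to every member of the partition.
            primary = view.size() >= QUORUM;
            if (!primary)
                System.out.println("Minority partition: going read-only");
        }

        public boolean isPrimary() { return primary; }
    }
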
>>
>>> There is no guarantee that the other nodes will ever come back up,
>>> or that additional ones will join anytime soon.
>> If a system picks the Primary Partition approach, then it can become
>> inaccessible for writes (read-only). In this case, I envisage that a
>> sysadmin will be notified, who can then start additional nodes so the
>> system re-acquires the primary partition and becomes writable again.
>
> There should be a way to manually modify the primary partition status,
> so that if the admin knows the nodes will never return, they can
> manually enable the partition.
>
> Also, the PartitionContext should know whether the nodes left normally
> or not.
> If you have 5 nodes in a cluster and you shut down 3 of them, you'll
> want the remaining 2 to stay available.
> But if those 3 were lost to a network partition, you wouldn't. So it
> needs to know the difference.

JGroups won't tell you, and I don't want to add a flag to each member of
a view indicating whether it left gracefully or crashed.

However, you (Infinispan) could send a LEAVE message shortly before
leaving, which would be stored by every member (or only by the coord?).
When the next view is received, we should be able to determine who left
gracefully and who crashed.
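
A minimal sketch of that bookkeeping, against the JGroups 3.x API (the
"LEAVE" string payload and the LeaveTracker class are illustrative
assumptions, not an existing JGroups or Infinispan protocol):

    import org.jgroups.*;
    import java.util.*;
    import java.util.concurrent.ConcurrentHashMap;

    public class LeaveTracker extends ReceiverAdapter {
        private final JChannel ch;
        // Members that announced a graceful leave before departing
        private final Set<Address> announced =
            Collections.newSetFromMap(new ConcurrentHashMap<Address, Boolean>());
        private volatile View lastView;

        public LeaveTracker(JChannel ch) { this.ch = ch; }

        // To be called by a member shortly before it leaves gracefully
        public void announceLeave() throws Exception {
            ch.send(new Message(null, "LEAVE")); // null dest = send to all
        }

        @Override
        public void receive(Message msg) {
            if ("LEAVE".equals(msg.getObject()))
                announced.add(msg.getSrc());
        }

        @Override
        public void viewAccepted(View view) {
            if (lastView != null) {
                for (Address member : lastView.getMembers()) {
                    if (!view.containsMember(member)) {
                        // Announced beforehand -> graceful; else a crash
                        boolean graceful = announced.remove(member);
                        System.out.println(member +
                            (graceful ? " left gracefully" : " crashed"));
                    }
                }
            }
            lastView = view;
        }
    }

Note that the LEAVE message can race with the view change, so in
practice the verdict for a member that is gone but hasn't announced
anything might have to be deferred briefly; the sketch only shows the
bookkeeping.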

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)

