[infinispan-dev] design for cluster events (wiki page)
Mircea Markus
mmarkus at redhat.com
Fri Nov 1 06:59:36 EDT 2013
On Nov 1, 2013, at 7:25 AM, Bela Ban <bban at redhat.com> wrote:
> On 10/31/13 11:20 PM, Sanne Grinovero wrote:
>> On 31 October 2013 20:07, Mircea Markus <mmarkus at redhat.com> wrote:
>>>
>>> On Oct 31, 2013, at 3:45 PM, Dennis Reed <dereed at redhat.com> wrote:
>>>
>>>> On 10/31/2013 02:18 AM, Bela Ban wrote:
>>>>>
>>>>>> Also if we did have read-only mode, what criteria would cause those
>>>>>> nodes to be writable again?
>>>>> Once you become the primary partition again, e.g. when a view is
>>>>> received where view.size() >= N, N being a predefined threshold. The
>>>>> exact criterion can differ, as long as it is deterministic.
>>>>>
>>>>>> There is no guarantee the other nodes will ever come back up, or
>>>>>> that additional ones will join anytime soon.
>>>>> If a system picks the Primary Partition approach, then it can become
>>>>> inaccessible for writes (read-only). In this case, I envisage that a
>>>>> sysadmin will be notified and can then start additional nodes so the
>>>>> system re-acquires the primary partition and becomes fully accessible
>>>>> again.
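
To make this concrete, here is a rough sketch of the deterministic
view.size() >= N check described above, on top of a JGroups view
callback (the class name and the threshold value are made up for
illustration; nothing like this exists yet):

   import java.util.concurrent.atomic.AtomicBoolean;
   import org.jgroups.ReceiverAdapter;
   import org.jgroups.View;

   public class PrimaryPartitionGuard extends ReceiverAdapter {

      // predefined threshold N: we consider ourselves the primary
      // partition only when the view has at least this many members
      private static final int PRIMARY_THRESHOLD = 3;

      private final AtomicBoolean writable = new AtomicBoolean(true);

      @Override
      public void viewAccepted(View view) {
         // every member applies the same rule to the same view, so all
         // members of a partition agree on whether they are primary
         writable.set(view.size() >= PRIMARY_THRESHOLD);
      }

      public boolean isWritable() {
         return writable.get();
      }
   }

A write on a node whose guard reports !isWritable() would then be
rejected until a large-enough view is installed again.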
>>>>
>>>> There should be a way to manually modify the primary partition status.
>>>> So if the admin knows the nodes will never return, they can manually
>>>> enable the partition.
>>>
>>> The status will be exposed through JMX at any point, regardless of whether there's a split brain going on or not.
>>>
>>>>
>>>> Also, the PartitionContext should know whether the nodes left normally
>>>> or not.
>>>> If you have 5 nodes in a cluster, and you shut down 3 of them, you'll
>>>> want the remaining two to remain available.
>>>> But if there was a network partition, you wouldn't. So it needs to know
>>>> the difference.
>>>
>>> Very good point again.
>>> Thank you, Dennis!
>>
>> Let's clarify. If 3 nodes out of 5 are killed without a
>> reconfiguration, you do NOT want the remaining two to remain available
>> unless an admin explicitly says so. It is not possible to
>> automatically distinguish between 3 nodes being shut down cleanly and
>> 3 nodes crashing.
>
>
> We could determine that a node left *gracefully* by sending an RPC
> before leaving. But for all other cases, we don't know whether a node
> got partitioned away, or whether it crashed.
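
A rough illustration of how that graceful-leave detection could work
(hypothetical names, just to make the idea concrete): a leaving node
broadcasts a LEAVING message before disconnecting, every member records
the sender, and when the next view arrives the missing members are
classified as graceful or not:

   import java.util.ArrayList;
   import java.util.Collections;
   import java.util.List;
   import java.util.Set;
   import java.util.concurrent.ConcurrentHashMap;
   import org.jgroups.Address;
   import org.jgroups.View;

   public class GracefulLeaveTracker {

      // members that announced a clean shutdown before leaving
      private final Set<Address> announcedLeavers = Collections
            .newSetFromMap(new ConcurrentHashMap<Address, Boolean>());

      private volatile View lastView;

      // invoked when a (hypothetical) LEAVING message arrives from a
      // node that is about to disconnect
      public void onLeaveAnnouncement(Address sender) {
         announcedLeavers.add(sender);
      }

      // invoked from viewAccepted(): did every member that disappeared
      // announce its departure, or did some just vanish (crash/split)?
      public boolean allLeftGracefully(View newView) {
         View previous = lastView;
         lastView = newView;
         if (previous == null)
            return true; // first view: nobody has left yet
         List<Address> gone = new ArrayList<Address>(previous.getMembers());
         gone.removeAll(newView.getMembers());
         boolean graceful = announcedLeavers.containsAll(gone);
         announcedLeavers.removeAll(gone);
         return graceful;
      }
   }

Anything missing from the view without a prior LEAVING message would be
treated as a potential crash or split.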
>
> For the graceful-leave case, we could say that we can go below the
> read-only threshold and still remain available. This would increase
> overall availability a bit.
I think the user should decide that through their configured PartitionHandlingStrategy. The more info we provide to the user (e.g. whether it was a clean shutdown or not), the better the decision they can make.
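
Roughly what I have in mind, as a strawman only (none of these types
exist yet; exact names and signatures are to be decided):

   // strawman, not an existing Infinispan API
   interface PartitionHandlingStrategy {
      // invoked on every topology change; returns the availability
      // mode the cache should switch to
      AvailabilityMode onTopologyChange(PartitionContext ctx);
   }

   interface PartitionContext {
      int currentMembers();     // members in the new view
      int previousMembers();    // members before the change
      boolean leftGracefully(); // did the missing members announce a
                                // clean shutdown (e.g. via the
                                // graceful-leave RPC Bela mentions) or
                                // did they just vanish?
   }

   enum AvailabilityMode {
      AVAILABLE, READ_ONLY, UNAVAILABLE
   }

An admin who knows the missing nodes will never return could still force
AVAILABLE through JMX, which covers Dennis' earlier point about manually
enabling the partition.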
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)