On Nov 1, 2013, at 7:25 AM, Bela Ban <bban(a)redhat.com> wrote:
On 10/31/13 11:20 PM, Sanne Grinovero wrote:
> On 31 October 2013 20:07, Mircea Markus <mmarkus(a)redhat.com> wrote:
>>
>> On Oct 31, 2013, at 3:45 PM, Dennis Reed <dereed(a)redhat.com> wrote:
>>
>>> On 10/31/2013 02:18 AM, Bela Ban wrote:
>>>>
>>>>> Also if we did have read only, what criteria would cause those nodes
>>>>> to be writeable again?
>>>> Once you become the primary partition, e.g. when a view is received
>>>> where view.size() >= N where N is a predefined threshold. Can be
>>>> different, as long as it is deterministic.
>>>>
>>>>> There is no guarantee when the other nodes
>>>>> will ever come back up or if there will ever be additional ones
>>>>> anytime soon.
>>>> If a system picks the Primary Partition approach, then it can become
>>>> completely inaccessible (read-only). In this case, I envisage that a
>>>> sysadmin will be notified, who can then start additional nodes for the
>>>> system to acquire primary partition and become accessible again.
>>>
>>> There should be a way to manually modify the primary partition status.
>>> So if the admin knows the nodes will never return, they can manually
>>> enable the partition.
>>
>> The status will be exposed through JMX at any point, disregarding if there's
>> a split brain going on or not.
>>
>>>
>>> Also, the PartitionContext should know whether the nodes left normally
>>> or not.
>>> If you have 5 nodes in a cluster, and you shut down 3 of them, you'll
>>> want the remaining two to remain available.
>>> But if there was a network partition, you wouldn't. So it needs to know
>>> the difference.
>>
>> very good point again.
>> Thank you Dennis!
>
> Let's clarify. If 3 nodes out of 5 are killed without a
> reconfiguration, you do NOT want the remaining two to remain available
> unless explicitly told so by an admin. It is not possible to
> automatically make a distinction between 3 nodes being shut down vs. 3
> crashed nodes.
We could determine that a node left *gracefully* by sending an RPC
before leaving. But for all other cases, we don't know whether a node
got partitioned away, or whether it crashed.
For the graceful-leave case, we could say that we can go below the
read-only threshold to remain available. This would increase overall
availability a bit.
I think the user should decide that through the configured
PartitionHandlingStrategy. The more info we provide to the user (e.g. whether
it was a clean shutdown or not), the better the decision they can make.
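
To make that a bit more concrete, here is a rough sketch of what such a
strategy could look like. All of the types, fields and method names below are
made up for illustration and are not the actual Infinispan or JGroups API; it
also assumes we can track which members announced their leave before going
away, as Bela suggests.

```java
import java.util.Set;

// Hypothetical sketch only: none of these types exist in Infinispan or JGroups.
final class PartitionSketch {

    enum Availability { AVAILABLE, READ_ONLY }

    /** What a strategy gets to see after a view change (all fields are assumptions). */
    static final class PartitionContext {
        final int expectedSize;             // cluster size before any member left
        final Set<String> members;          // members of the current view
        final Set<String> gracefulLeavers;  // members that announced their leave via RPC

        PartitionContext(int expectedSize, Set<String> members, Set<String> gracefulLeavers) {
            this.expectedSize = expectedSize;
            this.members = members;
            this.gracefulLeavers = gracefulLeavers;
        }
    }

    /** User-provided policy deciding whether this partition stays writable. */
    interface PartitionHandlingStrategy {
        Availability onViewChange(PartitionContext ctx);
    }

    /** Majority rule where nodes that left gracefully do not count against the threshold. */
    static final class GracefulAwareMajority implements PartitionHandlingStrategy {
        @Override
        public Availability onViewChange(PartitionContext ctx) {
            // Members that said goodbye are removed from the expected cluster size.
            int effectiveSize = ctx.expectedSize - ctx.gracefulLeavers.size();
            // Strict majority of whatever is still expected to be around.
            int threshold = effectiveSize / 2 + 1;
            return ctx.members.size() >= threshold
                    ? Availability.AVAILABLE
                    : Availability.READ_ONLY;
        }
    }
}
```

With Dennis' example of 5 nodes: if 3 of them are shut down cleanly, the
effective size drops to 2 and the remaining two stay available. If the same 3
simply disappear from the view, the threshold stays at 3 and the two-node side
goes read-only until an admin steps in.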
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)