On Oct 31, 2013, at 8:34 AM, Radim Vansa <rvansa(a)redhat.com> wrote:
On 10/30/2013 08:46 PM, Mircea Markus wrote:
> On Oct 30, 2013, at 7:28 PM, William Burns <mudokonman(a)gmail.com> wrote:
>
>> Since it seems I can't comment on the wiki itself, I am just replying here.
>>
>> I wonder if the third option 'Primary partition' is desirable. I
>> think availability in some cases would be harmed more than we would
>> like.
>>
>> Let's say you have a 5 node cluster where 3 of the nodes are behind the
>> same router and the remaining 2 are behind a different one. If the
>> router for the 3 fails (crash, power loss, etc.) and they are no longer
>> addressable, you have your 2 partitions (possibly 1 or even 4). When
>> this occurs the other 2 nodes would go into read only mode since they
>> lost the quorum check.
> agreed.
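
For illustration, the 3/2 split above can be worked through with a minimal majority check. This is only a sketch: the class, enum and method names are hypothetical and are not Infinispan API, and it assumes "quorum" means a strict majority of the last stable cluster size.

```java
// Hypothetical sketch of a majority-based quorum check (not Infinispan API).
public class QuorumSketch {

    // Assumed availability states, taken from the discussion above.
    enum Availability { WRITABLE, READ_ONLY }

    // A partition stays writable only if it can reach a strict majority
    // of the nodes that were in the last stable cluster view.
    static Availability decide(int lastStableSize, int reachableNodes) {
        return 2 * reachableNodes > lastStableSize
                ? Availability.WRITABLE
                : Availability.READ_ONLY;
    }

    public static void main(String[] args) {
        // The 5 node example: one side sees 3 nodes, the other sees 2.
        System.out.println("3 of 5: " + decide(5, 3));
        System.out.println("2 of 5: " + decide(5, 2));
    }
}
```

With this rule the 2 node side always ends up read only, even when the 3 node side is unreachable by clients, which is exactly the availability concern raised above.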
>
>> But the 3 nodes that are "writable" can't be
>> accessed any longer and thus no writes can be performed on the
>> cluster. It seems we would still want to allow writes to provide the
>> highest availability possible.
> we don't actually make the decision for the user; users can plug in their own
> PartitionHandlingStrategy to make a wiser decision based on their network specifics.
> The quorum approach written there is just a suggestion, I'll make that clearer.
>
>> Also if we did have read only, what criteria would cause those nodes
>> to be writeable again?
> Changing the availability status is possible through JMX, so either manual
> intervention or some MergeListeners that do that automatically.
You should probably outline the MergeListeners on the wiki as well. I
believe that automatic merge is highly desirable because, for example,
a long GC pause is more likely to cause a partition than a network
failure - and you don't want to require some monkey pushing JMX after
every long GC.
Good point, I've added a section describing that.
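
As a rough illustration of the automatic case, a merge callback could simply re-run the majority check on the merged membership and restore write access without anyone touching JMX. This is a sketch under assumptions: the method name and the use of plain node-name sets are hypothetical, and the actual shape of a MergeListener is not specified in this thread.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of what an automatic merge handler might decide.
public class MergeSketch {

    // Returns true if the merged partitions together hold a strict majority
    // of the last stable view, i.e. writes could be re-enabled automatically.
    static boolean writableAfterMerge(Set<String> lastStableView,
                                      List<Set<String>> mergedPartitions) {
        Set<String> merged = new HashSet<>();
        for (Set<String> partition : mergedPartitions) {
            merged.addAll(partition);
        }
        // Only count nodes that were members of the last stable view.
        merged.retainAll(lastStableView);
        return 2 * merged.size() > lastStableView.size();
    }

    public static void main(String[] args) {
        Set<String> stable = Set.of("A", "B", "C", "D", "E");
        // A node returning after a long GC pause rejoins the other four.
        System.out.println(writableAfterMerge(
                stable, List.of(Set.of("A", "B", "C", "D"), Set.of("E"))));
        // Two minority fragments merging still lack a majority.
        System.out.println(writableAfterMerge(
                stable, List.of(Set.of("A"), Set.of("B"))));
    }
}
```

The point of the sketch is only that the decision can be made from membership alone, so a long GC pause followed by a rejoin would not need manual intervention.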
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)