Re: [infinispan-dev] Distribution, take 2

Monday, 20 July 2009

On 20 Jul 2009, at 16:05, Galder Zamarreno wrote:

...

 On 07/20/2009 12:36 PM, Manik Surtani wrote:
>
> On 20 Jul 2009, at 09:24, Galder Zamarreno wrote:
>
>>
>>
>> On 07/17/2009 08:24 PM, Manik Surtani wrote:
>>> 6.3. Nodes (L - replCount + 1), (L + 1) and (L + replCount) kick off a
>>> LeaveTask.
>>
>> Shouldn't (L - replCount + 1) be (L - 1) ?
>
> Well spotted. That was leftover from an earlier version I was working
> on. It should be (L - 1).
>
>>
>>> 6.3.1. Not everyone in the cluster need be involved in this rehash
>>> thanks to fixed CH positions.
>>>
>>> 7. LeaveTask: This is PUSH based
>>> 7.1. If an existing LeaveTask is running, the existing task should be
>>> cancelled first.
>>
>> Why should it be cancelled? In a cluster of 4 nodes, A:B:C:D, imagine
>> A leaves and, while B is pushing state for A, C leaves as well. Are you
>> gonna cancel B's push?
>
> So in the case of cluster {A, B, C, D, E, F}
>
> Let's take the case of C leaving.
>
> B pushes C's state to D.
>
> This task being cancelled will only happen if A or D were to leave
> (others leaving will not trigger a LeaveTask on B)
>
> So if D leaves, it should be cancelled and restarted since there is no
> point in beaming state to D. Even if A leaves, a rehash should only
> happen once. So a rehash due to A leaving should cover state being
> pushed to D as well.
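
As a rough illustration of the corrected step 6.3 (nodes L - 1, L + 1 and L + replCount kick off a LeaveTask), here is a minimal sketch. It assumes ring positions can be modelled as indices into a list that wraps around, and it uses replCount = 2 purely for the example; the class and method names are hypothetical and not part of Infinispan.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class LeaveTaskParticipants {

    // Hypothetical helper: returns the nodes at ring positions
    // L - 1, L + 1 and L + replCount relative to the leaver L.
    // Assumption: the ring is a list of node names and wraps around.
    static Set<String> participants(List<String> ring, String leaver, int replCount) {
        int l = ring.indexOf(leaver);
        if (l < 0) {
            throw new IllegalArgumentException("unknown node: " + leaver);
        }
        Set<String> result = new LinkedHashSet<>();
        for (int offset : new int[] {-1, 1, replCount}) {
            // floorMod keeps the index non-negative when L is at position 0
            String node = ring.get(Math.floorMod(l + offset, ring.size()));
            if (!node.equals(leaver)) {
                result.add(node);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> ring = Arrays.asList("A", "B", "C", "D", "E", "F");
        // C leaves a six-node cluster with replCount = 2
        System.out.println(participants(ring, "C", 2)); // prints [B, D, E]
    }
}
```

The point of the sketch is only that the set of participants is small and local to the leaver, which is what step 6.3.1 relies on.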

 Hmmmm, good point. It's the case of A leaving while B is pushing C's
 state to D that I hadn't thought through.

 What about this? C is leaving, B pushes C's state to D, so D is
 running a LeaveTask as receiver. Now, E leaves. D needs to start a
 LeaveTask as pusher, shouldn't it? If the data was disjoint, the
 two LeaveTasks could operate in parallel. I suppose this is the
 concurrent leave case you mentioned earlier.

Yes, it may make sense to differentiate between PushLeaveTasks and  
ReceiveLeaveTasks if the data sets are not disjoint (could happen with  
replCount > 2).

Perhaps it only makes sense to cancel the LeaveTask if it pertains to  
an overlapping dataset, or if any LeaveTask intends to push state to a  
node that has just left and triggered a new LeaveTask.  Need to think  
about this a bit.
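
To make the tentative rule above concrete, here is a minimal sketch of the cancellation check: cancel a running LeaveTask only if its push target is the node that just left, or if its key set overlaps with that of the new LeaveTask. The RunningTask type, its fields and shouldCancel are hypothetical names for illustration; representing a task's dataset as a set of keys is also an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class LeaveTaskCancellation {

    // Hypothetical description of a LeaveTask that is already running.
    static final class RunningTask {
        final String target;    // node the state is being pushed to
        final Set<String> keys; // keys this task is moving (assumed representation)

        RunningTask(String target, Set<String> keys) {
            this.target = target;
            this.keys = keys;
        }
    }

    // Cancel only if the push target has just left, or the datasets overlap.
    static boolean shouldCancel(RunningTask running, String newLeaver, Set<String> newTaskKeys) {
        boolean targetHasLeft = running.target.equals(newLeaver);
        boolean datasetsOverlap = !Collections.disjoint(running.keys, newTaskKeys);
        return targetHasLeft || datasetsOverlap;
    }

    public static void main(String[] args) {
        RunningTask pushToD = new RunningTask("D", new HashSet<>(Arrays.asList("k1", "k2")));
        Set<String> unrelated = new HashSet<>(Arrays.asList("k9"));
        System.out.println(shouldCancel(pushToD, "D", unrelated)); // target left
        System.out.println(shouldCancel(pushToD, "E", unrelated)); // disjoint, target alive
    }
}
```

With a check of this shape, disjoint push and receive tasks could keep running in parallel, as suggested for the concurrent leave case.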

...
> While typing this I realised that points 7.4.1 and 8.2 in my original
> email will prevent this from working, since the conditions in those steps
> do not assume that a task could be cancelled. A potential way around this
> is to maintain a "leavers list". Every time a LeaveTask completes, the
> LeaversList is emptied. Every time a LeaveTask starts, it adds the
> leaver's address to the LeaversList. Conditions 7.4.1 and 8.2 consider
> all leavers in the LeaversList, not just the current leaver, L.
>
> Thoughts?

 Hmmm, but could you update the LeaversList while 7.4.1 or 8.2 is going
 on? That task is tied to OLD_CH and the moment there's a new leaver,
 you have a new OLD_CH. Or would you iterate over the list of leavers
 and, for each one, iterate over all keys in the data container? That
 sounds expensive.

Well, the LeaversList would need to contain both the leaver as well as  
the CH at the time (the OLD_CH relating to that leave event).

Re: cost, I don't think so.  I'd flip it the other way around and  
iterate through keys in the data container, and for each key, decide  
whether it needs to be in a state transfer message by looping through  
all leavers in the LeaversList.
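
The loop order described above could be sketched roughly as follows: a single pass over the keys in the data container, with an inner loop over the (leaver, OLD_CH) entries in the LeaversList. All names here are hypothetical. The actual conditions from 7.4.1 and 8.2 are not reproduced; they are passed in as a predicate, and the demo models OLD_CH as a simple key-to-owner map as a stand-in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

class LeaversListScan {

    // One LeaversList entry: the leaver plus the CH in force when it left (OLD_CH).
    static final class LeaveEvent<CH> {
        final String leaver;
        final CH oldCh;

        LeaveEvent(String leaver, CH oldCh) {
            this.leaver = leaver;
            this.oldCh = oldCh;
        }
    }

    // Single pass over the keys; each key is checked against every leave event.
    // needsTransfer stands in for the conditions in 7.4.1 / 8.2.
    static <K, CH> List<K> keysToTransfer(Iterable<K> dataContainerKeys,
                                          List<LeaveEvent<CH>> leaversList,
                                          BiPredicate<K, LeaveEvent<CH>> needsTransfer) {
        List<K> result = new ArrayList<>();
        for (K key : dataContainerKeys) {
            for (LeaveEvent<CH> event : leaversList) {
                if (needsTransfer.test(key, event)) {
                    result.add(key);
                    break; // one matching leaver is enough to include the key
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in OLD_CH: a map from key to owning node at the time of each leave.
        Map<String, String> chWhenCLeft = new HashMap<>();
        chWhenCLeft.put("k1", "C");
        chWhenCLeft.put("k2", "B");
        Map<String, String> chWhenELeft = new HashMap<>();
        chWhenELeft.put("k1", "D");
        chWhenELeft.put("k2", "B");

        List<LeaveEvent<Map<String, String>>> leaversList = new ArrayList<>();
        leaversList.add(new LeaveEvent<>("C", chWhenCLeft));
        leaversList.add(new LeaveEvent<>("E", chWhenELeft));

        // Demo condition (an assumption): the key was owned by the leaver.
        List<String> keys = keysToTransfer(Arrays.asList("k1", "k2"), leaversList,
                (key, event) -> event.leaver.equals(event.oldCh.get(key)));
        System.out.println(keys);
    }
}
```

The cost is one traversal of the data container times the number of pending leavers, and the LeaversList should normally be short since it is emptied whenever a LeaveTask completes.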

Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
