I see.
IMO we should first offer an complete (i.e. even during rehash) strongly consistent
approach.
On 3 Mar 2011, at 14:05, Manik Surtani wrote:
A GET may produce incorrect results if a node not involved in a
rehash, Node X, asks for a key. E.g., it may ask Node C for the entry since Node C is a
new owner. However Node C may not have completed applying state received so would return a
null. In normal circumstances, this would be considered a valid response but if X is aware
that a rehash is going on it would wait for more responses (from A and B).
As for node failure before an async rpc completes, this would result in data loss.
Sent from my mobile phone
On 2 Mar 2011, at 18:38, Sanne Grinovero <sanne.grinovero(a)gmail.com> wrote:
> Hi Manik,
> can you explain the first cause, why is it that during a rehash you're
> unable to get an answer to a GET?
> If a node A having installed the view T'' is receiving a GET request
> from B which is still having the outdated view T', while he's not the
> owner any more as the new view changed to T'' and so he just
> transferred the requested value to a node C,
> he definitely knows how to handle the request by forwarding it to C: B
> was the owner before - otherwise it wouldn't receive the request, and
> because it isn't any more he must be aware of the new hash
> configuration.
> He still stays in the middle of communication, and then sends the
> requested value to A along with enough information about the new view
> to avoid more erroneous requests.
>
> About your proposal, what would happen if the owner crashes before he
> has async-written the changes to a secondary node?
>
> Cheers,
> Sanne
>
> 2011/3/2 Manik Surtani <manik(a)jboss.org>:
>> As consistency models go, Infinispan is primarily strongly consistent (with
>> 2-phase commit between data owners), with the exception of during a rehash
>> where because of eventual consistency (inability to get a valid response to
>> a remote GET) forces us to wait for more responses, a quorum if you like.
>> Not dissimilar to PAXOS [1] in some ways.
>> I'm wondering whether, for the sake of performance, we should also offer a
>> fully eventually consistent model? What I am thinking is that changes
>> *always* occur only on the primary data owner. Single phase, no additional
>> round trips, etc. The primary owner then asynchronously propagates changes
>> to the other data owners. This would mean things run much faster in a
>> stable cluster, and durability is maintained. However, during rehashes when
>> keys are moved, the notion of the primary owner may change. So to deal with
>> this, we could use vector clocks [2] to version each entry. Vector clocks
>> allow us to "merge" state nicely in most cases, and in the case of
reads,
>> we'd flip back to a PAXOS style quorum during a rehash to get the most
>> "correct" version.
>> In terms of implementation, almost all of this would only affect the
>> DistributionInterceptor and the DistributionManager, so we could easily have
>> eventually consistent flavours of these two components.
>> Thoughts?
>> Cheers
>> Manik
>> [1]
http://en.wikipedia.org/wiki/Paxos_algorithm
>> [2]
http://en.wikipedia.org/wiki/Vector_clock
>> --
>> Manik Surtani
>> manik(a)jboss.org
>>
twitter.com/maniksurtani
>> Lead, Infinispan
>>
http://www.infinispan.org
>>
>>
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev(a)lists.jboss.org
>>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev