[infinispan-dev] Eventual consistency

Sanne Grinovero sanne.grinovero at gmail.com
Thu Mar 3 09:23:45 EST 2011


2011/3/3 Manik Surtani <msurtani at redhat.com>:
> A GET may produce incorrect results if a node not involved in a rehash, Node X, asks for a key. E.g., it may ask Node C for the entry since Node C is a new owner. However Node C may not have completed applying state received so would return a null. In normal circumstances, this would be considered a valid response but if X is aware that a rehash is going on it would wait for more responses (from A and B).

I'd say that because C is aware that it's performing a rehash, and
it's aware he still doesn't know the correct value for the key it is
being requested, it should not return null but wait to know the proper
answer and return that - ideally it could give some hints to the
ongoing state transfer to prioritize this specific key as there is
immediate demand for it.
In any case, C is going to download this value soon as it's now an
owner of it, so this doesn't look like to me that this approach would
augment the network traffic.

Otherwise, how could a client requesting a key know if the returned
value is null or not? Should I deal with it in the application?

>
> As for node failure before an async rpc completes, this would result in data loss.
>
> Sent from my mobile phone
>
> On 2 Mar 2011, at 18:38, Sanne Grinovero <sanne.grinovero at gmail.com> wrote:
>
>> Hi Manik,
>> can you explain the first cause, why is it that during a rehash you're
>> unable to get an answer to a GET?
>> If a node A having installed the view T'' is receiving a GET request
>> from B which is still having the outdated view T', while he's not the
>> owner any more as the new view changed to T'' and so he just
>> transferred the requested value to a node C,
>> he definitely knows how to handle the request by forwarding it to C: B
>> was the owner before - otherwise it wouldn't receive the request, and
>> because it isn't any more he must be aware of the new hash
>> configuration.
>> He still stays in the middle of communication, and then sends the
>> requested value to A along with enough information about the new view
>> to avoid more erroneous requests.
>>
>> About your proposal, what would happen if the owner crashes before he
>> has async-written the changes to a secondary node?
>>
>> Cheers,
>> Sanne
>>
>> 2011/3/2 Manik Surtani <manik at jboss.org>:
>>> As consistency models go, Infinispan is primarily strongly consistent (with
>>> 2-phase commit between data owners), with the exception of during a rehash
>>> where because of eventual consistency (inability to get a valid response to
>>> a remote GET) forces us to wait for more responses, a quorum if you like.
>>>  Not dissimilar to PAXOS [1] in some ways.
>>> I'm wondering whether, for the sake of performance, we should also offer a
>>> fully eventually consistent model?  What I am thinking is that changes
>>> *always* occur only on the primary data owner.  Single phase, no additional
>>> round trips, etc.  The primary owner then asynchronously propagates changes
>>> to the other data owners.  This would mean things run much faster in a
>>> stable cluster, and durability is maintained.  However, during rehashes when
>>> keys are moved, the notion of the primary owner may change.  So to deal with
>>> this, we could use vector clocks [2] to version each entry.  Vector clocks
>>> allow us to "merge" state nicely in most cases, and in the case of reads,
>>> we'd flip back to a PAXOS style quorum during a rehash to get the most
>>> "correct" version.
>>> In terms of implementation, almost all of this would only affect the
>>> DistributionInterceptor and the DistributionManager, so we could easily have
>>> eventually consistent flavours of these two components.
>>> Thoughts?
>>> Cheers
>>> Manik
>>> [1] http://en.wikipedia.org/wiki/Paxos_algorithm
>>> [2] http://en.wikipedia.org/wiki/Vector_clock
>>> --
>>> Manik Surtani
>>> manik at jboss.org
>>> twitter.com/maniksurtani
>>> Lead, Infinispan
>>> http://www.infinispan.org
>>>
>>>
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>



More information about the infinispan-dev mailing list