[infinispan-dev] Eventual consistency

Manik Surtani msurtani at redhat.com
Thu Mar 3 09:05:30 EST 2011


A GET may produce incorrect results if a node not involved in a rehash, Node X, asks for a key. E.g., it may ask Node C for the entry since Node C is a new owner. However Node C may not have completed applying state received so would return a null. In normal circumstances, this would be considered a valid response but if X is aware that a rehash is going on it would wait for more responses (from A and B). 

As for node failure before an async rpc completes, this would result in data loss. 

Sent from my mobile phone

On 2 Mar 2011, at 18:38, Sanne Grinovero <sanne.grinovero at gmail.com> wrote:

> Hi Manik,
> can you explain the first cause, why is it that during a rehash you're
> unable to get an answer to a GET?
> If a node A having installed the view T'' is receiving a GET request
> from B which is still having the outdated view T', while he's not the
> owner any more as the new view changed to T'' and so he just
> transferred the requested value to a node C,
> he definitely knows how to handle the request by forwarding it to C: B
> was the owner before - otherwise it wouldn't receive the request, and
> because it isn't any more he must be aware of the new hash
> configuration.
> He still stays in the middle of communication, and then sends the
> requested value to A along with enough information about the new view
> to avoid more erroneous requests.
> 
> About your proposal, what would happen if the owner crashes before he
> has async-written the changes to a secondary node?
> 
> Cheers,
> Sanne
> 
> 2011/3/2 Manik Surtani <manik at jboss.org>:
>> As consistency models go, Infinispan is primarily strongly consistent (with
>> 2-phase commit between data owners), with the exception of during a rehash
>> where because of eventual consistency (inability to get a valid response to
>> a remote GET) forces us to wait for more responses, a quorum if you like.
>>  Not dissimilar to PAXOS [1] in some ways.
>> I'm wondering whether, for the sake of performance, we should also offer a
>> fully eventually consistent model?  What I am thinking is that changes
>> *always* occur only on the primary data owner.  Single phase, no additional
>> round trips, etc.  The primary owner then asynchronously propagates changes
>> to the other data owners.  This would mean things run much faster in a
>> stable cluster, and durability is maintained.  However, during rehashes when
>> keys are moved, the notion of the primary owner may change.  So to deal with
>> this, we could use vector clocks [2] to version each entry.  Vector clocks
>> allow us to "merge" state nicely in most cases, and in the case of reads,
>> we'd flip back to a PAXOS style quorum during a rehash to get the most
>> "correct" version.
>> In terms of implementation, almost all of this would only affect the
>> DistributionInterceptor and the DistributionManager, so we could easily have
>> eventually consistent flavours of these two components.
>> Thoughts?
>> Cheers
>> Manik
>> [1] http://en.wikipedia.org/wiki/Paxos_algorithm
>> [2] http://en.wikipedia.org/wiki/Vector_clock
>> --
>> Manik Surtani
>> manik at jboss.org
>> twitter.com/maniksurtani
>> Lead, Infinispan
>> http://www.infinispan.org
>> 
>> 
>> 
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 



More information about the infinispan-dev mailing list