[infinispan-dev] Eventual consistency

Mircea Markus mircea.markus at jboss.com
Mon Mar 7 11:35:35 EST 2011


I see.
IMO we should first offer an complete (i.e. even during rehash) strongly consistent approach.

On 3 Mar 2011, at 14:05, Manik Surtani wrote:
> A GET may produce incorrect results if a node not involved in a rehash, Node X, asks for a key. E.g., it may ask Node C for the entry since Node C is a new owner. However Node C may not have completed applying state received so would return a null. In normal circumstances, this would be considered a valid response but if X is aware that a rehash is going on it would wait for more responses (from A and B). 
> 
> As for node failure before an async rpc completes, this would result in data loss. 
> 
> Sent from my mobile phone
> 
> On 2 Mar 2011, at 18:38, Sanne Grinovero <sanne.grinovero at gmail.com> wrote:
> 
>> Hi Manik,
>> can you explain the first cause, why is it that during a rehash you're
>> unable to get an answer to a GET?
>> If a node A having installed the view T'' is receiving a GET request
>> from B which is still having the outdated view T', while he's not the
>> owner any more as the new view changed to T'' and so he just
>> transferred the requested value to a node C,
>> he definitely knows how to handle the request by forwarding it to C: B
>> was the owner before - otherwise it wouldn't receive the request, and
>> because it isn't any more he must be aware of the new hash
>> configuration.
>> He still stays in the middle of communication, and then sends the
>> requested value to A along with enough information about the new view
>> to avoid more erroneous requests.
>> 
>> About your proposal, what would happen if the owner crashes before he
>> has async-written the changes to a secondary node?
>> 
>> Cheers,
>> Sanne
>> 
>> 2011/3/2 Manik Surtani <manik at jboss.org>:
>>> As consistency models go, Infinispan is primarily strongly consistent (with
>>> 2-phase commit between data owners), with the exception of during a rehash
>>> where because of eventual consistency (inability to get a valid response to
>>> a remote GET) forces us to wait for more responses, a quorum if you like.
>>> Not dissimilar to PAXOS [1] in some ways.
>>> I'm wondering whether, for the sake of performance, we should also offer a
>>> fully eventually consistent model?  What I am thinking is that changes
>>> *always* occur only on the primary data owner.  Single phase, no additional
>>> round trips, etc.  The primary owner then asynchronously propagates changes
>>> to the other data owners.  This would mean things run much faster in a
>>> stable cluster, and durability is maintained.  However, during rehashes when
>>> keys are moved, the notion of the primary owner may change.  So to deal with
>>> this, we could use vector clocks [2] to version each entry.  Vector clocks
>>> allow us to "merge" state nicely in most cases, and in the case of reads,
>>> we'd flip back to a PAXOS style quorum during a rehash to get the most
>>> "correct" version.
>>> In terms of implementation, almost all of this would only affect the
>>> DistributionInterceptor and the DistributionManager, so we could easily have
>>> eventually consistent flavours of these two components.
>>> Thoughts?
>>> Cheers
>>> Manik
>>> [1] http://en.wikipedia.org/wiki/Paxos_algorithm
>>> [2] http://en.wikipedia.org/wiki/Vector_clock
>>> --
>>> Manik Surtani
>>> manik at jboss.org
>>> twitter.com/maniksurtani
>>> Lead, Infinispan
>>> http://www.infinispan.org
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> 
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev




More information about the infinispan-dev mailing list