[infinispan-dev] Read Committed Distributed Cache Concerns

Sun Sep 22 08:22:23 EDT 2013

> On 21 Sep 2013, at 23:07, Sanne Grinovero <sanne at infinispan.org> wrote:
> 
>> On 19 September 2013 18:29, Mircea Markus <mmarkus at redhat.com> wrote:
>> (Adding Jonathan who knows a thing or two about transactions.)
>> 
>> Given that READ_COMMITTED (RC) is less performant than REPEATABLE_READ (RR)
>> I don't see any value in keeping RC around. I don't think users rely on
>> exact RC semantics (i.e. if an entry has been committed then an ongoing
>> tx requires the most up-to-date value between reads) - that actually
>> is not the case with DIST caches as you've mentioned.
> 
> I don't think you can generalize from the specific example William
> made;

William was referring to the general case.

> there will still be cases in which READ_COMMITTED will be more
> efficient than REPEATABLE_READ,

Looking at the implementation (also as described by William), RC isn't faster than RR in the general case. Curious why you think it would be, though.

> especially if you avoid "fixing" this, as suggested by Radim and
> myself in the two previous emails [not sure if you saw them, since you
> are forking the conversation, ignoring those mails]:
> if we agree that the current semantics is acceptable, it will
> consistently be faster than REPEATABLE_READ.

Radim's suggestion was to drop RC after running some tests to validate that RR provides the same performance. You +1'd that, so I don't understand why you say the conversation was forked.

> 
> Sanne
> 
>> I think RC is only preferred to RR because of performance, but if the performance
>> is the same (or even worse) I think we should only provide RR. Jonathan, care to comment?
>> 
>> 
>>> On Sep 18, 2013, at 11:03 PM, William Burns <mudokonman at gmail.com> wrote:
>>> 
>>> I was recently refactoring code dealing with isolation levels and
>>> found how ReadCommitted is implemented and I have a few concerns I
>>> wanted to bring up.
>>> 
>>> ReadCommitted read operations work by storing a reference to the value
>>> from the data store in its caller's context.  Thus whenever another
>>> transaction is committed that updates the data store value any context
>>> that has that reference now sees the latest committed value.  This
>>> works well for Local and Replicated caches since all data stores are
>>> updated with the latest value upon completion of the transaction.
>>> However, in Distributed caches only the owners see the update in their
>>> data store, and thus any non-owner will still have the old value it
>>> previously read before the commit occurred.
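
To make the behaviour described above concrete, here is a minimal sketch in
plain Java. It is not Infinispan code: StoreEntry, CacheNode and their methods
are hypothetical names invented for illustration, and the remote fetch is
modelled as a simple value copy. It only assumes what the paragraph states,
namely that a read keeps a reference in the caller's context and that a commit
updates the data store on the owner.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a mutable entry held in a node's data store. */
final class StoreEntry {
    volatile String value;
    StoreEntry(String value) { this.value = value; }
}

/** Hypothetical node: a data store plus one transaction's read context. */
final class CacheNode {
    final Map<String, StoreEntry> dataStore = new HashMap<>();
    final Map<String, StoreEntry> txContext = new HashMap<>();

    /** Owner-style read: the context keeps a reference to the store's own entry. */
    String readLocal(String key) {
        return txContext.computeIfAbsent(key, dataStore::get).value;
    }

    /** Non-owner read: the context keeps a private copy fetched from the owner. */
    String readRemote(String key, CacheNode owner) {
        return txContext
            .computeIfAbsent(key, k -> new StoreEntry(owner.dataStore.get(k).value))
            .value;
    }

    /** Commit of another transaction: updates the entry in place on this node only. */
    void commit(String key, String value) {
        dataStore.get(key).value = value;
    }
}

final class ReadCommittedSketch {
    public static void main(String[] args) {
        CacheNode owner = new CacheNode();
        CacheNode nonOwner = new CacheNode();
        owner.dataStore.put("k", new StoreEntry("v1"));

        owner.readLocal("k");             // a tx on the owner reads v1
        nonOwner.readRemote("k", owner);  // a tx on a non-owner reads v1

        owner.commit("k", "v2");          // another tx commits v2 on the owner

        System.out.println(owner.readLocal("k"));            // prints v2
        System.out.println(nonOwner.readRemote("k", owner)); // prints v1
    }
}
```

In this model the reader on the owner behaves like Read Committed, while the
reader on the non-owner effectively gets Repeatable Read for the same key.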
>>> 
>>> It seems quite inconsistent that Distributed caches run in a mix of
>>> Repeatable Read/Read Committed depending on what node and what key you
>>> are using.
>>> 
>>> To operate properly we could track requests similar to how it works
>>> for L1 so we can tell non owners to clear out their context values for
>>> values they read remotely that they haven't updated (since Read
>>> Committed writes should return the same written value).  That seems
>>> like quite a bit of additional overhead though.
>>> 
>>> I am wondering is it worth it to try to keep Read Committed isolation
>>> level though?  It seems that Repeatable Read would be simpler and most
>>> likely more performant as you wouldn't need all the additional remote
>>> calls to get it to work properly.  Or is it okay that we have
>>> different isolation levels for some keys on some nodes?  This could be
>>> quite confusing if a user was using a local and remote transaction and
>>> a transaction may not see the other's committed changes when they
>>> expect to.
>>> 
>>> What do you guys think?
>>> 
>>> - Will
>>> 
>>> P.S.
>>> 
>>> I also found a bug with Read Committed for all caches where if you do
>>> a write that changes the underlying InternalCacheEntry to a new type,
>>> reads won't see subsequent committed values.  This happens
>>> because the underlying data is changed to a new reference and a read
>>> would still be holding onto a reference of the old InternalCacheEntry.
>>> This can happen when using the various overridden put methods for
>>> example.  We should have a good solution for it, but it may not be
>>> required if we find that Read Committed itself is flawed beyond
>>> saving.
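
The bug in the P.S. can be sketched the same way. This is a toy model under
stated assumptions, not the actual implementation: PlainEntry, ExpiringEntry
and EntryReplacementSketch are hypothetical names, and a put with a lifespan is
used only as an example of a write that swaps the stored entry for an object of
a different type.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mutable entry; not Infinispan's InternalCacheEntry. */
class PlainEntry {
    volatile String value;
    PlainEntry(String value) { this.value = value; }
}

/** Hypothetical second entry type, e.g. one that also carries a lifespan. */
final class ExpiringEntry extends PlainEntry {
    final long lifespanMillis;
    ExpiringEntry(String value, long lifespanMillis) {
        super(value);
        this.lifespanMillis = lifespanMillis;
    }
}

final class EntryReplacementSketch {
    final Map<String, PlainEntry> dataStore = new HashMap<>();

    /** Commit that keeps the entry type: mutates the existing object. */
    void commit(String key, String value) {
        dataStore.get(key).value = value;
    }

    /** Commit that changes the entry type: a new object replaces the old one. */
    void commitWithLifespan(String key, String value, long lifespanMillis) {
        dataStore.put(key, new ExpiringEntry(value, lifespanMillis));
    }

    public static void main(String[] args) {
        EntryReplacementSketch store = new EntryReplacementSketch();
        store.dataStore.put("k", new PlainEntry("v1"));

        // A reader's context holds a reference to the current entry.
        PlainEntry heldByReader = store.dataStore.get("k");

        store.commitWithLifespan("k", "v2", 1000L); // entry object is replaced
        store.commit("k", "v3");                    // later commit hits the new object

        System.out.println(heldByReader.value);             // prints v1
        System.out.println(store.dataStore.get("k").value); // prints v3
    }
}
```

Once the stored object has been replaced, the reference held by the reader
never observes v2 or v3, which matches the symptom described above.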
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
>> Cheers,
>> --
>> Mircea Markus
>> Infinispan lead (www.infinispan.org)