[infinispan-dev] Read Committed Distributed Cache Concerns
Mircea Markus
mmarkus at redhat.com
Sun Sep 22 08:22:23 EDT 2013
> On 21 Sep 2013, at 23:07, Sanne Grinovero <sanne at infinispan.org> wrote:
>
>> On 19 September 2013 18:29, Mircea Markus <mmarkus at redhat.com> wrote:
>> (Adding Jonathan who knows a thing or two about transactions.)
>>
>> Given that READ_COMMITTED (RC) is less performant than REPEATABLE_READ (RR)
>> I don't see any value in keeping RC around. I don't think users rely on
>> exact RC semantics (i.e. if an entry has been committed then an ongoing
>> tx requires the most up-to-date value between reads) - that actually
>> is not the case with DIST caches as you've mentioned.
>
> I don't think you can generalize from the specific example William
> made;
William was referring to the general case.
> there will still be cases in which READ_COMMITTED will be more
> efficient than REPEATABLE_READ,
Looking at the implementation (also as described by William), RC isn't faster than RR in the general case. I'm curious why you think it would be, though.
> especially if you avoid "fixing" this, as suggested by Radim and
> myself in the two previous emails [not sure if you saw them, since you
> are forking the conversation and ignoring those mails]:
> if we agree that the current semantics is acceptable, it will
> consistently be faster than REPEATABLE_READ.
Radim's suggestion was to drop RC after running some tests to validate that RR provides the same performance. You +1'd that, so I don't understand why you say the conversation was forked.
>
> Sanne
>
>> I think RC is only preferred to RR because of performance, but if the performance
>> is the same (or even worse) I think we should only provide RR. Jonathan, care to comment?
>>
>>
>>> On Sep 18, 2013, at 11:03 PM, William Burns <mudokonman at gmail.com> wrote:
>>>
>>> I was recently refactoring code dealing with isolation levels and
>>> found how ReadCommitted is implemented and I have a few concerns I
>>> wanted to bring up.
>>>
>>> ReadCommitted read operations work by storing a reference to the value
>>> from the data store in its caller's context. Thus whenever another
>>> transaction is committed that updates the data store value any context
>>> that has that reference now sees the latest committed value. This
>>> works well for Local and Replicated caches since all data stores are
>>> updated with the latest value upon completion of the transaction.
>>> However, in Distributed caches only the owners see the update in their
>>> data store, and thus any non-owner will still have the old value it
>>> previously read before the commit occurred.
>>>
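The owner/non-owner difference William describes above can be illustrated with a minimal sketch. It is only a model under stated assumptions, not Infinispan code: `Entry` and `DataStore` are hypothetical stand-ins, and the "remote read" is simulated by copying the value into a fresh object.

```java
import java.util.HashMap;
import java.util.Map;

public class ReadCommittedSketch {
    // Hypothetical mutable entry; a stand-in, not Infinispan's InternalCacheEntry.
    static final class Entry {
        private volatile String value;
        Entry(String value) { this.value = value; }
        String get() { return value; }
        void set(String value) { this.value = value; }
    }

    // Hypothetical node-local data store that updates entries in place on commit.
    static final class DataStore {
        private final Map<String, Entry> entries = new HashMap<>();

        Entry lookup(String key) { return entries.get(key); }

        void commit(String key, String value) {
            Entry existing = entries.get(key);
            if (existing == null) {
                entries.put(key, new Entry(value));
            } else {
                existing.set(value); // same object, new value
            }
        }
    }

    // Returns what a reader on the owner and a reader on a non-owner see
    // after another transaction commits a new value on the owner.
    static String[] demo() {
        DataStore owner = new DataStore();
        owner.commit("k", "v1");

        // Reader on the owner: its context holds a reference to the store's entry.
        Entry ownerContext = owner.lookup("k");

        // Reader on a non-owner: the remote read hands back a copy of the value
        // (assumption made for this sketch).
        Entry nonOwnerContext = new Entry(owner.lookup("k").get());

        // Another transaction commits; only the owner's data store is updated.
        owner.commit("k", "v2");

        return new String[] { ownerContext.get(), nonOwnerContext.get() };
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println("reader on owner sees:     " + seen[0]);
        System.out.println("reader on non-owner sees: " + seen[1]);
    }
}
```

In this model the reader on the owner observes the newly committed value while the reader on the non-owner keeps the value it read first, which is the mixed RC/RR behaviour discussed in the thread.
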
>>> This seems quite inconsistent that Distributed caches run in a mix of
>>> Repeatable Read/Read Committed depending on what node and what key you
>>> are using.
>>>
>>> To operate properly we could track requests similar to how it works
>>> for L1 so we can tell non owners to clear out their context values for
>>> values they read remotely that they haven't updated (since Read
>>> Committed writes should return the same written value). That seems
>>> like quite a bit of additional overhead though.
>>>
>>> I am wondering whether it is worth trying to keep the Read Committed
>>> isolation level at all. It seems that Repeatable Read would be simpler and most
>>> likely more performant as you wouldn't need all the additional remote
>>> calls to get it to work properly. Or is it okay that we have
>>> different isolation levels for some keys on some nodes? This could be
>>> quite confusing if a user was using a local and remote transaction and
>>> a transaction may not see the other's committed changes when they
>>> expect to.
>>>
>>> What do you guys think?
>>>
>>> - Will
>>>
>>> P.S.
>>>
>>> I also found a bug with Read Committed for all caches: if you do
>>> a write that changes the underlying InternalCacheEntry to a new type,
>>> reads won't see subsequent committed values. This happens
>>> because the underlying data is changed to a new reference and a read
>>> would still be holding onto a reference to the old InternalCacheEntry.
>>> This can happen when using the various overridden put methods, for
>>> example. We should have a good solution for it, but it may not be
>>> required if we find that Read Committed itself is flawed beyond
>>> saving.
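The stale-reference problem in the P.S. can be modelled the same way. Again this is a hedged sketch rather than the real implementation: `Entry`, `ExpiringEntry` and `DataStore` are hypothetical names, and the assumption that a put with a lifespan swaps in a different entry type is made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class EntryReplacementSketch {
    // Hypothetical plain entry; a stand-in, not Infinispan's InternalCacheEntry.
    static class Entry {
        private String value;
        Entry(String value) { this.value = value; }
        String get() { return value; }
        void set(String value) { this.value = value; }
    }

    // Hypothetical entry type carrying a lifespan (assumption for this sketch).
    static final class ExpiringEntry extends Entry {
        final long lifespanMillis;
        ExpiringEntry(String value, long lifespanMillis) {
            super(value);
            this.lifespanMillis = lifespanMillis;
        }
    }

    static final class DataStore {
        private final Map<String, Entry> entries = new HashMap<>();

        Entry lookup(String key) { return entries.get(key); }

        // Plain put: updates the existing entry object in place.
        void put(String key, String value) {
            Entry existing = entries.get(key);
            if (existing == null) {
                entries.put(key, new Entry(value));
            } else {
                existing.set(value);
            }
        }

        // Put with lifespan: needs a different entry type, so the stored
        // object is replaced by a new reference.
        void put(String key, String value, long lifespanMillis) {
            entries.put(key, new ExpiringEntry(value, lifespanMillis));
        }
    }

    // Returns the value seen through the reader's old reference and the
    // value currently held by the data store.
    static String[] demo() {
        DataStore store = new DataStore();
        store.put("k", "v1");

        Entry readerContext = store.lookup("k"); // reader keeps this reference

        store.put("k", "v2", 60_000L); // entry object is replaced
        store.put("k", "v3");          // updates the replacement in place

        return new String[] { readerContext.get(), store.lookup("k").get() };
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println("reader's old reference: " + seen[0]);
        System.out.println("data store now holds:   " + seen[1]);
    }
}
```

Once the stored object has been swapped, later commits go to the new object, so the reader's old reference never sees them.
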
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>> Cheers,
>> --
>> Mircea Markus
>> Infinispan lead (www.infinispan.org)