[infinispan-dev] Read Committed Distributed Cache Concerns

Mircea Markus mmarkus at redhat.com
Thu Sep 19 13:29:09 EDT 2013


(Adding Jonathan who knows a thing or two about transactions.)

Given that READ_COMMITTED (RC) is less performant than REPEATABLE_READ (RR),
I don't see any value in keeping RC around. I don't think users rely on
exact RC semantics (i.e. that once an entry has been committed, an ongoing
tx sees the most up-to-date value on its next read) - and that is actually
not the case with DIST caches, as you've mentioned.

I think RC is only preferred to RR because of performance, but if the performance
is the same (or even worse) I think we should only provide RR. Jonathan, care to comment?


On Sep 18, 2013, at 11:03 PM, William Burns <mudokonman at gmail.com> wrote:

> I was recently refactoring code dealing with isolation levels and
> found how ReadCommitted is implemented and I have a few concerns I
> wanted to bring up.
> 
> ReadCommitted read operations work by storing a reference to the value
> from the data store in the caller's context.  Thus whenever another
> transaction is committed that updates the data store value, any context
> that holds that reference now sees the latest committed value.  This
> works well for Local and Replicated caches since all data stores are
> updated with the latest value upon completion of the transaction.
> However, in Distributed caches only the owners see the update in their
> data store, and thus any non-owner will still have the old value it
> read before the commit occurred.
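
A minimal sketch of the behaviour described above, in plain Java rather than Infinispan's actual classes. `Entry`, `Node` and `Context` are hypothetical names, and the sketch assumes that a remote read hands the non-owner a detached copy of the value while an owner-local read shares the stored reference.

```java
import java.util.HashMap;
import java.util.Map;

class ReadCommittedReferenceDemo {
    /** Hypothetical mutable entry; stands in for an entry in a node's data store. */
    static final class Entry {
        private String value;
        Entry(String value) { this.value = value; }
        String get() { return value; }
        void set(String value) { this.value = value; }
    }

    /** Hypothetical node with its own data store. */
    static final class Node {
        final Map<String, Entry> store = new HashMap<>();
    }

    /** Hypothetical per-transaction context that remembers what it has read. */
    static final class Context {
        private final Map<String, Entry> lookedUp = new HashMap<>();

        String read(String key, Node local, Node owner) {
            Entry e = lookedUp.get(key);
            if (e == null) {
                e = local.store.get(key);          // local hit: share the stored reference
                if (e == null) {
                    // Assumed remote read: the value is copied, so the context
                    // holds a snapshot that later commits on the owner never touch.
                    e = new Entry(owner.store.get(key).get());
                }
                lookedUp.put(key, e);
            }
            return e.get();
        }
    }

    /** Returns what the owner-side and non-owner-side contexts see after a commit. */
    static String[] run() {
        Node owner = new Node();
        Node nonOwner = new Node();
        owner.store.put("k", new Entry("v1"));

        Context onOwner = new Context();
        Context onNonOwner = new Context();
        onOwner.read("k", owner, owner);        // both transactions first read v1
        onNonOwner.read("k", nonOwner, owner);

        owner.store.get("k").set("v2");         // another transaction commits in place

        return new String[] {
            onOwner.read("k", owner, owner),
            onNonOwner.read("k", nonOwner, owner)
        };
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println("owner context sees:     " + seen[0]);
        System.out.println("non-owner context sees: " + seen[1]);
    }
}
```

In this sketch the context on the owner observes the committed value, while the context on the non-owner keeps returning what it first read, which is effectively Repeatable Read for that key on that node.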
> 
> It seems quite inconsistent that Distributed caches run in a mix of
> Repeatable Read/Read Committed depending on what node and what key you
> are using.
> 
> To operate properly we could track requests, similar to how it works
> for L1, so we can tell non-owners to clear out their context values for
> values they read remotely and haven't updated themselves (since under
> Read Committed a transaction should still read back the value it
> wrote).  That seems like quite a bit of additional overhead though.
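
The tracking idea above could look roughly like the following. This is only a sketch under stated assumptions, not how Infinispan or its L1 invalidation is implemented: `Owner`, `RemoteContext`, `remoteGet` and `commit` are hypothetical names, and the direct method calls stand in for what would be extra remote calls in a real cluster.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class RequestorTrackingDemo {
    /** Hypothetical per-transaction context living on a non-owner node. */
    static final class RemoteContext {
        final Map<String, String> readValues = new HashMap<>();
        final Set<String> writtenKeys = new HashSet<>();

        void invalidate(String key) {
            // Keep the transaction's own writes: it must read back what it wrote.
            if (!writtenKeys.contains(key)) {
                readValues.remove(key);
            }
        }
    }

    /** Hypothetical owner that remembers which contexts read each key remotely. */
    static final class Owner {
        final Map<String, String> store = new HashMap<>();
        final Map<String, Set<RemoteContext>> requestors = new HashMap<>();

        String remoteGet(String key, RemoteContext ctx) {
            requestors.computeIfAbsent(key, k -> new HashSet<>()).add(ctx);
            return store.get(key);
        }

        void commit(String key, String value) {
            store.put(key, value);
            Set<RemoteContext> readers = requestors.remove(key);
            if (readers != null) {
                for (RemoteContext ctx : readers) {
                    ctx.invalidate(key);   // one additional message per tracked reader
                }
            }
        }
    }

    /** Read through the context, going to the owner only when nothing is cached. */
    static String read(String key, RemoteContext ctx, Owner owner) {
        return ctx.readValues.computeIfAbsent(key, k -> owner.remoteGet(k, ctx));
    }

    /** Returns the values a non-owner transaction reads before and after a commit. */
    static String[] run() {
        Owner owner = new Owner();
        owner.store.put("k", "v1");
        RemoteContext ctx = new RemoteContext();

        String before = read("k", ctx, owner);
        owner.commit("k", "v2");               // another transaction commits
        String after = read("k", ctx, owner);  // context was cleared, so it re-fetches
        return new String[] { before, after };
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println("before commit: " + seen[0] + ", after commit: " + seen[1]);
    }
}
```

The bookkeeping per key and the invalidation message per reader are the additional overhead the paragraph above refers to.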
> 
> I am wondering, though, whether it is worth trying to keep the Read
> Committed isolation level.  It seems that Repeatable Read would be
> simpler and most likely more performant, as you wouldn't need all the
> additional remote calls to get it to work properly.  Or is it okay that
> we have different isolation levels for some keys on some nodes?  This
> could be quite confusing if a user was using a local and a remote
> transaction and one of them did not see the other's committed changes
> when expected.
> 
> What do you guys think?
> 
> - Will
> 
> P.S.
> 
> I also found a bug with Read Committed for all caches: if you do
> a write that changes the underlying InternalCacheEntry to a new type,
> reads won't see subsequent committed values.  This happens because
> the underlying data is changed to a new reference while a reader
> is still holding onto a reference to the old InternalCacheEntry.
> This can happen when using the various overridden put methods, for
> example.  We should have a good solution for it, but one may not be
> required if we find that Read Committed itself is flawed beyond
> saving.
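
The bug in the P.S. can be illustrated with a small plain-Java sketch. `Entry` and `ExpiringEntry` are hypothetical stand-ins for different InternalCacheEntry types, and a `HashMap` stands in for the data store; the only point is that replacing the stored object leaves earlier readers holding an orphaned reference.

```java
import java.util.HashMap;
import java.util.Map;

class ReplacedEntryDemo {
    /** Hypothetical plain entry type. */
    static class Entry {
        String value;
        Entry(String value) { this.value = value; }
    }

    /** Hypothetical entry type that also carries a lifespan. */
    static final class ExpiringEntry extends Entry {
        final long lifespanMillis;
        ExpiringEntry(String value, long lifespanMillis) {
            super(value);
            this.lifespanMillis = lifespanMillis;
        }
    }

    /** Returns what the old reference and the data store report after later commits. */
    static String[] run() {
        Map<String, Entry> store = new HashMap<>();
        store.put("k", new Entry("v1"));

        // A reading transaction keeps the reference in its context.
        Entry heldByReader = store.get("k");

        // A put with a lifespan needs a different entry type, so the store
        // swaps in a new object instead of updating the old one in place.
        store.put("k", new ExpiringEntry("v2", 60_000L));

        // A later commit updates the new object in place.
        store.get("k").value = "v3";

        return new String[] { heldByReader.value, store.get("k").value };
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println("reader's old reference: " + seen[0]);
        System.out.println("data store:             " + seen[1]);
    }
}
```

The reader's reference still reports the original value even though two commits have happened since, because neither of them touched the object it holds.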
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)






