[hibernate-dev] ReadWrite 2LC "read uncommitted" data anomaly

Wed Apr 6 07:02:39 EDT 2016

Hi Vlad,

I think that this change is safe (it uses only the supported flow of 2LC 
SPI calls), though it introduces a bit more overhead for the refresh.

FYI, besides unit tests, there's CorrectnessTestCase [1] - I've used 
this one to find out most of flaws in Infinispan 2LC impl. Could be 
worth a try running this on EHCache, too.

I've noticed some of the added tests explicitly use H2. Hibernate uses 
1.3 for the testsuite - I was running tests with both 1.3 and 1.4 and 
the behaviour can differ a lot, since 1.4 can lock separate rows and 1.3 
locks just whole tables (pardon me if I recall that incorrectly) - this 
can lead to different timing, and eventually to deadlock/test failure. 
Just to let you know that I had problems with these.

Radim

[1] 
https://github.com/hibernate/hibernate-orm/blob/master/hibernate-infinispan/src/test/java/org/hibernate/test/cache/infinispan/stress/CorrectnessTestCase.java

On 04/06/2016 12:17 PM, Vlad Mihalcea wrote:
> Hi Radim,
>
> I pushed this fix on master and 5.1, and I managed to add an EHCache 
> where the same behavior was replicated as well:
>
> https://github.com/hibernate/hibernate-orm/commit/cbdab9d87f05b4255c7930a32fe995f87f0f3e0b
>
> For Infinispan, I think it's better if you can investigate if this is 
> an issue or if the change does not break anything either (all 
> Infinispan tests ran fine, so hopefully this should not be the case).
>
> Thanks,
> Vlad
>
> On Wed, Apr 6, 2016 at 12:58 PM, Radim Vansa <rvansa at redhat.com 
> <mailto:rvansa at redhat.com>> wrote:
>
>     On 04/05/2016 04:13 PM, Vlad Mihalcea wrote:
>     > Hi,
>     >
>     > I'd definitely fix it for the refresh operation, which does an
>     implicit
>     > cache eviction too.
>     > In this case, the proposed PR solution is fine since it simply
>     locks the
>     > entry right after it is evicted from the cache and releases the
>     lock after
>     > the transaction is ended.
>     > This way, we won't push an uncommitted entity into 2PL during
>     the two-phase
>     > loading phase that is triggered by the refresh operation.
>     >
>     > For sessionFactory.getCache.evictEntity/evictCollection, if
>     there is a
>     > current Session going on, we could propagate a
>     > CacheEvictEvent/CollectionCacheEvictEvent which can apply the
>     lock on that
>     > particular entity/collection, and we release it right after the
>     current
>     > transaction is ended, similar to what refresh should do as well.
>     >
>     > For every other use case, like evictAll/evictRegion, we should just
>     > document the behavior.
>     >
>     > I saw that Radim has added such a warning for Infinispan in the
>     new User
>     > Guide:
>     >
>     > read-write mode is supported on non-transactional
>     distributed/replicated
>     >> caches, however, eviction should not be used in this
>     configuration. Use of
>     >> eviction can lead to consistency issues.
>
>     This is a different matter; in Infinispan 2LC impl you store locks and
>     entries either in two different caches (the entries cache is
>     invalidation, locks is local), or in single cache
>     (replicated/distributed). As we don't want to lose locks randomly, and
>     eviction picks entries unpredictably, its use is discouraged.
>
>     I think that this issue does not apply to Infinispan with the
>     invalidation configuration, since evict/evictAll does not remove any
>     locks, and the lock blocks further updates (including putFromLoads) to
>     the entry in cache until the transaction commits. In case of
>     replicated/distributed cache, it seems that the evict is ignored after
>     update, but evictAll is not (that would qualify as a bug) - so after
>     evictAll you could observe the uncommitted read. Nevertheless I would
>     have to test this.
>
>     Radim
>
>     >
>     > Unfortunately, the EhCache documentation
>     >
>     <http://www.ehcache.org/documentation/2.8/integrations/hibernate.html#read-write>
>     >   makes a wrong assumption:
>     >
>     > read-write - Caches data that is sometimes updated while
>     maintaining the
>     >> semantics of “read committed” isolation level.
>     >
>     > Vlad
>     >
>     > On Tue, Apr 5, 2016 at 4:30 PM, Sanne Grinovero
>     <sanne at hibernate.org <mailto:sanne at hibernate.org>> wrote:
>     >
>     >> On 5 April 2016 at 14:11, Vlad Mihalcea
>     <mihalcea.vlad at gmail.com <mailto:mihalcea.vlad at gmail.com>> wrote:
>     >>> Hi,
>     >>>
>     >>> While reviewing the PR for this issue:
>     >>>
>     >>> https://hibernate.atlassian.net/browse/HHH-10649
>     >>>
>     >>> I realized that the ReadWrite cache concurrency strategy has a
>     flaw that
>     >>> permits "read uncommitted" anomalies.
>     >>> The RW cache concurrency strategy guards any modifications
>     with Lock
>     >>> entries, as explained in this post that I wrote some time ago:
>     >>>
>     >>>
>     >>
>     http://vladmihalcea.com/2015/05/25/how-does-hibernate-read_write-cacheconcurrencystrategy-work/
>     >>> Every time we update/delete an entry, a Lock is put in the
>     cache under
>     >> the
>     >>> entity key, and, this way, "read uncommitted" anomalies should be
>     >> prevented.
>     >>> The problem comes when entries are evicted either explicitly:
>     >>>
>     >>> session.getSessionFactory().getCache().evictEntity(
>     CacheableItem.class,
>     >>> item1.getId() );
>     >>>
>     >>> or implicitly:
>     >>>
>     >>> session.refresh( item1 );
>     >> Good catch!
>     >>
>     >> I think this is caused as we generally don't expect the evict
>     >> operation to be controlled explicitly.
>     >> In my personal experience, I would use the evictAll method to
>     nuke the
>     >> cache state after some significant operation, like restoring a
>     >> backup.. and no other Session would have been opened in the
>     meantime.
>     >> I never used an explicit single-shot evict so I can't say what
>     the use
>     >> case would be.
>     >>
>     >> But of course you're right that it might be used differently, or at
>     >> least such a limitation should be clear.
>     >>
>     >>> During eviction, the 2PL will remove the Lock entry, and if
>     the user
>     >>> attempts to load the entity anew (in the same transaction that
>     has modified
>     >>> the entity but which is not committed yet), an uncommitted
>     change could be
>     >>> propagated to the 2PL.
>     >>>
>     >>> This issue is replicated by the PR associated to this Jira
>     issue, and I
>     >>> also replicated it with manual eviction and entity loading.
>     >>>
>     >>> To fix it, the RW cache concurrency strategy should not delete
>     entries from
>     >>> 2PL upon eviction, but instead it should turn them in Lock
>     entries.
>     >> I'm not sure I understood this part. Shouldn't it rather be
>     allowed to
>     >> delete everything, except any existing locks?
>     >> Then rather than turn the remaining locks into locks, it would be
>     >> enough to leave them.
>     >>
>     >>> For the evict method, this is not really a problem, but
>     evictAll would
>     >>> imply taking all entries and replacing them with Locks, and
>     that might not
>     >>> perform very well in a distributed-cache scenario.
>     >>>
>     >>> Ideally, lock entries would be stored separately than actual
>     cached value
>     >>> entries, and this problem would be fixed in a much cleaner
>     fashion.
>     >> I'd leave this as a detail to the Cache implementation, some
>     might be
>     >> able to perform some operation more efficiently.
>     >> Probably a good idea to clarify this expectation on the javadocs of
>     >> the SPI methods.
>     >>
>     >> Thanks,
>     >> Sanne
>     >>
>     >>
>     >>> Let me know what you think about this.
>     >>>
>     >>> Vlad
>     >>> _______________________________________________
>     >>> hibernate-dev mailing list
>     >>> hibernate-dev at lists.jboss.org
>     <mailto:hibernate-dev at lists.jboss.org>
>     >>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>     > _______________________________________________
>     > hibernate-dev mailing list
>     > hibernate-dev at lists.jboss.org <mailto:hibernate-dev at lists.jboss.org>
>     > https://lists.jboss.org/mailman/listinfo/hibernate-dev
>
>
>     --
>     Radim Vansa <rvansa at redhat.com <mailto:rvansa at redhat.com>>
>     JBoss Performance Team
>
>     _______________________________________________
>     hibernate-dev mailing list
>     hibernate-dev at lists.jboss.org <mailto:hibernate-dev at lists.jboss.org>
>     https://lists.jboss.org/mailman/listinfo/hibernate-dev
>
>

-- 
Radim Vansa <rvansa at redhat.com>
JBoss Performance Team