[hibernate-dev] ReadWrite 2LC "read uncommitted" data anomaly
Vlad Mihalcea
mihalcea.vlad at gmail.com
Wed Apr 6 06:17:32 EDT 2016
Hi Radim,
I pushed this fix on master and 5.1, and I managed to add an EHCache where
the same behavior was replicated as well:
https://github.com/hibernate/hibernate-orm/commit/cbdab9d87f05b4255c7930a32fe995f87f0f3e0b
For Infinispan, I think it's better if you can investigate if this is an
issue or if the change does not break anything either (all Infinispan tests
ran fine, so hopefully this should not be the case).
Thanks,
Vlad
On Wed, Apr 6, 2016 at 12:58 PM, Radim Vansa <rvansa at redhat.com> wrote:
> On 04/05/2016 04:13 PM, Vlad Mihalcea wrote:
> > Hi,
> >
> > I'd definitely fix it for the refresh operation, which does an implicit
> > cache eviction too.
> > In this case, the proposed PR solution is fine since it simply locks the
> > entry right after it is evicted from the cache and releases the lock
> after
> > the transaction is ended.
> > This way, we won't push an uncommitted entity into 2PL during the
> two-phase
> > loading phase that is triggered by the refresh operation.
> >
> > For sessionFactory.getCache.evictEntity/evictCollection, if there is a
> > current Session going on, we could propagate a
> > CacheEvictEvent/CollectionCacheEvictEvent which can apply the lock on
> that
> > particular entity/collection, and we release it right after the current
> > transaction is ended, similar to what refresh should do as well.
> >
> > For every other use case, like evictAll/evictRegion, we should just
> > document the behavior.
> >
> > I saw that Radim has added such a warning for Infinispan in the new User
> > Guide:
> >
> > read-write mode is supported on non-transactional distributed/replicated
> >> caches, however, eviction should not be used in this configuration. Use
> of
> >> eviction can lead to consistency issues.
>
> This is a different matter; in Infinispan 2LC impl you store locks and
> entries either in two different caches (the entries cache is
> invalidation, locks is local), or in single cache
> (replicated/distributed). As we don't want to lose locks randomly, and
> eviction picks entries unpredictably, its use is discouraged.
>
> I think that this issue does not apply to Infinispan with the
> invalidation configuration, since evict/evictAll does not remove any
> locks, and the lock blocks further updates (including putFromLoads) to
> the entry in cache until the transaction commits. In case of
> replicated/distributed cache, it seems that the evict is ignored after
> update, but evictAll is not (that would qualify as a bug) - so after
> evictAll you could observe the uncommitted read. Nevertheless I would
> have to test this.
>
> Radim
>
> >
> > Unfortunately, the EhCache documentation
> > <
> http://www.ehcache.org/documentation/2.8/integrations/hibernate.html#read-write
> >
> > makes a wrong assumption:
> >
> > read-write - Caches data that is sometimes updated while maintaining the
> >> semantics of “read committed” isolation level.
> >
> > Vlad
> >
> > On Tue, Apr 5, 2016 at 4:30 PM, Sanne Grinovero <sanne at hibernate.org>
> wrote:
> >
> >> On 5 April 2016 at 14:11, Vlad Mihalcea <mihalcea.vlad at gmail.com>
> wrote:
> >>> Hi,
> >>>
> >>> While reviewing the PR for this issue:
> >>>
> >>> https://hibernate.atlassian.net/browse/HHH-10649
> >>>
> >>> I realized that the ReadWrite cache concurrency strategy has a flaw
> that
> >>> permits "read uncommitted" anomalies.
> >>> The RW cache concurrency strategy guards any modifications with Lock
> >>> entries, as explained in this post that I wrote some time ago:
> >>>
> >>>
> >>
> http://vladmihalcea.com/2015/05/25/how-does-hibernate-read_write-cacheconcurrencystrategy-work/
> >>> Every time we update/delete an entry, a Lock is put in the cache under
> >> the
> >>> entity key, and, this way, "read uncommitted" anomalies should be
> >> prevented.
> >>> The problem comes when entries are evicted either explicitly:
> >>>
> >>> session.getSessionFactory().getCache().evictEntity(
> CacheableItem.class,
> >>> item1.getId() );
> >>>
> >>> or implicitly:
> >>>
> >>> session.refresh( item1 );
> >> Good catch!
> >>
> >> I think this is caused as we generally don't expect the evict
> >> operation to be controlled explicitly.
> >> In my personal experience, I would use the evictAll method to nuke the
> >> cache state after some significant operation, like restoring a
> >> backup.. and no other Session would have been opened in the meantime.
> >> I never used an explicit single-shot evict so I can't say what the use
> >> case would be.
> >>
> >> But of course you're right that it might be used differently, or at
> >> least such a limitation should be clear.
> >>
> >>> During eviction, the 2PL will remove the Lock entry, and if the user
> >>> attempts to load the entity anew (in the same transaction that has
> modified
> >>> the entity but which is not committed yet), an uncommitted change
> could be
> >>> propagated to the 2PL.
> >>>
> >>> This issue is replicated by the PR associated to this Jira issue, and I
> >>> also replicated it with manual eviction and entity loading.
> >>>
> >>> To fix it, the RW cache concurrency strategy should not delete entries
> from
> >>> 2PL upon eviction, but instead it should turn them in Lock entries.
> >> I'm not sure I understood this part. Shouldn't it rather be allowed to
> >> delete everything, except any existing locks?
> >> Then rather than turn the remaining locks into locks, it would be
> >> enough to leave them.
> >>
> >>> For the evict method, this is not really a problem, but evictAll would
> >>> imply taking all entries and replacing them with Locks, and that might
> not
> >>> perform very well in a distributed-cache scenario.
> >>>
> >>> Ideally, lock entries would be stored separately than actual cached
> value
> >>> entries, and this problem would be fixed in a much cleaner fashion.
> >> I'd leave this as a detail to the Cache implementation, some might be
> >> able to perform some operation more efficiently.
> >> Probably a good idea to clarify this expectation on the javadocs of
> >> the SPI methods.
> >>
> >> Thanks,
> >> Sanne
> >>
> >>
> >>> Let me know what you think about this.
> >>>
> >>> Vlad
> >>> _______________________________________________
> >>> hibernate-dev mailing list
> >>> hibernate-dev at lists.jboss.org
> >>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
> > _______________________________________________
> > hibernate-dev mailing list
> > hibernate-dev at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/hibernate-dev
>
>
> --
> Radim Vansa <rvansa at redhat.com>
> JBoss Performance Team
>
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>
More information about the hibernate-dev
mailing list