[hibernate-dev] Consistency guarantees of second level cache

Radim Vansa rvansa at redhat.com
Thu Sep 10 03:47:32 EDT 2015


On 09/09/2015 06:16 PM, Steve Ebersole wrote:
> Some comments inline and then a general discussion at the end...
>
> On Wed, Sep 9, 2015 at 10:32 AM Radim Vansa <rvansa at redhat.com 
> <mailto:rvansa at redhat.com>> wrote:
>
>     Thanks for correcting the terms, I'll try to use 'isolation'.
>
>     TX2 reading B = 1 is not READ_UNCOMMITTED - value B = 1 was committed
>     long ago (it's the initial value). It's the reading of A = 2 that can
>     be considered read uncommitted (not isolated enough), but as the
>     cache has nothing to do with that entry, we can't really prevent
>     it - it's already in the DB. So it's really rather a stale read
>     of B. If my terms are wrong, I apologize.
>
>
> I said that "TX2 reading "B->1" before TX1 commits is a question of 
> isolation and preventing READ_UNCOMMITTED access to the data".  In 
> other words TX2 reading "B->1" in your "timeline" is in fact an 
> example of the cache preventing READ_UNCOMMITTED access to the data.  
> So we are saying the same thing there.  But after that is where we 
> deviate.
>
> The issue with "isolation" is that it is always relative to "the truth 
> of the system".  This is a historical problem between Hibernate and 
> every manifestation of caching that has come from JBoss ;)  In this 
> usage (second level caching) the cache IS NOT the truth of the system; 
> the database is.
>
> So interestingly the data here *is* stale when looked at from the 
> perspective of the database (again the truth of the system).  And that 
> is fundamentally a problem.

I 100% agree that the database is the source of truth, and that the data 
is stale. My question is whether that is a problem (something we need to 
avoid by default), or whether something similar can be exhibited in 
session caching. Okay, I see that it is a problem. By the way, the 
current implementation does not suffer from that; I am rather exploring 
further optimizations.

Therefore, relaxing this should go into the nonstrict read-write mode.
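For illustration, enabling that mode per entity would look roughly like 
this - a minimal sketch, with the Account entity being purely 
hypothetical:

    import javax.persistence.Entity;
    import javax.persistence.Id;

    import org.hibernate.annotations.Cache;
    import org.hibernate.annotations.CacheConcurrencyStrategy;

    // Nonstrict read-write tolerates occasional stale reads in exchange
    // for cheaper cache maintenance, so a relaxation like the one above
    // would belong here rather than in (strict) read-write.
    @Entity
    @Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
    public class Account {
        @Id
        Long id;
        int balance;
    }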

>
>
>     "as close together as possible" is not enough - either you allow
>     certain
>     situation to happen (although you might try to minimize how often), or
>     you guarantee that it does not happen. So, do I understand it
>     correctly
>     that 2LC should check ' hibernate.connection.isolation' and behave
>     accordingly?
>
>
> Again, the problem is that you are registering your own syncs to do 
> things.  I understand that getting these 2 "events" as close together 
> as possible is just minimizing the risk.  But that is an important 
> minimization.  Yes, you still need to decide what to do when something 
> happens to occur between them.  But minimizing those cases (by 
> shrinking the gap) is important.
>
> And in regards to 'hibernate.connection.isolation'... uh, no, 
> absolutely not.  I never said that.  Not even sure how you got there...

I've diverted a bit here - I had just looked up how the isolation level 
is set. If you set it to READ_UNCOMMITTED, there's no need to keep the 
cache in sync with a DB that itself provides non-isolated results. But 
I'd rather not abuse this configuration option.
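Just to be concrete about which knob I mean - a minimal sketch, assuming 
programmatic configuration; the value is one of the java.sql.Connection 
isolation constants:

    import java.sql.Connection;

    import org.hibernate.cfg.Configuration;

    public class IsolationExample {
        public static void main(String[] args) {
            Configuration cfg = new Configuration();
            // TRANSACTION_READ_UNCOMMITTED == 1; connections obtained
            // by Hibernate would then run at this isolation level
            cfg.setProperty("hibernate.connection.isolation",
                    String.valueOf(Connection.TRANSACTION_READ_UNCOMMITTED));
        }
    }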

>
>     In 2LC code I am sometimes registering synchronizations but always
>     through
>
>     SessionImplementor.getTransactionCoordinator()
>     .getLocalSynchronizations().registerSynchronization(...)
>
>     - I hope this is the right way and that I'm not asking for trouble.
>     I usually just need to do something when I know whether the DB has
>     written the data or not - Hibernate calls the *AccessStrategy
>     methods well enough in the beforeCompletion part (or I should
>     rather say during flush()), but sometimes I need to delegate some
>     work to the afterCompletion part.
>
>
> Well let's look at the example you gave in detail.  And for reference, 
> this is outlined in the EntityRegionAccessStrategy javadocs.
>
> So for an UPDATE to an entity we'd get the following call sequence:
>
> 1) Session1 transaction begins
> 2) Session1 flush is triggered; we deem that both A and B have 
> changed and need to be written to the database.
> 2.a) SQL UPDATE issued for A
> 2.b) EntityRegionAccessStrategy#update called for A
> 2.c) SQL UPDATE issued for B
> 2.d) EntityRegionAccessStrategy#update called for B
> 3) Session1 transaction commits[1]
> 3.a) "before completion callbacks" (for this discussion, there are none)
> 3.b) JDBC transaction committed
> 3.c) "after completion callbacks"
> 3.c.1) EntityRegionAccessStrategy#afterUpdate called for A
> 3.c.2) EntityRegionAccessStrategy#afterUpdate called for B
>
> And again, that is the flow outlined in EntityRegionAccessStrategy:
> <li><b>UPDATES</b> : {@link #lockItem} -> {@link #update} -> {@link 
> #afterUpdate}</li>
>
> So I am still not sure why you need to register a Synchronization.  
> You already get callbacks for "after completion".  Perhaps you meant 
> that there are times you need to do something during "before completion"?

No, I need to do work in afterCompletion, but I need to propagate some 
data from the #update or #remove call to the afterCompletion part. 
#afterUpdate accepts a SoftLock instance, but that one is acquired in 
the #lockItem call (when I don't yet know what will happen - update, 
remove, or nothing), and the SoftLock instance is then not passed to 
the #update method.
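To make the pattern concrete, here is a minimal sketch of what I do 
(InvalidationSync and pendingInvalidation are made-up names): state 
computed in #update is stashed in a Synchronization registered through 
the session's local synchronizations, and applied once the outcome of 
the DB transaction is known:

    import javax.transaction.Status;
    import javax.transaction.Synchronization;

    final class InvalidationSync implements Synchronization {
        private final Object pendingInvalidation; // computed in #update

        InvalidationSync(Object pendingInvalidation) {
            this.pendingInvalidation = pendingInvalidation;
        }

        @Override
        public void beforeCompletion() {
            // nothing to do before the DB commit
        }

        @Override
        public void afterCompletion(int status) {
            if (status == Status.STATUS_COMMITTED) {
                // apply pendingInvalidation to the cache
            }
            // on rollback it is simply dropped
        }
    }

    // registered from within the access strategy's update(...):
    // session.getTransactionCoordinator().getLocalSynchronizations()
    //         .registerSynchronization(new InvalidationSync(state));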

Radim

>
>
> [1] the exact "how" in (3) will vary, but the general ordering will 
> remain the same.  Making this ordering consistent was one of the main 
> drivers of redesigning the transaction handling in 5.0.
>
>


-- 
Radim Vansa <rvansa at redhat.com>
JBoss Performance Team


