Re: [hibernate-dev] HHH-10162 Inheritance and L2 cache

Thursday, 16 March 2017

Hey Gail,

comments inline..

On 15.03.2017 at 23:21, Gail Badner wrote:
...
 Hi Christian,

 More comments below...

 On Fri, Mar 3, 2017 at 4:38 PM, Christian Beikov
 <christian.beikov(a)gmail.com> wrote:

     Thanks for the comments, I have updated the PR.

     I think that it is important to have a complete fix and I am
     already working on that, but this issue was specifically about the
     L2 cache, so I wanted to keep that stuff separate.

     Here are some other polymorphism related issues:

       * https://hibernate.atlassian.net/browse/HHH-9845
       * https://hibernate.atlassian.net/browse/HHH-11280
       * https://hibernate.atlassian.net/browse/HHH-4742
       * https://hibernate.atlassian.net/browse/HHH-2927

 Thanks for looking into this. It is definitely an area that 
 can/should be improved.

     The fix provided by Josh Landin for HHH-11280 would break
     reference equality and I think that is something we all want to
     keep. Which brings us to the actual problems.

 I agree. We definitely do not want to break reference equality.

     1. Whenever we encounter an association that is of a polymorphic
     entity type, we should actually internally, regardless of what the
     user configures, always fetch that association with subtypes.
     Otherwise we could run into that same problem again, that we
     "pollute" the L1 cache with a proxy of the wrong type. Since we
     want to keep reference equality, we can't replace that instance.

 IIUC, this is only a problem if the association is with a polymorphic 
 entity class that has a subclass. Since a fix would probably affect 
 performance, I think it would be best to only deal with this 
 particular case. Maybe you are saying the same thing; I just wanted to 
 clarify.

Yep, saying the same thing.
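
To make the problem from 1. concrete, here is a minimal sketch in plain Java. It does not use Hibernate at all: `Animal`, `Cat`, `AnimalProxy` and `L1CacheSketch` are names made up for illustration, and the map only stands in for the L1 cache. It shows how a proxy created from the declared association type stays in place once a later load finds out that the row is really a subtype.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity hierarchy, not taken from any real mapping.
class Animal {
    final long id;
    Animal(long id) { this.id = id; }
}

class Cat extends Animal {
    Cat(long id) { super(id); }
}

// Stand-in for a proxy generated from the declared association type.
class AnimalProxy extends Animal {
    AnimalProxy(long id) { super(id); }
}

// Toy L1 cache: at most one instance per id, so reference equality holds.
class L1CacheSketch {
    private final Map<Long, Animal> instancesById = new HashMap<>();

    // Lazy association: only the declared type (Animal) is known here.
    Animal getReference(long id) {
        return instancesById.computeIfAbsent(id, key -> new AnimalProxy(key));
    }

    // Later load where the row turns out to be a Cat. An instance that is
    // already registered wins, otherwise reference equality would break.
    Animal load(long id) {
        return instancesById.computeIfAbsent(id, key -> new Cat(key));
    }
}
```

Calling `getReference(1L)` and then `load(1L)` returns the same object both times, but that object is not an instance of `Cat`, which is exactly the "pollution" described above.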
...

 What you are suggesting is to eagerly fetch the association. IIUC, a 
 workaround is to map such associations as eager. That way a proxy 
 would not be created, avoiding the problem altogether.

Correct.
...

 IMO, if the entity is mapped as lazy, it really should be 
 lazily-loaded. That said, I think that it would be worthwhile for 
 Hibernate to determine the concrete subclass for the associated entity 
 and create the appropriate proxy. I can think of some alternatives, 
 but I think they would all affect performance. Maybe there should be a 
 property for indicating the strategy to use. Some strategies that come 
 to mind are:

 1) by default, create an uninitialized proxy using the type indicated 
 in the mapping (as is done currently);
 2) try loading from second-level cache (as done in your fix for 
 HHH-10162); if the entity is not found in the cache, then execute a 
 query to determine the associated entity's concrete subclass;
 3) add a join to the original query to determine the associated 
 entity's concrete subclass, and create the appropriate uninitialized 
 proxy.

I roughly thought of the same strategies, but I'd like to propose a 1b) 
strategy that would also check the L2 cache in case of a polymorphic 
association. So maybe the config for whether an L2 lookup should be done 
should be a separate and independent config.
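
A rough sketch of how the strategies and an independent L2 lookup switch could fit together, in plain Java. Everything here is hypothetical: `SubtypeResolution`, `ProxyTypeResolver`, the map standing in for the L2 cache and the `typeQuery` callback are illustration only and do not correspond to existing Hibernate settings or SPIs.

```java
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical strategy names for the options 1) to 3) listed above.
enum SubtypeResolution { DECLARED_TYPE, QUERY_CONCRETE_TYPE, JOIN_IN_ORIGINAL_QUERY }

class ProxyTypeResolver {
    private final SubtypeResolution strategy;
    private final boolean checkSecondLevelCache;      // the independent L2 switch
    private final Map<Long, String> secondLevelCache; // id -> concrete entity name (stand-in)
    private final LongFunction<String> typeQuery;     // stand-in for a query that fetches the type

    ProxyTypeResolver(SubtypeResolution strategy, boolean checkSecondLevelCache,
                      Map<Long, String> secondLevelCache, LongFunction<String> typeQuery) {
        this.strategy = strategy;
        this.checkSecondLevelCache = checkSecondLevelCache;
        this.secondLevelCache = secondLevelCache;
        this.typeQuery = typeQuery;
    }

    // joinedType is the type found by a join in the original query, or null if there was none.
    String resolve(long id, String declaredType, String joinedType) {
        if (checkSecondLevelCache) {
            String cached = secondLevelCache.get(id);
            if (cached != null) {
                return cached; // a cache hit answers the question for any strategy
            }
        }
        switch (strategy) {
            case QUERY_CONCRETE_TYPE:
                return typeQuery.apply(id);
            case JOIN_IN_ORIGINAL_QUERY:
                return joinedType != null ? joinedType : declaredType;
            default:
                return declaredType; // current behaviour
        }
    }
}
```

With this shape, 1b) is simply `DECLARED_TYPE` with the L2 switch turned on, and 2) is `QUERY_CONCRETE_TYPE` with the L2 switch turned on.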
...

 You mentioned that you are working on something. Is this along the 
 lines of what you are thinking? Did you have other ideas?

Just what I wrote up so far.
...

     2. When using bytecode enhancement, the loading of *ToOne
     instances could be deferred to the first access.

 Yes, this is another workaround.

I wouldn't call it a workaround but a performance improvement :)
...

     3. Actually it would also be possible if property access is used
     without bytecode enhancement, but that would require using a
     special proxy for every object that might contain a
     non-initialized polymorphic association. What is special about it
     is that it loads the association on first access.

 I'm not sure I'm following this.

When querying for an entity A with a query like "FROM A", we currently 
always return A instances, although we could return special proxy A 
instances. The special thing about the proxy is that it has all state 
initialized except for polymorphic associations. These associations are 
only loaded from the DB when they are accessed through their respective 
property accessors. Note that this proxy is also something that we can 
use for optional "toParent" one-to-one associations.

The only proxies that we have right now are for uninitialized 
associations which trigger loading the full entity on non-id property 
access. Either we extend the existing proxy implementation by the 
explained functionality, or we create a separate proxy type.
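
Here is a minimal sketch of that "load on first access" behaviour, again in plain Java and without any Hibernate classes. `LazyRef`, `Invoice`, `Payment` and `CardPayment` are hypothetical names, and the `Supplier` stands in for whatever would actually load the association from the DB. A real proxy would be a generated subclass; this only illustrates the access pattern.

```java
import java.util.function.Supplier;

// Holder that runs its loader at most once, on first access.
class LazyRef<T> {
    private Supplier<? extends T> loader;
    private T value;
    private boolean loaded;

    LazyRef(Supplier<? extends T> loader) { this.loader = loader; }

    T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
            loader = null; // the loader is no longer needed
        }
        return value;
    }

    boolean isLoaded() { return loaded; }
}

// Hypothetical polymorphic association target.
class Payment { }
class CardPayment extends Payment { }

// Owner with fully initialized basic state and a deferred polymorphic association.
class Invoice {
    private final String number;
    private final LazyRef<Payment> payment;

    Invoice(String number, Supplier<? extends Payment> paymentLoader) {
        this.number = number;
        this.payment = new LazyRef<>(paymentLoader);
    }

    String getNumber() { return number; }             // does not trigger loading
    Payment getPayment() { return payment.get(); }    // loads the concrete subtype on first call
    boolean isPaymentLoaded() { return payment.isLoaded(); }
}
```

Because the association is only resolved inside the accessor, the concrete subtype is already known when the instance is handed out, so no wrongly typed proxy is ever created for it.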
...

     4. Another thing that we should consider is "merging" of fetched
     state into existing instances of the L1 cache. Imagine a user
     loads an entity X and some of its associations are lazy loaded.
     Then the user sets off a query that would fetch that association.
     Currently the associations will be left untouched because we skip
     any further processing as soon as we encounter the instance for X
     in the L1 cache. What we should actually do is, merge any fetched
     state into the instance if possible.

 +1; I think this is a good improvement.

Note that this improvement kind of requires that there are no wrongly 
typed proxy instances in the L1 cache. We could actually also "just" 
merge the state that we can, i.e. ignoring subtype state.
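
To illustrate the merging idea from 4., a small sketch in plain Java. `OrderEntity` and `MergingContextSketch` are hypothetical, the map stands in for the L1 cache, and a `null` field is used here as a simplification for "association not loaded yet".

```java
import java.util.HashMap;
import java.util.Map;

class OrderEntity {
    final long id;
    String customerName; // null means "not loaded yet" in this sketch

    OrderEntity(long id) { this.id = id; }
}

class MergingContextSketch {
    private final Map<Long, OrderEntity> l1 = new HashMap<>();

    // Called for every result row; fetchedCustomerName is null when the
    // query did not fetch the association.
    OrderEntity onRow(long id, String fetchedCustomerName) {
        OrderEntity existing = l1.get(id);
        if (existing == null) {
            existing = new OrderEntity(id);
            l1.put(id, existing);
        }
        // Instead of skipping an instance that is already known, merge the
        // fetched state into it, but never overwrite state that is loaded.
        if (existing.customerName == null && fetchedCustomerName != null) {
            existing.customerName = fetchedCustomerName;
        }
        return existing;
    }
}
```

The instance returned for a given id is always the same one, so reference equality is kept while the second query still fills in the association.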
...

     Since 1. will probably affect performance, it might only make
     sense to do it along with 2. or 3.

     Regardless of these improvements, I think we should apply the L2
     cache fix and handle the rest later.

     What do you say?

 I think it would be better to come up with a hibernate property and 
 strategies, and allow an application to opt into a strategy, rather 
 than making a change that can affect performance by default.

I agree that users should have the possibility to configure this. The 
question is, should this be a global config or per-entity or maybe even 
per-association? Once all the performance optimizations are implemented, 
I see no reason for adding a per-entity or per-association config; a 
global default should be enough. After all, the additional query for 
loading the subtype will only happen when a user tries to get their hands 
on the instance.

What do you say, how should we proceed? I'd start by creating/collecting 
JIRAs for the issues and posting them here so you can track them.
Next I'd like to improve the proxy implementation as described in 3) as 
that would improve performance in other scenarios too.

Thoughts? Feedback?
