Adding Steve...
On Thu, Mar 16, 2017 at 1:44 PM, Christian Beikov <
christian.beikov(a)gmail.com> wrote:
Hey Gail,
comments inline..
Am 15.03.2017 um 23:21 schrieb Gail Badner:
Hi Christian,
More comments below...
On Fri, Mar 3, 2017 at 4:38 PM, Christian Beikov <
christian.beikov(a)gmail.com> wrote:
> Thanks for the comments, I have updated the PR.
>
> I think that it is important to have a complete fix and I am already
> working on that, but this issue was specifically about the L2 cache, so I
> wanted to keep that stuff separate.
>
> Here are some other polymorphism related issues:
>
> -
https://hibernate.atlassian.net/browse/HHH-9845
> -
https://hibernate.atlassian.net/browse/HHH-11280
> -
https://hibernate.atlassian.net/browse/HHH-4742
> -
https://hibernate.atlassian.net/browse/HHH-2927
>
> Thanks for looking into this. It is definitely and area that can/should
be improved.
> The fix provided by Josh Landin for HHH-11280 would break reference
> equality and I think that is something we all want to keep. Which brings us
> to the actual problems.
>
I agree. We definitely do not want to break reference equality.
>
> 1. Whenever we encounter an association that is of a polymorphic entity
> type, we should actually internally, regardless of what the user
> configures, always fetch that association with subtypes. Otherwise we could
> run into that same problem again, that we "pollute" the L1 cache with a
> proxy of the wrong type. Since we want to keep reference equality, we can't
> replace that instance.
>
IIUC, this is only a problem if the association is with a polymorphic
entity Class that has a subclass. Since a fix would probably affect
performance, I think it would be best to only deal with this particular
case. Maybe you are saying the same thing; I just wanted to clarify.
Yep saying the same thing.
What you are suggesting is to eagerly fetch the association. IIUC, a
workaround is to map such associations as eager. That way a proxy would not
be created, avoiding the problem altogether.
Correct.
IMO, if the entity is mapped as lazy, it really should be lazily-loaded.
That said, I think that it would be worthwhile for Hibernate to determine
the concrete subclass for the associated entity and create the appropriate
proxy. I can think of some alternatives, but I think they would all affect
performance. Maybe there should be a property for indicating the strategy
to use. Some strategies that come to mind are:
1) by default, create an uninitialized proxy using the type indicated in
the mapping (as is done currently);
2) try loading from second-level cache (as done in your fix for
HHH-10162); if the entity is not found in the cache, then execute a query
to determine the associated entity's concrete subclass;
3) add a join to the original query to determine the associated entity's
concrete subclass, and create the appropriate uninitialized proxy.
I roughly thought of the same strategies, but I'd like to propose a 1b)
strategy that would also check the L2 cache in case of a polymorphic
association. So maybe the config for whether a L2 lookup should be done
should be a "separate" and independent config.
You mentioned that you are working on something. Is tihs along the lines
of what you are thinking? Did you have other ideas?
Just what I wrote up so far.
2. When using bytecode enhancement, the loading of *ToOne instances could
> be deferred to the first access.
>
Yes, this is another workaround.
I wouldn't say workaround but performance improvement :)
>
> 3. Actually it would also be possible if property access is used without
> bytecode enhancement, but that would require to use a special proxy for
> every object that might contain a non-initialized polymorphic association.
> The speciality about it is that it loads the association on first access.
>
>
I'm not sure I'm following this.
When querying for an entity A like "FROM A", currently we always return A
instances although we could return special proxy A instances. The special
thing about the proxy is, that it has all state initialized except for
polymorphic associations. These associations are only loaded from the DB
when accessing them through their respective property accessors. Note that
this proxy is also something that we can use for optional "toParent"
one-to-one associations.
The only proxies that we have right now are for uninitialized associations
which trigger loading the full entity on non-id property access. Either we
extend the existing proxy implementation by the explained functionality, or
we create a separate proxy type.
> 4. Another thing that we should consider is "merging" of fetched state
> into existing instances of the L1 cache. Imagine a user loads an entity X
> and some of it's associations are lazy loaded. Then the user sets of a
> query that would fetch that association. Currently the associations will be
> left untouched because we skip any further processing as soon as we
> encounter the instance for X in the L1 cache. What we should actually do
> is, merge any fetched state into the instance if possible.
>
+1; I think this is a good improvement.
Note that this improvement kind of requires that there are no wrongly
typed proxy instance in the L1 cache. We could actually also "just" merge
the state that we can i.e. ignoring subtype state.
> Since 1. will probably affect performance, it might only make sense to do
> it along with 2. or 3.
>
> Regardless of these improvements, I think we should apply the L2 cache
> fix and handle the rest later.
>
> What do you say?
>
I think it would be better to come up with a hibernate property and
strategies, and allow an application to opt into a strategy, rather than
making a change that can affect performance by default.
I agree that users should have the possibility to configure this. The
question is, should this be a global config or per-entity or maybe even
per-association? When all the performance optimizations are implemented I
see no reason for adding a per-entity or per-association config, a global
default should be enough. After all, the additional query for loading the
subtype will only happen when a user tries to get his hands on the instance.
What do you say, how should we proceed? I'd start by creating/collecting
JIRAs for the issues and posting them here so you can track them.
Next I'd like to improve the proxy implementation as described in 3) as
that would improve performance in other scenarios too.
Thought? Feedback?