[hibernate-dev] [HV] Memory footprint improvements

Mon Jul 24 10:43:59 EDT 2017

Hi all,

One general challenge for HV is that we cannot easily determine the set of
validated classes upfront. Neither is there means of listing all validated
types explicitly (akin to JPA's <class> in persistence.xml), nor is it
sufficient to check classes for a limited set of well-known annotations to
identify the relevant classes (as on of BV's premises is support for custom
constraint types).

Building up the meta-data eagerly would avoid retaining memory for the "raw
model", on the other hand it means we build up instances in the "aggregated
model" for classes which potentially are never validated. So I'm not really
sure whether that's better. But if we decide to build up the metadata
eagerly, this should be based on Jandex IMO, otherwise we'd end up with
re-implementing lots of that ourselves.

On specific question would be how to limit the scope of such scan, as there
isn't the notion of something similar to "persistence archives" in JPA. But
then we wouldn't want to scan all the JARs on the classpath. Limiting it to
the classes of the current deployment seems reasonable in WildFly. But in
SE (say when using HV with Spring on Tomcat) we'd need a way to express
which JARs to scan.

> So my vote would go for 1/ and I would accept the additional startup cost.

+1, caching instances of the raw model seems not too useful, e.g. typically
people won't derive dozens of sub-types from one base class in their model
(which is the case where the current caching would be beneficial, as the
metadata for the base class would only be retrieved once).

A variation of the current scheme could be to process all types of a
hierarchy upon first validation of any type in that hierarchy. Say we have
a class A with sub-classes B and C. When validating B for the first time,
we'd build up the metadata for A, B and C, caching the raw metadata for A
just in the context of this call. This requires though to reliably identify
all sub-classes of A, which cannot be done using standard reflection.
Jandex could help (though we cannot keep the index at runtime, so we'd have
to retrieve and store just the inheritance relationships from the index).

> I once thought about completely removing the method metadata if the method
> wasn't annotated at all but then I thought that the overriding rules would
> prevent us to do that.
>
> Gunnar, am I right on this?

The overriding rules are checked when merging the raw metadata into the
aggregated model, so indeed I'd think we could omit that method metadata
from the aggregated model in this case. Definitely worth exploring. Same
for property metadata. I've filed
https://hibernate.atlassian.net/browse/HV-1447 for looking into this. Calls
to BeanMetaData#getMetaDataFor() could return an empty placeholder
implementation then.

Another thing I've been pondering on is the implementation of the public
metamodel (defined by BV API). Currently that's independent from our own
(aggregated) metamodel and thus allocates another bit of memory. We could
try and implement the BV metamodel interfaces directly with our own
metamodel model types. E.g. BeanMetaData#getBeanDescriptor() would then
simply return itself. This should save some memory currently allocated for
the nodes of the external model.

--Gunnar

2017-07-23 22:23 GMT+02:00 Guillaume Smet <guillaume.smet at gmail.com>:

> On Sat, Jul 22, 2017 at 2:44 PM, Sanne Grinovero <sanne at hibernate.org>
> wrote:
>
> > I'm not familiar enough with the whole picture but I strongly suspect you
> > should explore ways to get out of this lazy initialization strategy.
> >
>
> We have a Jandex POC but it's far from being ready for prime time.
>
> Not something we will be able to tackle soon.
>
>
> > Maybe keep building it lazily during bootstrap(s) but then add a "drop
> > cached metadata" hook which containers could invoke explicitly at the end
> > of bootstrap of an app?
> > Worst case you'll have to rebuild the metadata on demand.
> >
>
> Currently, at the end of the bootstrap of an app, you might have not
> validated a bean at all, so you don't have any metadata. But I suppose you
> were suggesting that we would have tackled the "load metadata lazily"
> subject before that. If so, of course, we wouldn't have to keep this
> metadata anymore.
>
>
> > So, here, we have to find a compromise:
> >
> > 1/ either favor the memory footprint but the annotation of a given class
> > could be analyzed several times in the case of a class hierarchy
> > 2/ or favor the startup cost (it's not a runtime cost, it's paid only
> once
> > when validating the first bean of a given class) and have a higher memory
> > footprint
> >
> >
> > I guess my proposal above is 3/, trying to have both benefits.
> >
>
> Yeah, not something we can fix soon.
>
>
> > Usually, Yoann and I are on the "Let's do it" side and the others on the
> > "I'm not sure it's worth it" when it comes to CollectionHelper, but
> > considering how well the first round has paid, I think we should at least
> > give it a try.
> >
> >
> > I'm also quite sure it's worth applying such optimisations.
> > I'm only skeptical about the value of sharing such code via shared
> > libraries.
> >
> > I'd even go a step further : try avoiding wrapping into immutable when
> > those collections are exposed exclusively to code we directly control.
> The
> > JIT can do much magic but it won't avoid allocating the wrappers so
> that's
> > not cheap at all, but that's of course a maintenance / clean code /
> > performance tradeoff.
> > Would be great to validate automatically that we treat them as
> effectively
> > immutable, maybe that's possible via annotations and some code validation
> > tools?
> >
>
> Yeah, I think the wrapping is here to stay for now.
>
>
> >
> > I once thought about completely removing the method metadata if the
> method
> > wasn't annotated at all but then I thought that the overriding rules
> would
> > prevent us to do that.
> >
> > Gunnar, am I right on this?
> >
> > So basically, I think we can't really do anything about this.
> >
> >
> > Drop it as soon as we figure it's not useful?
> >
>
> Unfortunately, as we load the metadata lazily, they can be useful at any
> time. That's the issue.
>
>
> >
> > I also thought that it was useless to look for annotations on
> > java.lang.Object but then again I think the overriding rules force us to
> do
> > so.
> >
> >
> > I'm not following here. Why is it not safe to skip annotations on Object?
> >
>
> There are some overriding rules you have to follow in BV.
>
> For instance:
>
>
> *If a sub type overrides/implements a method originally defined in several
> parallel types of the hierarchy (e.g. two interfaces not extending each
> other, or a class and an interface not implemented by said class), no
> parameter constraints may be declared for that method at all nor parameters
> be marked for cascaded validation.*
> So, you can't simply withdraw the metadata because there are no HV
> information on the methods. Just the fact that a method is here has
> consequences.
>
> --
> Guillaume
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>