[hibernate-dev] [HV] Memory footprint improvements

Sat Jul 22 08:44:51 EDT 2017

On 19 Jul 2017 18:00, "Guillaume Smet" <guillaume.smet at gmail.com> wrote:

Hi,

Stuart Douglas (in CC) opened an interesting issue here:
https://hibernate.atlassian.net/browse/HV-1437

I already made some good progress here:
https://github.com/hibernate/hibernate-validator/pull/814
but I would appreciate some feedback on a few additional things.

## AnnotationMetaDataProvider#configuredBeans

So, in AnnotationMetaDataProvider, we keep a cache of the annotation
metadata found while building the aggregated BeanMetaData.

It might be useful if you have a class hierarchy for instance as you would
only build the annotation of the parent class once.

But... as we initialize the bean metadata lazily, we need to keep this
cache during the whole lifecycle of the application.

I'm not familiar enough with the whole picture but I strongly suspect you
should explore ways to get out of this lazy initialization strategy.

Maybe keep building it lazily during bootstrap(s) but then add a "drop
cached metadata" hook which containers could invoke explicitly at the end
of bootstrap of an app?
Worst case you'll have to rebuild the metadata on demand.

So, here, we have to find a compromise:
1/ either favor the memory footprint but the annotation of a given class
could be analyzed several times in the case of a class hierarchy
2/ or favor the startup cost (it's not a runtime cost, it's paid only once
when validating the first bean of a given class) and have a higher memory
footprint

I guess my proposal above is 3/, trying to have both benefits.

Usually, in HV, we take the 1/ route so I was a bit surprised 2/ was chosen
here.

Thoughts?

## Collection resizing

So, interestingly, enough, the small CollectionHelper#toImmutable*
utilities I introduced make quite a difference if there are a lot of
empty/one element collections - which is the general case in Stuart's
example.

The idea of these utilities is quite simple: instead of blindly wrapping a
collection using Collections#unmodifiable* to make a collection immutable,
we test the size of the collection and we return a static empty collection
or a singleton collection if the size is 0 or 1 respectively.

So this is nice when the size is 0 or 1 but considering that HV metadata
structures are mostly immutable once they are initialized, I would like to
go a step further and create a collection of the right size when making it
immutable in all the cases.

We don't use that many collection types and we could default to the current
wrapping method if the collection is not ArrayList, HashSet, LinkedHashSet,
HashMap and maybe LinkedHashMap.

Basically, for sets, we would have:
    public static <T> Set<T> toResizedImmutableSet(Set<? extends T> set) {
        switch ( set.size() ) {
            case 0:
                return Collections.emptySet();
            case 1:
                return Collections.singleton( set.iterator().next() );
            default:
                if ( set instanceof LinkedHashSet ) {
                    LinkedHashSet<T> copy = new LinkedHashSet<T>(
set.size(), 1 );
                    copy.addAll( set );
                    return Collections.unmodifiableSet( copy );
                }
                else if ( set instanceof HashSet ) {
                    HashSet<T> copy = new HashSet<T>( set.size(), 1 );
                    copy.addAll( set );
                    return Collections.unmodifiableSet( copy );
                }
                else {
                    return Collections.unmodifiableSet( set );
                }
        }
    }

It's one more memory allocation but I think reducing the runtime memory
footprint would be worth it. Especially for data structures that are very
common (think of the executable metadata, the parameter metadata and so on).

I'm not sure I would generalize it to all our projects but I think it makes
sense for HV where nearly everything is immutable and we have quite a lot
of Maps and Sets in the metadata.

Usually, Yoann and I are on the "Let's do it" side and the others on the
"I'm not sure it's worth it" when it comes to CollectionHelper, but
considering how well the first round has paid, I think we should at least
give it a try.

I'm also quite sure it's worth applying such optimisations.
I'm only skeptical about the value of sharing such code via shared
libraries.

I'd even go a step further : try avoiding wrapping into immutable when
those collections are exposed exclusively to code we directly control. The
JIT can do much magic but it won't avoid allocating the wrappers so that's
not cheap at all, but that's of course a maintenance / clean code /
performance tradeoff.
Would be great to validate automatically that we treat them as effectively
immutable, maybe that's possible via annotations and some code validation
tools?

Thoughts?

## Metadata, metadata, metadata

So, we have a lot of metadata. A. Lot. Especially, we have metadata even
when there are no HV metadata at all.

Think of Keycloak's RealmEntity (
https://github.com/keycloak/keycloak/blob/master/model/
jpa/src/main/java/org/keycloak/models/jpa/entities/RealmEntity.java),
it makes a lot of metadata for no useful information.

It's especially the method metadata that are heavy, even if I tried to
reduce them to the minimum in this case.

I once thought about completely removing the method metadata if the method
wasn't annotated at all but then I thought that the overriding rules would
prevent us to do that.

Gunnar, am I right on this?

So basically, I think we can't really do anything about this.

Drop it as soon as we figure it's not useful?

I also thought that it was useless to look for annotations on
java.lang.Object but then again I think the overriding rules force us to do
so.

I'm not following here. Why is it not safe to skip annotations on Object?

That's it for today.

Comments welcome.

--
Guillaume
_______________________________________________
hibernate-dev mailing list
hibernate-dev at lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev