[hibernate-dev] [HV] Memory footprint improvements

Wed Jul 19 09:19:54 EDT 2017

Hi,

Stuart Douglas (in CC) opened an interesting issue here:
https://hibernate.atlassian.net/browse/HV-1437

I already made some good progress here:
https://github.com/hibernate/hibernate-validator/pull/814
but I would appreciate some feedback on a few additional things.

## AnnotationMetaDataProvider#configuredBeans

So, in AnnotationMetaDataProvider, we keep a cache of the annotation
metadata found while building the aggregated BeanMetaData.

It might be useful if you have a class hierarchy for instance as you would
only build the annotation of the parent class once.

But... as we initialize the bean metadata lazily, we need to keep this
cache during the whole lifecycle of the application.

So, here, we have to find a compromise:
1/ either favor the memory footprint but the annotation of a given class
could be analyzed several times in the case of a class hierarchy
2/ or favor the startup cost (it's not a runtime cost, it's paid only once
when validating the first bean of a given class) and have a higher memory
footprint

Usually, in HV, we take the 1/ route so I was a bit surprised 2/ was chosen
here.

Thoughts?

## Collection resizing

So, interestingly, enough, the small CollectionHelper#toImmutable*
utilities I introduced make quite a difference if there are a lot of
empty/one element collections - which is the general case in Stuart's
example.

The idea of these utilities is quite simple: instead of blindly wrapping a
collection using Collections#unmodifiable* to make a collection immutable,
we test the size of the collection and we return a static empty collection
or a singleton collection if the size is 0 or 1 respectively.

So this is nice when the size is 0 or 1 but considering that HV metadata
structures are mostly immutable once they are initialized, I would like to
go a step further and create a collection of the right size when making it
immutable in all the cases.

We don't use that many collection types and we could default to the current
wrapping method if the collection is not ArrayList, HashSet, LinkedHashSet,
HashMap and maybe LinkedHashMap.

Basically, for sets, we would have:
    public static <T> Set<T> toResizedImmutableSet(Set<? extends T> set) {
        switch ( set.size() ) {
            case 0:
                return Collections.emptySet();
            case 1:
                return Collections.singleton( set.iterator().next() );
            default:
                if ( set instanceof LinkedHashSet ) {
                    LinkedHashSet<T> copy = new LinkedHashSet<T>(
set.size(), 1 );
                    copy.addAll( set );
                    return Collections.unmodifiableSet( copy );
                }
                else if ( set instanceof HashSet ) {
                    HashSet<T> copy = new HashSet<T>( set.size(), 1 );
                    copy.addAll( set );
                    return Collections.unmodifiableSet( copy );
                }
                else {
                    return Collections.unmodifiableSet( set );
                }
        }
    }

It's one more memory allocation but I think reducing the runtime memory
footprint would be worth it. Especially for data structures that are very
common (think of the executable metadata, the parameter metadata and so on).

I'm not sure I would generalize it to all our projects but I think it makes
sense for HV where nearly everything is immutable and we have quite a lot
of Maps and Sets in the metadata.

Usually, Yoann and I are on the "Let's do it" side and the others on the
"I'm not sure it's worth it" when it comes to CollectionHelper, but
considering how well the first round has paid, I think we should at least
give it a try.

Thoughts?

## Metadata, metadata, metadata

So, we have a lot of metadata. A. Lot. Especially, we have metadata even
when there are no HV metadata at all.

Think of Keycloak's RealmEntity (
https://github.com/keycloak/keycloak/blob/master/model/jpa/src/main/java/org/keycloak/models/jpa/entities/RealmEntity.java),
it makes a lot of metadata for no useful information.

It's especially the method metadata that are heavy, even if I tried to
reduce them to the minimum in this case.

I once thought about completely removing the method metadata if the method
wasn't annotated at all but then I thought that the overriding rules would
prevent us to do that.

Gunnar, am I right on this?

So basically, I think we can't really do anything about this.

I also thought that it was useless to look for annotations on
java.lang.Object but then again I think the overriding rules force us to do
so.

That's it for today.

Comments welcome.

-- 
Guillaume