Independently from which one we read from first, won't there
always be
a risk of forgetting the other input processing blocks?
With that in mind: yes, even if what I propose, there will still be a risk
of forgetting to add a feature to one API or the other, but I think the
risk will be lower if we revert how it works, because then testing the
annotations mode will also test the programmatic API. And most of the time,
our tests use annotations, not the programmatic API.
1# F({Annotations, ProgrammaticAPI}) -> Actual applied
Metadata.
2# Z({Free Form}) -> Actual applied Metadata & custom walkers.
My point being that the two operations F() and Z() are intrinsically
based on both different input APIs and requirements.
I suppose it all comes down to how different we think those metadata
providers will be. Bear in mind that currently, the
AnnotationMetadataProvider does more than just translate annotations to
metadata; it performs some validity checks and handles additional features
such as IndexedEmbedded depth limits, analyzer building, etc.
I concur that the strategy to "simulate" annotations looks
weird, but
it's working fine, while such a significant refactoring poses various
risks (both our time and likely regressions) for a benefit I'm not
convinced of ;)
Let's do some convincing then :) Some facts:
- The source code of AnnotationMetadataProvider is 2000 lines long. Two
thousands. Twenty times a hundred.
- AnnotationMetadataProvider is the file that appears in the most
commits in the git history, after Log (source: git log --pretty=format:
--name-only | grep '.*\.java' | sort | uniq -c | sort -rg | head -20).
Thus this is probably the file we changed the most in the past.
- We've already planned to make several changes on 6.0 that will
impact AnnotationMetadataProvider quite heavily, thus regressions are
already a problem we'll have to tackle.
I am mainly talking about these, but I'm sure there are others:
- Making metadata themselves a bit safer (true immutability for instance)
- Implement some kind of walker for document building.
- FieldBridge 2.0 (HSEARCH-2055)
- and splitting internal metadata between generic metadata and
backend-specific (lucene, ES) metadata (HSEARCH-2462).
So I won't pretend that changing this file poses no risk at all, but rather
the opposite: if there is any file that poses a threat, it's this one.
Wouldn't it be great if we could reduce this risk from 6.0 onwards? For
example, by splitting it up, moving one responsibility (the interpretation
of annotations) to a different component?
Currently we have this flow of information:
- Programmatic API => o.h.annotations.common.reflection.MetadataProvider
=> o.h.search.engine.metadata.impl.AnnotationMetadataProvider => Hibernate
Search
- or Annotated classes
=> o.h.annotations.common.reflection.java.JavaMetadataProvider =>
o.h.search.engine.metadata.impl.AnnotationMetadataProvider => Hibernate
Search
(yes, we have two types of metadata providers, one coming from
hibernate-annotations and the other being our own...)
What I'm proposing is:
- Programmatic Mapping API =>
o.h.search.engine.metadata.impl.SomeNewMetadataProvider => Hibernate Search
- or Annotated classes =>
o.h.annotations.common.reflection.java.JavaMetadataProvider => Some
annotation walker => Programmatic Mapping API =>
o.h.search.engine.metadata.impl.SomeNewMetadataProvider => Hibernate Search
- or Free-form mapping API => Programmatic Mapping API =>
o.h.search.engine.metadata.impl.SomeNewMetadataProvider => Hibernate Search
The key here is "Some annotation walker": it will take *some* of the
complexity out of the metadata provider. For instance all the reflection
part (iteration through fields and getters). Or field resolution for
annotations such as NumericField(forField), or Facet(forField). Or JPA
annotation handling.
Note that, in order to more easily introduce features specific to one type
of mapping or another, and to be more independent from user APIs, we may
ultimately introduce some internal API:
- Programmatic Mapping API => Internal Mapping API =>
o.h.search.engine.metadata.impl.SomeNewMetadataProvider => Hibernate Search
- or Annotated classes =>
o.h.annotations.common.reflection.java.JavaMetadataProvider => Somme
annotation walker => Internal Mapping API =>
o.h.search.engine.metadata.impl.SomeNewMetadataProvider => Hibernate Search
- or Free-form mapping API => Internal Mapping API =>
o.h.search.engine.metadata.impl.SomeNewMetadataProvider => Hibernate Search
But we'd run in the same problem we have now, which is we sometimes forget
to add features to the programmatic API. At least we would have improved
the metadata provider, I guess.
We don't have to do this now, mind you, but I expect it to be very clearly
necessary when we'll introduce free-form support. Maybe we could just agree
that ideally, that's what we should do? Then we'll see how much the
metadata provider has changed when we'll implement free-form support, and
decide whether one more change is more risky or not.
Yoann Rodière <yoann(a)hibernate.org>
Hibernate NoORM Team
On 27 March 2017 at 12:39, Sanne Grinovero <sanne(a)hibernate.org> wrote:
> I don't mean this to be a strict "No", but I'm not convinced for
now
> on the benefits such an exercise in refactoring brings..
>
> In a nutshell the goal is reading from {Annotations, ProgrammaticAPI}
> -> Actual applied Metadata.
>
Independently from which one we read from first, won't there
always be
a risk of forgetting the other input processing blocks?
>
> Having one transformed into the other - like it is today - at least
> allows to make sure the interpretation of metadata is very consistent
> (when we don't forget about implementing support for an annotation).
> Keep in mind that we don't really create annotations: we just make
> sure to behave like they existed.
>
> You make a good point on free-form. We'll certainly need another
> "input -> actually applied Metadata", but in that case there's no
need
> to be consistent with how annotations would be applied; in fact quite
> the opposite: I expect it to be beneficial to be able to do some
> things in a different way.
>
> So I DON'T expect this to evolve into:
> F({Annotations, ProgrammaticAPI, Free Form}) -> Actual applied
> Metadata.
> But rather two different operating modes:
> 1# F({Annotations, ProgrammaticAPI}) -> Actual applied Metadata.
> 2# Z({Free Form}) -> Actual applied Metadata & custom walkers.
>
> My point being that the two operations F() and Z() are intrinsically
> based on both different input APIs and requirements.
> While the purpose of the current Programmatic API, the fact that the
> working is very similar to annotations also has the side-effect of
> making the API intuitive for people who wish they could have added the
> annotation, but can't for some reason and have to fallback to this
> API: the picture they have in mind about their task is still similar
> to "add this annotation over there".
>
I concur that the strategy to "simulate" annotations looks
weird, but
it's working fine, while such a significant refactoring poses various
risks (both our time and likely regressions) for a benefit I'm not
convinced of ;)
>
> Thanks,
> Sanne
>
> On 27 March 2017 at 09:06, Yoann Rodiere <yoann(a)hibernate.org> wrote:
> > Hello,
> >
> > Currently, the medata definition mechanisms in Search work this way:
> >
> > - the primary way to define metadata is using annotations
> > - the secondary way to define metadata is programmatic, *but it only
> > instantiates annotations,* simulating annotated entities
> > - classes needing to access those "low-level" metadata
> > (AnnotationMetadataProvider in particular) only manipulate annotations
> >
> > I'm wondering if we should change that, flipping the metadata definition
> > upside-down: the programmatic definition would be the main one, with a
> > clean, annotation-free low-level metadata model, and annotations would
> > merely be translated to this low-level metadata using a walker and the
> > programmatic API.
> >
> > My first argument is that creating "simulated" annotations is rather
odd,
> > but I'll grant you it's hardly receivable.
> > But I can see other, more objective reasons:
> >
> > - We periodically notice missing methods in the programmatic API ([1],
> > [2], [3], [4]), because we thought "annotations first" and forgot
> about the
> > programmatic API. If annotation processing was to rely on programmatic
> > mapping, this "annotations first" thinking would not be a problem
> anymore,
> > but rather a good thing: we would have to implement both the
> programmatic
> > API and the annotations in order to make it work.
> > - If we want to support programmatic mapping for "free-form" (i.e.
> > non-POJO) entities, we will need to be more generic than what
> annotations
> > allow at some point. We already spotted the problem of using
> "Class<?>" to
> > denote entity types, but there may be more. For instance denoting
> property
> > identifiers, or property types, ... It just doesn't feel future-proof
> to
> > rely on an intrinsically Java way of modeling metadata (the
> annotations)
> > and try to model non-Java things with it...
> >
> > What do you think? Are there any objections to making the programmatic
> API
> > the primary way to define metadata? Note that I'm not talking about
> making
> > users use it in priority (it won't change anything for them), just about
> > making it more central in our architecture.
> >
> >
> > |1]
> >
http://stackoverflow.com/questions/43006746/hibernate-
> search-5-2-programmatic-configuration-of-facet-field
> > [2]
https://hibernate.atlassian.net/browse/HSEARCH-1764
> > [3]
https://hibernate.atlassian.net/browse/HSEARCH-2199
> > [4]
https://hibernate.atlassian.net/browse/HSEARCH-1079
> >
> > Yoann Rodière <yoann(a)hibernate.org>
> > Hibernate NoORM Team
> > _______________________________________________
> > hibernate-dev mailing list
> > hibernate-dev(a)lists.jboss.org
> >
https://lists.jboss.org/mailman/listinfo/hibernate-dev
>