Hi,
2015-08-12 17:46 GMT+02:00 Sanne Grinovero <sanne(a)hibernate.org>:
That's an interesting proposal, as index sharing inherently implies
that fields on different types shall not have conflicting mapping
(i.e. don't reuse the same field name for a different type).
By default we don't share indexes across unrelated types, but also *by
default* subtypes are indexed in the same index as their parent - if
the parent is indexed as well.
Yes, I think that's the case where it makes sense. It'd make sense to
re-phrase the docs in that regard.
The reason is to efficiently map a polymorphic domain: when people
search for type X, they implicitly also search for its subtypes as
these are valid candidates for the query.
Having them all in the same index makes for better result quality and
better search performance - as joining multiple IndexReaders to
perform a cross-index Query is generally a bad idea, as it's then
hard to accurately normalize statistics across different vector
spaces, and that's what defines the quality of the search result.
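
To illustrate the point about statistics, here is a minimal sketch in plain Java (no Lucene API involved). It assumes the classic TF-IDF style inverse document frequency, 1 + ln(numDocs / (docFreq + 1)); the document counts and the term are made up for illustration, and real scoring involves more factors than this one.

```java
public class IdfSketch {

    /** Classic TF-IDF style inverse document frequency (assumed formula). */
    public static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        // Hypothetical statistics for the same term in two separate indexes.
        long parentDocs = 1000, parentHits = 9;   // large index, term is rare
        long subtypeDocs = 10, subtypeHits = 9;   // small index, term is common

        double idfParent = idf(parentHits, parentDocs);
        double idfSubtype = idf(subtypeHits, subtypeDocs);
        // One shared index: a single set of statistics for both types.
        double idfShared = idf(parentHits + subtypeHits, parentDocs + subtypeDocs);

        System.out.println("parent index:  " + idfParent);
        System.out.println("subtype index: " + idfSubtype);
        System.out.println("shared index:  " + idfShared);
    }
}
```

The same term gets a different weight in each separate index, so scores computed per index are not directly comparable; with one shared index there is only one weight to begin with.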
At least I believe that *generally* that would give you better
results, but that's why we give options, and also why sometimes people
might want multiple Domain objects to be stored in the same index:
they might be "subtypes" from a domain perspective even if they don't
technically use inheritance at the Java level: they might be different
types and yet be mapped to some common fields with (hopefully)
compatible indexing options.
Have you ever seen this as an actual requirement by someone?
If we were to drop index sharing, then I think it should be fair to
also not support multiple types as target for a query anymore; as I'm
assuming in this case you'd only share for subtypes of some common
parent, and you'd target that common parent exclusively to perform a
polymorphic query.
Assuming we'd drop index sharing for unrelated types but would
continue to support it for the types of one inheritance hierarchy, one
still might want results only from a sub-set of the hierarchy's types.
So those are the reasons it exists; there are some good reasons
to not allow it too: the filtering, as you mention, but also the very
fact that the type information has to be stored in the form of a
classname (typename, in free-form).
Interestingly, that's not so much an issue with ES. There you always
add a "type" discriminator.
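
To make the filtering cost concrete, here is a minimal sketch in plain Java of a shared index in which every entry carries a type discriminator. It does not use the Hibernate Search, Lucene or ES APIs; the `Doc` class, the `search` method and the Animal/Dog/Cat types are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SharedIndexSketch {

    /** One entry of the shared index: a type discriminator plus a single text field. */
    public static final class Doc {
        final String type;   // discriminator, e.g. the entity class name
        final String title;

        public Doc(String type, String title) {
            this.type = type;
            this.title = title;
        }
    }

    /** Returns the titles containing the term, restricted to the given target types. */
    public static List<String> search(List<Doc> index, String term, String... targetTypes) {
        Set<String> allowed = new HashSet<>(Arrays.asList(targetTypes));
        List<String> hits = new ArrayList<>();
        for (Doc doc : index) {
            // The type check is the extra filter that a shared index requires.
            if (allowed.contains(doc.type) && doc.title.contains(term)) {
                hits.add(doc.title);
            }
        }
        return hits;
    }

    /** Hypothetical hierarchy: Dog and Cat are subtypes of Animal, all in one index. */
    public static List<Doc> sampleIndex() {
        return Arrays.asList(
                new Doc("Animal", "generic animal guide"),
                new Doc("Dog", "dog training guide"),
                new Doc("Cat", "cat care guide"));
    }

    public static void main(String[] args) {
        List<Doc> index = sampleIndex();
        // Polymorphic query: the parent and all of its subtypes.
        System.out.println(search(index, "guide", "Animal", "Dog", "Cat"));
        // Only a sub-set of the hierarchy's types.
        System.out.println(search(index, "guide", "Dog", "Cat"));
    }
}
```

Targeting a sub-set of the hierarchy is then just a narrower set of allowed discriminator values, which is why multiple target types remain useful even if sharing were limited to one inheritance hierarchy.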
I think the strongest reason to not allow it is to avoid the
inconsistent field mappings, but we could compensate for that with
better schema validation - something which seems to be getting more
necessary anyway.
Yes, that'd help. All in all, index sharing for inheritance hierarchies
makes sense to me, but I am doubtful about sharing between unrelated
types.
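
As a rough idea of what such schema validation could look like, here is a minimal sketch in plain Java. It is not an existing Hibernate Search feature or API; the `findConflicts` method, the map-based mapping description and the Book/Order types are assumptions made up for illustration. It reports every field name that two types sharing an index declare with different field types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaValidationSketch {

    /**
     * mappings: entity type name -> (field name -> field type), for all types
     * mapped to the same index. Returns one message per conflicting declaration.
     */
    public static List<String> findConflicts(Map<String, Map<String, String>> mappings) {
        Map<String, String> firstType = new LinkedHashMap<>();   // field -> first declared field type
        Map<String, String> firstOwner = new LinkedHashMap<>();  // field -> entity that declared it first
        List<String> conflicts = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entity : mappings.entrySet()) {
            for (Map.Entry<String, String> field : entity.getValue().entrySet()) {
                String previous = firstType.putIfAbsent(field.getKey(), field.getValue());
                if (previous == null) {
                    firstOwner.put(field.getKey(), entity.getKey());
                } else if (!previous.equals(field.getValue())) {
                    conflicts.add("field '" + field.getKey() + "' is '" + previous + "' in "
                            + firstOwner.get(field.getKey()) + " but '" + field.getValue()
                            + "' in " + entity.getKey());
                }
            }
        }
        return conflicts;
    }

    /** Hypothetical unrelated types sharing one index, both declaring a "price" field. */
    public static Map<String, Map<String, String>> sampleMappings() {
        Map<String, String> book = new LinkedHashMap<>();
        book.put("title", "text");
        book.put("price", "numeric");
        Map<String, String> order = new LinkedHashMap<>();
        order.put("price", "text");      // conflicts with Book.price
        order.put("customer", "text");
        Map<String, Map<String, String>> mappings = new LinkedHashMap<>();
        mappings.put("Book", book);
        mappings.put("Order", order);
        return mappings;
    }

    public static void main(String[] args) {
        for (String conflict : findConflicts(sampleMappings())) {
            System.out.println(conflict);
        }
    }
}
```

A check like this at bootstrap would catch the conflicting-mapping case for any shared index, whether the types are related by inheritance or not.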
I didn't mean to kill the proposal :) just hoping it helps figure out
why someone might need it. Would be nice to think of alternatives out
of the box to avoid the filtering.
Sanne
--Gunnar
On 12 August 2015 at 15:30, Gunnar Morling <gunnar(a)hibernate.org> wrote:
> Hibernate Search aficionados,
>
> I am wondering what the rationale is for offering the feature of
> index sharing [1].
>
> The ref guide says "there is really not much benefit in sharing
> indexes". It complicates queries, as an additional filter on the type
> field must be applied in case of targeting only one entity using a
> shared index.
>
> Should we consider dropping this feature in HS 6?
>
> Thanks,
>
> --Gunnar
>
> [1] https://docs.jboss.org/hibernate/search/5.4/reference/en-US/html_single/#...
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev(a)lists.jboss.org
>
https://lists.jboss.org/mailman/listinfo/hibernate-dev