[hibernate-dev] [HSEARCH] Usefulness of index sharing

Sanne Grinovero sanne at hibernate.org
Wed Aug 12 11:46:12 EDT 2015


That's an interesting proposal, as index sharing inherently implies
that fields on different types must not have conflicting mappings
(i.e. don't reuse the same field name for a different type).

By default we don't share indexes across unrelated types, but also *by
default* subtypes are indexed in the same index as their parent - if
the parent is indexed as well.

The reason is to efficiently map a polymorphic domain: when people
search for type X, they implicitly also search for its subtypes as
these are valid candidates for the query.
Having them all in the same index makes for better result quality and
better search performance: joining multiple IndexReaders to perform a
cross-index Query is generally a bad idea, because it's then hard to
accurately normalize statistics across different vector spaces, and
that's what defines the quality of the search result.
At least I believe that *generally* that would give you better
results, but that's why we give options, and also why sometimes people
might want multiple Domain objects to be stored in the same index:
they might be "subtypes" from a domain perspective even if they don't
technically use inheritance at the Java level: they might be different
types and yet be mapped to some common fields with (hopefully)
compatible indexing options.
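
To illustrate the polymorphic case, here is a minimal plain-Java sketch
(it does not use the Hibernate Search API) of a shared index in which
each document stores its class name, and a query targeting a type accepts
that type plus its indexed subtypes. The Animal, Dog and Invoice classes,
the Doc holder and both methods are hypothetical names chosen only for
this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SharedIndexSketch {
    // Hypothetical domain: Dog is a subtype of Animal, Invoice is unrelated.
    static class Animal {}
    static class Dog extends Animal {}
    static class Invoice {}

    // Stand-in for an index document: it only carries the stored type name and a label.
    static class Doc {
        final String typeName;
        final String label;

        Doc(Class<?> type, String label) {
            this.typeName = type.getName();
            this.label = label;
        }
    }

    // Collects the class names a query on `target` should accept:
    // the target itself (if indexed) and every indexed subtype.
    static Set<String> acceptedTypeNames(Class<?> target, List<Class<?>> indexedTypes) {
        Set<String> names = new HashSet<>();
        for (Class<?> type : indexedTypes) {
            if (target.isAssignableFrom(type)) {
                names.add(type.getName());
            }
        }
        return names;
    }

    // Applies the type filter to the shared index and returns the labels of matching documents.
    static List<String> search(List<Doc> sharedIndex, Set<String> acceptedTypes) {
        List<String> hits = new ArrayList<>();
        for (Doc doc : sharedIndex) {
            if (acceptedTypes.contains(doc.typeName)) {
                hits.add(doc.label);
            }
        }
        return hits;
    }
}
```

In this sketch, targeting Animal matches the Animal and Dog documents and
skips Invoice, while targeting Dog alone relies on the type filter to
exclude the other documents living in the same index.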

If we were to drop index sharing, then I think it would be fair to
also stop supporting multiple types as the target of a query: I'm
assuming that in this case you'd only share an index among subtypes of
some common parent, and you'd target that common parent exclusively to
perform a polymorphic query.

So those are the reasons it exists; there are some good reasons
to not allow it too: the filtering, as you mention, but also the very
fact that the type information has to be stored in the form of a class
name (a type name, in free form).
I think the strongest reason to not allow it is to avoid
inconsistent field mappings, but we could compensate for that with
better schema validation - something which seems to be getting more
necessary anyway.
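
As a rough idea of what such a validation could check, the plain-Java
sketch below reports field names that two entity types sharing an index
map with different field types. It is only an assumption of how a check
might look, not an existing Hibernate Search feature; the class name,
the method name and the string-based representation of mappings are all
hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SharedIndexSchemaCheck {
    // Input: entity name -> (field name -> field type), for all entities sharing one index.
    // Output: one message per field that is mapped with a different type than first seen.
    static List<String> findConflicts(Map<String, Map<String, String>> mappingsByEntity) {
        Map<String, String> firstType = new LinkedHashMap<>();   // field name -> first type seen
        Map<String, String> firstOwner = new LinkedHashMap<>();  // field name -> entity declaring it first
        List<String> conflicts = new ArrayList<>();

        for (Map.Entry<String, Map<String, String>> entity : mappingsByEntity.entrySet()) {
            for (Map.Entry<String, String> field : entity.getValue().entrySet()) {
                String name = field.getKey();
                String type = field.getValue();
                String previous = firstType.putIfAbsent(name, type);
                if (previous == null) {
                    // First time this field name appears in the shared index.
                    firstOwner.put(name, entity.getKey());
                } else if (!previous.equals(type)) {
                    conflicts.add(name + ": " + firstOwner.get(name) + " maps it as " + previous
                            + ", " + entity.getKey() + " maps it as " + type);
                }
            }
        }
        return conflicts;
    }
}
```

With a Book entity mapping "year" as an integer and a Film entity mapping
"year" as text, the check would flag "year" while accepting a "title"
field that both map the same way.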

I didn't mean to kill the proposal :) just hoping it helps figure out
why someone might need it. Would be nice to think of alternatives out
of the box to avoid the filtering.

Sanne



On 12 August 2015 at 15:30, Gunnar Morling <gunnar at hibernate.org> wrote:
> Hibernate Search aficionados,
>
> I am wondering what the rationale is for offering the feature of
> index sharing [1].
>
> The ref guide says "there is really not much benefit in sharing
> indexes". It complicates queries, as an additional filter on the type
> field must be applied in case of targeting only one entity using a
> shared index.
>
> Should we consider dropping this feature in HS 6?
>
> Thanks,
>
> --Gunnar
>
> [1] https://docs.jboss.org/hibernate/search/5.4/reference/en-US/html_single/#section-sharing-indexes

