[hibernate-dev] [HSEARCH] Usefulness of index sharing

Gunnar Morling gunnar at hibernate.org
Thu Aug 13 03:33:38 EDT 2015


Hi,

2015-08-12 17:46 GMT+02:00 Sanne Grinovero <sanne at hibernate.org>:
> That's an interesting proposal, as index sharing inherently implies
> that fields on different types shall not have conflicting mapping
> (i.e. don't reuse the same field name for a different type).
>
> By default we don't share indexes across unrelated types, but also *by
> default* subtypes are indexed in the same index as their parent - if
> the parent is indexed as well.

Yes, I think that's the case where it makes sense. It'd make sense to
re-phrase the docs in that regard.

>
> The reason is to efficiently map a polymorphic domain: when people
> search for type X, they implicitly also search for its subtypes as
> these are valid candidates for the query.
> Having them all in the same index makes for better result quality and
> better search performance - as joining multiple IndexReaders to
> perform a cross-index Query is generally a bad idea, as it's then
> hard to accurately normalize statistics across different vector
> spaces, and that's what defines the quality of the search result.
> At least I believe that *generally* that would give you better
> results, but that's why we give options, and also why sometimes people
> might want multiple Domain objects to be stored in the same index:
> they might be "subtypes" from a domain perspective even if they don't
> technically use inheritance at the Java level: they might be different
> types and yet be mapped to some common fields with (hopefully)
> compatible indexing options.

Have you ever seen this as an actual requirement by someone?

>
> If we were to drop index sharing, then I think it should be fair to
> also not support multiple types as target for a query anymore; as I'm
> assuming in this case you'd only share for subtypes of some common
> parent, and you'd target that common parent exclusively to perform a
> polymorphic query.

Assuming we'd drop index sharing for unrelated types but would
continue to support it for the types of one inheritance hierarchy, one
still might want results only from a sub-set of the hierarchy's types.
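
To make that concrete, here is a toy sketch in plain Java. It is not the
Hibernate Search API: the class, the field names and the substring match
are all made up for illustration. It only shows that a query targeting a
sub-set of the types in a shared index has to filter on the type
discriminator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of one index shared by the types of an inheritance hierarchy.
// Not the Hibernate Search API; all names here are hypothetical.
class SharedIndexSketch {

    // One indexed document: a type discriminator plus a single text field.
    static final class Doc {
        final String type;   // e.g. the entity class name
        final String title;

        Doc(String type, String title) {
            this.type = type;
            this.title = title;
        }
    }

    private final List<Doc> docs = new ArrayList<>();

    void add(String type, String title) {
        docs.add(new Doc(type, title));
    }

    // Matches on the title, then keeps only documents whose discriminator
    // is one of the targeted types. This is the extra filter a shared
    // index needs whenever a query targets less than the whole index.
    List<String> search(String term, String... targetTypes) {
        Set<String> wanted = new HashSet<>(Arrays.asList(targetTypes));
        List<String> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (d.title.contains(term) && wanted.contains(d.type)) {
                hits.add(d.type + ":" + d.title);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        SharedIndexSketch index = new SharedIndexSketch();
        index.add("Book", "Lucene in Action");
        index.add("Comic", "Lucene the Cat");
        index.add("Magazine", "Lucene Monthly");

        // Target only two of the three subtypes of a common parent.
        System.out.println(index.search("Lucene", "Book", "Comic"));
    }
}
```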

>
> So those are the reasons for which it exists; there are some good
> reasons to not allow it too: the filtering, as you mention, but also
> the very fact that the type information has to be stored in the form
> of a classname (typename, in free-form).

Interestingly, that's not so much an issue with ES. There you always
add a "type" discriminator.

> I think the strongest reason to not allow it is to avoid the
> inconsistent field mappings, but we could compensate for that with
> better schema validation - something which seems to be getting more
> necessary anyway.

Yes, that'd help. All in all, index sharing for inheritance hierarchies
makes sense to me, but I am doubtful about sharing between unrelated
types.
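
As a rough idea of what such a validation could check, here is a minimal
sketch in plain Java. Again, this is not an existing Hibernate Search
facility; the class name, the method name and the representation of a
mapping as "field name -> field type" strings are assumptions made for
the example. It reports field names that types sharing one index map in
different ways.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical validation for types sharing one index: a field name must
// not be mapped with different field types by different entity types.
class FieldMappingCheck {

    // Input: entity type name -> (field name -> field type).
    // Output: sorted names of the fields mapped inconsistently.
    static Set<String> conflictingFields(Map<String, Map<String, String>> mappingsByEntity) {
        Map<String, String> firstSeen = new LinkedHashMap<>();
        Set<String> conflicts = new TreeSet<>();
        for (Map<String, String> fields : mappingsByEntity.values()) {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                // Remember the first mapping seen for each field name and
                // flag every later mapping that differs from it.
                String previous = firstSeen.putIfAbsent(field.getKey(), field.getValue());
                if (previous != null && !previous.equals(field.getValue())) {
                    conflicts.add(field.getKey());
                }
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        Map<String, String> book = new LinkedHashMap<>();
        book.put("title", "text");
        book.put("price", "numeric");

        Map<String, String> invoice = new LinkedHashMap<>();
        invoice.put("title", "text");
        invoice.put("price", "text"); // clashes with the mapping in "Book"

        Map<String, Map<String, String>> shared = new LinkedHashMap<>();
        shared.put("Book", book);
        shared.put("Invoice", invoice);

        System.out.println(conflictingFields(shared));
    }
}
```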

>
> I didn't mean to kill the proposal :) just hoping it helps figure out
> why someone might need it. Would be nice to think of alternatives out
> of the box to avoid the filtering.
>
> Sanne

--Gunnar

>
>
>
> On 12 August 2015 at 15:30, Gunnar Morling <gunnar at hibernate.org> wrote:
>> Hibernate Search aficionados,
>>
>> I am wondering what the rationale is for offering the feature of
>> index sharing [1].
>>
>> The ref guide says "there is really not much benefit in sharing
>> indexes". It complicates queries, as an additional filter on the type
>> field must be applied in case of targeting only one entity using a
>> shared index.
>>
>> Should we consider dropping this feature in HS 6?
>>
>> Thanks,
>>
>> --Gunnar
>>
>> [1] https://docs.jboss.org/hibernate/search/5.4/reference/en-US/html_single/#section-sharing-indexes
>> _______________________________________________
>> hibernate-dev mailing list
>> hibernate-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
