[hibernate-dev] HSEARCH-471 Ability to selectively index an entity based on its state

Sanne Grinovero sanne at hibernate.org
Tue Feb 1 12:16:54 EST 2011

Reviving again;
I think we should split the solution between two main use cases:

1) entities which should stay out of the index for security/filtering reasons
In this case we need to accommodate on-commit deletion of entities which were
indexed and should no longer be. We need to invoke the "needs indexing" check
on the entity as it was at load time, and again at commit, to see whether
changes are needed.

2) entities for which people would like to skip index operations because
they don't care about the consistency of this specific index, but want to
improve indexing performance.
In this case "@SkipIndexing" is a more suitable name, and if the check
returns "we have to delete this", we might actually skip the delete operation.

I don't like option 2): it's not clear what to do, for example, when the
entity is updated and the @SkipIndexing-annotated method returns true while
it returned false before the update -> a mess.
Also, even if we do perform the additional deletes, it seems likely that we
still get a performance improvement.

So focusing on 1) only ATM, are we able at flush/commit time to invoke
the annotated method twice, using once the original entity, in the
state it was at transaction begin?
Or shall we mandate an additional persistent attribute, so we can
basically compare it as we do with dirty checking?
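
To make the question concrete, here is a minimal sketch of the decision I have
in mind for case 1), assuming we can somehow obtain the result of the "needs
indexing" check both for the state at load time and for the state at commit.
The names (ConditionalIndexing, IndexOp, decide) are made up for illustration
and are not existing Hibernate Search API.

```java
public class ConditionalIndexing {

    /** Work to send to the index backend for one entity at commit time. */
    enum IndexOp { NONE, ADD, UPDATE, DELETE }

    /**
     * Hypothetical helper: picks the index operation from the two results of
     * the "needs indexing" check plus the usual dirty flag.
     *
     * @param indexedAtLoad   result of the check on the entity as it was loaded
     * @param indexedAtCommit result of the check on the entity at commit
     * @param dirty           whether any indexed property changed
     */
    static IndexOp decide(boolean indexedAtLoad, boolean indexedAtCommit, boolean dirty) {
        if (indexedAtLoad && !indexedAtCommit) {
            return IndexOp.DELETE;   // was in the index, must leave it
        }
        if (!indexedAtLoad && indexedAtCommit) {
            return IndexOp.ADD;      // was filtered out, must now be indexed
        }
        if (indexedAtCommit && dirty) {
            return IndexOp.UPDATE;   // stays in the index, content changed
        }
        return IndexOp.NONE;         // stays out, or stays in unchanged
    }

    public static void main(String[] args) {
        System.out.println(decide(true, false, true));   // DELETE
        System.out.println(decide(false, true, false));  // ADD
        System.out.println(decide(true, true, true));    // UPDATE
        System.out.println(decide(false, false, true));  // NONE
    }
}
```

Whether indexedAtLoad comes from re-running the annotated method on the
original state or from a persistent attribute compared as in dirty checking
is exactly the open question above; the decision table is the same either way.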


2011/1/14 Sanne Grinovero <sanne.grinovero at gmail.com>:
> 2011/1/14 Hardy Ferentschik <hibernate at ferentschik.de>:
>> On Fri, 14 Jan 2011 14:24:07 +0100, Emmanuel Bernard
>> <emmanuel at hibernate.org> wrote:
>>>>> * SearchFactory.enableIndexing(boolean) // enables/disables globally on
>>>>> the fly
>>>>> could operate on existing boolean
>>>>> org.hibernate.search.event.FullTextIndexEventListener.used
>>>> +1 This is something I was wishing for some time back. It should be easy
>>>> to implement as well. This option should then also be exposed via JMX.
>>> What's the use case behind it?
>> I think it would allow for application which are completely manageable via
>> some user interface. At the moment you set the auto indexing option or not.
>> If I could programmatically change it I could do things like stop
>> auto indexing, do db modifications, purge the index, re-index, enable
>> event indexing.
>>>>> * entity controlled, via an annotated boolean property
>>>>> I'd mandate a boolean property, or something which can be easily
>>>>> evaluated via a SQL fragment / Criteria / filterable, so that the same
>>>>> information could be reused
>>>>> by the MassIndexer when picking all values to be indexed.
>>>> Not sure I understand exactly what you mean.
>>> @Entity @Indexed
>>> public class MyEntity {
>>>   ...
>>>   @SkipIndexing
>>>   boolean isIndexed() {
>>>      return status == TEMP;
>>>   }
>>> }
>> Wouldn't the other solution be more generic?
> yes, but the more flexibility we give, the less clever things we can
> do automatically.
> Consider the dirty checks we implemented for HSEARCH-361, I had to
> disable every optimization
> both in case there's a ClassBridge or a BoostStrategy as in those
> cases we can't predict
> if the index would change.
> Similarly, in this case if we allow for a flexible custom made
> implementation we should
> disable dirty checks on such entities and would have no clue to
> provide to the MassIndexer,
> at least not without more help from the user;
> we'd need to consider at least those cases, something along:
> void addRestrictions(Criteria loadingCriteria);
> Set<String> getConsideredPropertyNames();
> Who knows what else we'll need later.
> I'd say @SkipIndexing is more straight-forward, but we can also
> support the more flexible approach
> as an "advanced" alternative which requires some more coding and care.
> Sanne
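
As a rough illustration of the "advanced" alternative quoted above, the sketch
below shows how a getConsideredPropertyNames() contract could let the engine
keep dirty checking: the custom condition only has to be re-evaluated when one
of the declared properties is dirty. The IndexingCondition interface and the
needsReevaluation helper are hypothetical, not part of Hibernate Search, and
the Criteria-based addRestrictions method is left out to keep the example
self-contained.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

public class ConsideredProperties {

    /** Hypothetical user-implemented contract for the flexible approach. */
    interface IndexingCondition {
        /** Names of the entity properties the condition depends on. */
        Set<String> getConsideredPropertyNames();
    }

    /** Convenience factory for a condition that depends on the given properties. */
    static IndexingCondition onProperties(String... names) {
        final Set<String> considered = new HashSet<String>(Arrays.asList(names));
        return new IndexingCondition() {
            public Set<String> getConsideredPropertyNames() {
                return considered;
            }
        };
    }

    /**
     * Returns true when at least one dirty property is among those the
     * condition declared, i.e. when its outcome may have changed.
     */
    static boolean needsReevaluation(IndexingCondition condition, Collection<String> dirtyProperties) {
        for (String name : condition.getConsideredPropertyNames()) {
            if (dirtyProperties.contains(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        IndexingCondition condition = onProperties("status");
        System.out.println(needsReevaluation(condition, Arrays.asList("status", "title"))); // true
        System.out.println(needsReevaluation(condition, Arrays.asList("title")));           // false
    }
}
```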
