[hibernate-issues] [Hibernate-JIRA] Commented: (HSEARCH-471) Ability to selectively index an entity based on its state

Dobes Vandermeer (JIRA) noreply at atlassian.com
Fri Apr 9 12:12:58 EDT 2010


    [ http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=36343#action_36343 ] 

Dobes Vandermeer commented on HSEARCH-471:
------------------------------------------


That is certainly a good question!

It looks like about 30% of the 700,000 records in our largest index with this feature are marked as "removed".  Assuming that Lucene's search time is not strictly linear in the number of records searched, I suppose the performance gain for a typical search would be considerably less than 30%.

I'm not sure how much additional search terms affect Lucene performance, however.  Given that a typical query has only one term, adding the extra removed:false clause may or may not have a noticeable performance impact.

This also affects memory usage, since the norms array holds one byte for every record in the index.  With the removed records left out, the index and the norms array would both be smaller, so the OS file cache and Lucene's norms cache would use less memory.
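
To put rough numbers on the norms point, here is a back-of-the-envelope sketch using the figures from this comment (700,000 records, about 30% removed).  The class and method names are hypothetical, and the single field with norms is an assumption; Lucene keeps one norm byte per document for each field that stores norms, so the total scales with the number of such fields.

```java
// Rough estimate only. The record count and percentage come from the
// comment above; the names and the field count are assumptions.
final class NormsEstimate {

    /** One norm byte per document for each field that stores norms. */
    static long normsBytes(long documents, int fieldsWithNorms) {
        return documents * fieldsWithNorms;
    }

    /** Documents left in the index if the given percentage is never indexed. */
    static long liveDocuments(long totalDocuments, int removedPercent) {
        return totalDocuments - totalDocuments * removedPercent / 100;
    }

    public static void main(String[] args) {
        long total = 700_000L;
        int fields = 1; // assumption: a single field with norms
        long live = liveDocuments(total, 30);

        System.out.println("documents if removed ones are skipped: " + live);
        System.out.println("norm bytes today: " + normsBytes(total, fields));
        System.out.println("norm bytes if removed ones are skipped: "
                + normsBytes(live, fields));
    }
}
```

Under these assumptions the saving is 210,000 norm bytes per field with norms, on top of whatever the smaller index saves in the OS file cache.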

However, I wouldn't implement this suggestion just for me; maybe wait and see whether it gets some other votes.

Lacking this feature and the ability to shard indexes based on a foreign key (another feature request), I've dropped Hibernate Search in favor of manual Lucene calls, so that I can do both of these optimizations myself and get the index sizes way down.  The foreign-key sharding was really the key part of the decision, as search was using too much memory.
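
As an illustration of the two optimizations, the sketch below skips removed entities at indexing time and routes everything else to a per-foreign-key shard.  It is not the Hibernate Search or Lucene API: plain collections stand in for the real indexes, and every name in it (SelectiveIndexer, Item, ownerId) is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: in-memory lists stand in for real Lucene indexes.
final class SelectiveIndexer {

    /** Minimal stand-in for a searchable entity. */
    static final class Item {
        final long id;
        final long ownerId;     // foreign key used to choose the shard
        final boolean removed;  // state flag that excludes the item
        final String text;

        Item(long id, long ownerId, boolean removed, String text) {
            this.id = id;
            this.ownerId = ownerId;
            this.removed = removed;
            this.text = text;
        }
    }

    // One "index" per foreign-key value.
    private final Map<Long, List<Item>> shards = new HashMap<>();

    /** Adds the item to its owner's shard unless it is flagged as removed. */
    boolean index(Item item) {
        if (item.removed) {
            return false; // never indexed, so queries need no removed:false clause
        }
        shards.computeIfAbsent(item.ownerId, k -> new ArrayList<>()).add(item);
        return true;
    }

    /** Searches a single owner's shard with a naive substring match. */
    List<Item> search(long ownerId, String term) {
        List<Item> hits = new ArrayList<>();
        List<Item> shard = shards.get(ownerId);
        if (shard == null) {
            return hits;
        }
        for (Item item : shard) {
            if (item.text.contains(term)) {
                hits.add(item);
            }
        }
        return hits;
    }

    /** Total number of items held across all shards. */
    int indexedCount() {
        int count = 0;
        for (List<Item> shard : shards.values()) {
            count += shard.size();
        }
        return count;
    }
}
```

A real implementation would also have to delete the document when an entity that is already indexed is later flagged as removed.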



> Ability to selectively index an entity based on its state
> ---------------------------------------------------------
>
>                 Key: HSEARCH-471
>                 URL: http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-471
>             Project: Hibernate Search
>          Issue Type: New Feature
>          Components: mapping
>    Affects Versions: 3.1.1.GA
>            Reporter: Dobes Vandermeer
>            Priority: Minor
>
> In our system we have entities that are searched but not all of them are available for search - some of them are flagged as "removed".  It would improve the efficiency of our search subsystem if we could implement a kind of "filter" that blocked these entities from being added to the search index, since we wouldn't have to make that a search term and our indexes would be somewhat smaller.
