If you use Hibernate Search without knowing how Lucene works, you may not realize that every update to an indexed entity requires a full delete and re-create of the corresponding Lucene document. For complex entities with many @IndexedEmbedded or @ContainedIn relations, this can lead to a flurry of database reads as the indexed object graph is reinitialized.
To prevent this, developers have to avoid certain "traps" when designing their index structure. We should describe these traps and suggest ways around them.
Some of the traps I experienced:
- Embedding a changeable attribute that is used for selection across a @OneToMany relation with a lot of potential children.
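e.g.
@Entity
public class DiscussionGroup {
    // id and other fields omitted

    @Field(index = Index.YES)
    boolean closed;   // changing this flag dirties every Post that embeds it

    @OneToMany
    @ContainedIn
    List<Post> posts;
}

@Entity
public class Post {
    @ManyToOne
    @IndexedEmbedded(includePaths = {"id", "closed"})
    DiscussionGroup group;
}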
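Every time closed changes on a group, @ContainedIn causes all of that group's posts to be reloaded and reindexed. The solution is to apply the closed attribute in a filter rather than storing it on the Post.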
- Using filters against anything other than the current index is not described in the documentation. In general, the filter documentation for both Lucene and Hibernate Search assumes that the filter runs against the same index that the query is based on.
In the example above, the documentation could show how to use a filter to exclude posts from closed groups in a performance-efficient way. I know I'm looking for this!
- Another useful technique, already documented but not in this context, is to use includePaths so that attributes you don't need are never "touched", which avoids unwanted initialization.
- @ClassBridge always leads to initialization; smart use of field bridges and/or includePaths can often accomplish the same thing.
- @Field properties that are also marked @Transient are considered dirty by default and will always trigger reinitialization.
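
As a starting point for the filter example requested above, here is a minimal plain-Java model of the two-step idea: first collect the ids of closed groups from the group index, then exclude posts whose embedded group id is in that set. It deliberately uses no Hibernate Search or Lucene API. The class name, the method names and the map-based "indexes" are hypothetical stand-ins, and it assumes Post embeds only the group id (includePaths={"id"}), so toggling closed never touches the Post documents.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model only: maps stand in for the two Lucene indexes.
class ClosedGroupFilterSketch {

    // Step 1: scan the (small) group index once and collect ids of closed groups.
    // groupClosedById models the DiscussionGroup index: group id -> closed flag.
    static Set<Long> closedGroupIds(Map<Long, Boolean> groupClosedById) {
        Set<Long> closed = new HashSet<>();
        for (Map.Entry<Long, Boolean> e : groupClosedById.entrySet()) {
            if (e.getValue()) {
                closed.add(e.getKey());
            }
        }
        return closed;
    }

    // Step 2: keep only post hits whose embedded group id is not in the closed set.
    // groupIdByPostId models the Post index: post id -> embedded group.id.
    static List<Long> visiblePostIds(Map<Long, Long> groupIdByPostId, Set<Long> closed) {
        List<Long> visible = new ArrayList<>();
        for (Map.Entry<Long, Long> e : groupIdByPostId.entrySet()) {
            if (!closed.contains(e.getValue())) {
                visible.add(e.getKey());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        Map<Long, Boolean> groups = new TreeMap<>();
        groups.put(1L, false);
        groups.put(2L, true);   // group 2 is closed

        Map<Long, Long> posts = new TreeMap<>();
        posts.put(10L, 1L);
        posts.put(11L, 2L);     // belongs to the closed group
        posts.put(12L, 1L);

        System.out.println(visiblePostIds(posts, closedGroupIds(groups))); // prints [10, 12]
    }
}
```

Reopening group 2 only changes one entry in the group map; the post map is untouched, which is the property we want from the real indexes. In a real mapping the set from step one would presumably become a cached filter on the group.id field, and how best to do that is exactly what the documentation should spell out.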