[hibernate-issues] [Hibernate-JIRA] Commented: (HSEARCH-886) Provide the ability to configure specific paths to index within @IndexEmbedded as an alternative to depth

Zach Kurey (JIRA) noreply at atlassian.com
Wed Feb 8 18:52:10 EST 2012


    [ https://hibernate.onjira.com/browse/HSEARCH-886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=45424#comment-45424 ] 

Zach Kurey commented on HSEARCH-886:
------------------------------------

I don't think it impacts the design that much if we were to go with something like my latest suggestion: deprecate @IndexEmbedded, and add new annotations @IndexEmbeddedByDepth and @IndexEmbeddedByPaths.  @IndexEmbeddedByPaths does seem a bit awkward at a glance this way (excludes is an attribute of the depth approach, while includes has its own annotation), but really you are talking about two different approaches.  The first auto-indexes using depth and then weeds out problem paths.  The other explicitly indexes only certain paths.

Just to summarize my latest suggestion on the GitHub thread (which may be a wild idea, but I'm just trying to break the logjam), I'm proposing:
-  Deprecate @IndexEmbedded.  This is for two reasons.  
	1.  It's debatable (too subject to opinion) whether depth should be implicitly set to 0 under the covers when depth is NOT specified but 'includePaths' IS specified, or whether the annotation's default value should be law and be what is used at runtime, a choice that is intuitive to some and unintuitive to others.  
	2.  Deprecation advertises to users of newer versions of Hibernate Search that there are alternatives to consider when indexing.
-  Add @IndexEmbeddedByDepth(depth=N).  Personally I think depth should be required to be specified.  Users can still achieve infinite depth by setting the value to Integer.MAX_VALUE.  But really, at least in my use cases, I've always wanted to carefully consider this value, and making it required shouldn't add that much time to prototyping.
-  Add @IndexEmbeddedByPaths(paths={}), which if used by itself provides a simple way to index some explicit set of paths.  If used in conjunction with @IndexEmbeddedByDepth, it is useful when you want a few paths that exceed the depth you specified (but you don't want all paths to go past that depth).  
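
To make the proposal above concrete, here is a rough Java sketch of how the two annotations might be declared and used. Everything in it is an assumption for illustration only: @IndexEmbeddedByDepth and @IndexEmbeddedByPaths do not exist in Hibernate Search, and OwnerA, OwnerB, EmbeddedC and ProposalSketch are made-up names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Arrays;

// Hypothetical annotation: depth has no default, so it must always be specified.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
@interface IndexEmbeddedByDepth {
    int depth();
}

// Hypothetical annotation: lists the explicit paths to index.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
@interface IndexEmbeddedByPaths {
    String[] paths();
}

// Placeholder for the embedded type; its contents do not matter here.
class EmbeddedC {
    int foo;
}

// Paths only: index exactly the listed paths.
class OwnerA {
    @IndexEmbeddedByPaths(paths = {"d.one", "d.two"})
    EmbeddedC see;
}

// Depth plus paths: index to depth 1, plus one deeper path.
class OwnerB {
    @IndexEmbeddedByDepth(depth = 1)
    @IndexEmbeddedByPaths(paths = {"d.one"})
    EmbeddedC see;
}

class ProposalSketch {
    // Reads both annotations from a field and summarizes them as text.
    static String describe(Class<?> owner, String fieldName) {
        Field field;
        try {
            field = owner.getDeclaredField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + fieldName, e);
        }
        IndexEmbeddedByDepth byDepth = field.getAnnotation(IndexEmbeddedByDepth.class);
        IndexEmbeddedByPaths byPaths = field.getAnnotation(IndexEmbeddedByPaths.class);
        String depth = byDepth == null ? "none" : String.valueOf(byDepth.depth());
        String paths = byPaths == null ? "[]" : Arrays.toString(byPaths.paths());
        return "depth=" + depth + " paths=" + paths;
    }

    public static void main(String[] args) {
        System.out.println(describe(OwnerA.class, "see"));
        System.out.println(describe(OwnerB.class, "see"));
    }
}
```

Leaving depth() without a default is what would force users to pick a value deliberately; Integer.MAX_VALUE would remain available for the unlimited case.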

I know the community needs a way to explicitly include paths as an option for getting rid of unnecessary paths.  I'd really like to get that finalized soon, since this is really dragging on.  Then maybe we can wait and see whether there is strong demand for excludePaths, and add it once that becomes clearer?  The suggestion I'm throwing out there would leave the door open to that (adding it as an attribute of @IndexEmbeddedByDepth in the future).

> Provide the ability to configure specific paths to index within @IndexEmbedded as an alternative to depth
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HSEARCH-886
>                 URL: https://hibernate.onjira.com/browse/HSEARCH-886
>             Project: Hibernate Search
>          Issue Type: New Feature
>          Components: engine, mapping
>            Reporter: Zach Kurey
>            Assignee: Davide D'Alto
>
> Frequently it's desirable to index a particular embedded type differently depending on the use case of the referencing type that is the primary subject being indexed.  Additionally, depth in general causes many more paths to be included in a document than necessary for a particular index.  This makes tuning of indexing to eliminate problem paths difficult, and sometimes impossible if a particular object model re-uses a lot of types.  
> The proposal/improvement has already been discussed more in depth here:  http://www.mail-archive.com/hibernate-dev@lists.jboss.org/msg06548.html, and what follows reflects some of that discussion.  
> As an example of how specific paths could be configured for indexing:
> @Indexed
> class A{
>    @IndexEmbedded(
>        depth=0,
>        @IndexPaths(paths={"d.one", "d.two"})
>     )
>    private C see;
> }
> @Indexed
> class B{
>    @IndexEmbedded(
>        depth=0,
>        @IndexPaths(paths={"foo"})
>    )
>    private C see;
> }
> class C{
>    @IndexEmbedded
>    private Collection<D> d;
>    @Field
>    private int foo;
> }
> class D{
>    @Field
>    int one;
>    @Field
>    int two;
> }
> Index A would contain:  d.one, and d.two
> Index B would contain:  foo, but would NOT contain anything from path 'd'.
> Perhaps indexing path 'd' has a performance impact that is desirable to eliminate for B, but acceptable or necessary for A.  This ability would also help to eliminate the bloat of unnecessary fields in Lucene documents, which may not itself be a performance problem, but leaves a lot of things to rule out when tracking down indexing issues (either performance or content).
> Lastly, to be clear, the above proposal (which really Sanne came up with in the email thread) does not conflict with depth.  Here are some further examples of how depth may interact with explicit paths:
> @IndexEmbedded(depth=3, paths={"a.b.c.d.e"})
> Says to index all paths up to depth 3, but additionally index path 'a.b.c.d.e'.
> @IndexEmbedded(depth=0, paths={"a.b.c.d.e"})
> Says to only index path 'a.b.c.d.e'
> @IndexEmbedded( paths={"a.b.c.d.e"})
> Default behavior: depth is unlimited, so specifying a.b.c.d.e is redundant in this case.  
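
The three cases above can be read as a single rule: a path is indexed if it falls within the depth limit or if it is listed explicitly. Here is a minimal Java sketch of that rule. It assumes, purely for illustration, that the depth of a path is its number of dot-separated segments; PathRuleSketch, depthOf and isIndexed are hypothetical names and are not part of Hibernate Search.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class PathRuleSketch {
    // Assumption: the depth of a path is its number of dot-separated segments.
    static int depthOf(String path) {
        return path.split("\\.").length;
    }

    // A path is indexed if it is within the depth limit or explicitly listed.
    static boolean isIndexed(String path, int depth, Set<String> explicitPaths) {
        return depthOf(path) <= depth || explicitPaths.contains(path);
    }

    public static void main(String[] args) {
        Set<String> explicit = new HashSet<>(Arrays.asList("a.b.c.d.e"));

        // depth=3: everything up to three segments, plus the explicit path.
        System.out.println(isIndexed("a.b.c", 3, explicit));      // true
        System.out.println(isIndexed("a.b.c.d", 3, explicit));    // false
        System.out.println(isIndexed("a.b.c.d.e", 3, explicit));  // true

        // depth=0: only the explicit path.
        System.out.println(isIndexed("a.b.c", 0, explicit));      // false
        System.out.println(isIndexed("a.b.c.d.e", 0, explicit));  // true

        // Unlimited depth: the explicit path is redundant.
        System.out.println(isIndexed("a.b.c.d.e", Integer.MAX_VALUE, new HashSet<>())); // true
    }
}
```

A real implementation would also have to traverse the intermediate associations (a, a.b, and so on) to reach an explicit path without indexing their other fields; this sketch only answers whether a given leaf path ends up in the index.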

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

