On Thu, 25 Aug 2011 17:47:30 +0200, Zach Kurey <pushedbytime(a)gmail.com>
wrote:
On Aug 25, 2011, at 2:28 AM, Hardy Ferentschik wrote:
> Or just 'include' and 'exclude'.
> I feel we are becoming overly verbose in the API design (which is
> besides this particular issue)
That doesn't seem clear to me. If it's just include/exclude, what
exactly is being included/excluded?
I think there was never a question of what exactly is included or excluded.
The question is whether the name of the property the annotation is defined
on should be part of the path. The advantage of adding it is that it
reflects the actual field name being indexed, but I can also see the
motivation not to add it.
> My experience is that it is not a good idea to implicitly
> ignore default values. In this case I would rather see an additional
> enum parameter which allows the user to explicitly select the embedding
> mode (BY_DEPTH, BY_PATH) or maybe introduce a new annotation
> @IndexSubPath
The mode option is confusing if the exclude option is also in the mix,
because 'exclude' applies when the default depth approach is used.
With separate annotations how do you envision supporting the exclude
option? Would it be like this set of options:
Yes, one of these alternatives. Maybe the "one new 'exclude' attribute in
IndexEmbedded and one new annotation for IndexSubPaths" option.
If 'includeSubPaths' is specified as an attribute of IndexEmbedded, it
seems really clear to me that depth no longer applies since you are
specifying specific paths that clearly have a depth.
It still conflicts with the actual default value of depth, and there is
no explicit way to say that it is ignored. I don't think "it seems clear"
is a good enough reason.
But regardless, any of the approaches above would achieve the same end.
Exactly.
It's not the size of the resulting index that has been problematic.
It's the expense of rewriting a large document by having to recursively
traverse down 'depth' at every IndexEmbedded property. It adds to DB
load. It adds CPU overhead on whatever system is doing the indexing
(which can be very high when there are a lot of unneeded fields
getting put into the documents), and it can significantly slow down
large transactions (even in async mode, since that consumes CPU and other
resources). If the CPU load gets significant enough, then the user apps
have to offload the indexing onto other servers and copy slave
copies of the indexes around a cluster, which adds to operational
complexity. I could go on. = )
I'd just prefer that Hibernate Search be as lightweight as possible,
and that's hard to do with depth + a complex object model.
I get that, and obviously you want to fine-tune this as much as possible.
Just wanted to make sure you do it for all the right reasons :-)
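
For reference, here is a rough plain-Java sketch of the difference between
the two embedding modes discussed above (BY_DEPTH vs. BY_PATH). It is only an
illustration and not Hibernate Search code: the class name, the toy object
model and the byDepth/byPath helpers are all made up for this mail, and the
paths are written relative to the root entity, which is just one of the
conventions under discussion.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Hibernate Search.
class EmbeddingSketch {
    // Toy model: type name -> (property name -> embedded type, or null for a plain field).
    static final Map<String, Map<String, String>> MODEL = new LinkedHashMap<>();

    static void type(String name, String... props) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i < props.length; i += 2) {
            m.put(props[i], props[i + 1]);
        }
        MODEL.put(name, m);
    }

    static {
        type("Person", "name", null, "address", "Address", "employer", "Company");
        type("Address", "city", null, "street", null, "country", "Country");
        type("Country", "code", null, "name", null);
        type("Company", "name", null, "address", "Address");
    }

    // BY_DEPTH: follow every embedded property until the depth budget runs out.
    static List<String> byDepth(String type, String prefix, int depth) {
        List<String> fields = new ArrayList<>();
        for (Map.Entry<String, String> e : MODEL.get(type).entrySet()) {
            String path = prefix + e.getKey();
            if (e.getValue() == null) {
                fields.add(path);
            } else if (depth > 0) {
                fields.addAll(byDepth(e.getValue(), path + ".", depth - 1));
            }
        }
        return fields;
    }

    // BY_PATH: only descend into branches that lead to an explicitly included path.
    static List<String> byPath(String type, String prefix, Set<String> includes) {
        List<String> fields = new ArrayList<>();
        for (Map.Entry<String, String> e : MODEL.get(type).entrySet()) {
            String path = prefix + e.getKey();
            if (e.getValue() == null) {
                if (includes.contains(path)) {
                    fields.add(path);
                }
            } else if (leadsToIncluded(path, includes)) {
                fields.addAll(byPath(e.getValue(), path + ".", includes));
            }
        }
        return fields;
    }

    static boolean leadsToIncluded(String path, Set<String> includes) {
        for (String p : includes) {
            if (p.startsWith(path + ".")) {
                return true;
            }
        }
        return false;
    }
}
```

In this toy model, byDepth("Person", "", 2) has to walk both Address
branches and collects eight fields, while byPath with just
"address.country.code" and "employer.name" visits only the branches that lead
to those two fields. That is the saving Zach is after, and it also shows why
a depth value has no obvious meaning once explicit paths are given.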
--Hardy