On Aug 25, 2011, at 2:28 AM, Hardy Ferentschik wrote:
> Or just 'include' and 'exclude'.
> I feel we are becoming overly verbose in the API design (which is beside
> this particular issue)
That doesn't seem clear to me. If it's just include/exclude, what exactly is being
included/excluded? I think 'paths' has to be part of it, which is basically where
we started. But there is a desire to have exclude vs. include, so that became
includePaths, excludePaths. But then it seemed ambiguous whether the user needed to specify
the name of the attribute that the annotation is on (in the examples, 'see'), so
'Sub' was added (includeSubPaths) to make it clearer. But includePaths/excludePaths seems
fine as well, since again I think it's clear that the property the annotation is declared on
doesn't need to be repeated in the path.
@IndexEmbedded(includePaths={"see.a.b.c"}) // redundant; prefer includePaths={"a.b.c"}
private SomeType see;
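
To make the relative-path reading concrete, here is a minimal sketch of how paths declared on the annotation could be resolved against the name of the annotated property. RelativePaths and resolve are hypothetical names used only for illustration; this is not Hibernate Search code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hibernate Search: shows how paths given
// relative to the annotated property map to full field names.
class RelativePaths {

    // propertyName is the field carrying the annotation, e.g. "see".
    static List<String> resolve(String propertyName, String... includePaths) {
        List<String> full = new ArrayList<>();
        for (String path : includePaths) {
            // The property name is prepended once, so the user never repeats it.
            full.add(propertyName + "." + path);
        }
        return full;
    }

    public static void main(String[] args) {
        System.out.println(resolve("see", "a.b.c", "a.d")); // prints [see.a.b.c, see.a.d]
    }
}
```

Under this reading, includePaths={"a.b.c"} on the 'see' property already means "see.a.b.c", which is why repeating 'see' in the path would be redundant.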
> My experience is that it is not a good idea to implicitly change or
> ignore default values. In this case I would rather see an additional
> enum parameter which allows the user to explicitly select the embedding
> mode (BY_DEPTH, BY_PATH), or maybe introduce a new annotation @IndexSubPath
The mode option is confusing if the exclude option is also in the mix, because
'exclude' applies when the default depth approach is used. With separate
annotations, how do you envision supporting the exclude option? Would it be like this set
of options, with two new annotations:
@IndexEmbedded(depth = N)
@ExcludePaths(paths={...})
private SomeType someProperty;
or
@IndexEmbedded(depth = N)
private SomeType someProperty;
or
@IndexSubPath(paths={...})
private SomeType someProperty;
Or would it be one new 'excludePaths' attribute in IndexEmbedded and one new annotation,
IndexSubPath:
@IndexEmbedded(depth = N, excludePaths={...})
private SomeType someProperty;
or
@IndexEmbedded(depth = N)
private SomeType someProperty;
or
@IndexSubPath(paths={...})
private SomeType someProperty;
If 'includeSubPaths' is specified as an attribute of IndexEmbedded, it seems
really clear to me that depth no longer applies, since you are specifying specific paths
that each have their own depth. I also think it's cleaner than the combination of attributes +
annotations above. But regardless, any of the approaches above would achieve the same
end. Maybe splitting off into another annotation gives you more configuration flexibility
in the future for dealing with specific paths, in ways that I can't think of at the moment.
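
One way to read the rule described above (explicit include paths replace depth, while exclude paths only prune the depth-based traversal) is the sketch below. EmbeddedPathFilter and all of its members are hypothetical names for illustration, not Hibernate Search code, and it assumes that a path with N segments counts as depth N.

```java
import java.util.Set;

// Hypothetical sketch of the proposed semantics; not Hibernate Search code.
class EmbeddedPathFilter {
    private final int depth;                 // consulted only when includePaths is empty
    private final Set<String> includePaths;  // relative to the annotated property
    private final Set<String> excludePaths;  // prunes the depth-based traversal

    EmbeddedPathFilter(int depth, Set<String> includePaths, Set<String> excludePaths) {
        this.depth = depth;
        this.includePaths = includePaths;
        this.excludePaths = excludePaths;
    }

    boolean shouldIndex(String relativePath) {
        if (!includePaths.isEmpty()) {
            // Explicit paths carry their own depth, so 'depth' is ignored.
            return includePaths.contains(relativePath);
        }
        // Assumption: "a" is depth 1, "a.b" is depth 2, and so on.
        int pathDepth = relativePath.split("\\.").length;
        return pathDepth <= depth && !isExcluded(relativePath);
    }

    // A path is excluded if it equals an excluded path or lies beneath one.
    private boolean isExcluded(String relativePath) {
        for (String excluded : excludePaths) {
            if (relativePath.equals(excluded) || relativePath.startsWith(excluded + ".")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        EmbeddedPathFilter byPath = new EmbeddedPathFilter(1, Set.of("a.b.c"), Set.of());
        System.out.println(byPath.shouldIndex("a.b.c")); // true, although depth is 1
        EmbeddedPathFilter byDepth = new EmbeddedPathFilter(2, Set.of(), Set.of("a"));
        System.out.println(byDepth.shouldIndex("a.b"));  // false, pruned by the exclude
    }
}
```

Whether this logic lives in one annotation or is split across two does not change the decision itself, only where the user writes the configuration.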
> One thing worth mentioning is that generally the index size is not a big
> problem
It's not the size of the resulting index that has been problematic. It's the
expense of rewriting a large document by having to recursively traverse down
'depth' levels at every IndexEmbedded property. It adds to DB load. It adds CPU
overhead on whatever system is doing the indexing (which can be very high when there are a
lot of unneeded fields getting put into the documents), and it can significantly slow down
large transactions (even in async mode, since that still consumes CPU and other resources). If
the CPU load gets significant enough, then the user apps have to offload the indexing onto
other servers and copy slave copies of the indexes around a cluster, which adds to
operational complexity. I could go on. =)
I'd just prefer that Hibernate Search be as lightweight as possible, and that's hard
to do with depth + a complex object model.
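
As a rough illustration of why depth gets expensive on a complex model, the sketch below counts how many embedded nodes a traversal would touch. It is a crude, hypothetical cost model and not a measurement: it assumes every object has the same number of embedded associations ('branching'), and TraversalCost and its methods are made-up names.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical cost model for illustration only; not Hibernate Search code.
class TraversalCost {

    // Nodes touched when following every embedded association down to 'depth':
    // branching + branching^2 + ... + branching^depth.
    static long nodesVisitedByDepth(int branching, int depth) {
        long total = 0;
        long atLevel = 1;
        for (int level = 1; level <= depth; level++) {
            atLevel *= branching;
            total += atLevel;
        }
        return total;
    }

    // Nodes touched when following only the listed paths: one per distinct
    // path prefix, so shared prefixes such as "a" are counted once.
    static long nodesVisitedByPaths(String... paths) {
        Set<String> prefixes = new HashSet<>();
        for (String path : paths) {
            StringBuilder prefix = new StringBuilder();
            for (String part : path.split("\\.")) {
                if (prefix.length() > 0) {
                    prefix.append('.');
                }
                prefix.append(part);
                prefixes.add(prefix.toString());
            }
        }
        return prefixes.size();
    }

    public static void main(String[] args) {
        System.out.println(nodesVisitedByDepth(5, 3));          // prints 155
        System.out.println(nodesVisitedByPaths("a.b.c", "a.d")); // prints 4
    }
}
```

The exact numbers depend entirely on the object model, but the shape is the point: depth-based traversal grows with the whole graph, while explicit paths grow only with what the searches actually need.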
> and we used to say that from a query point of view you have more flexibility
> if all properties are indexed (compared to limiting yourself already at index
> time to what you can search).
For more dynamic search apps, I get that perspective. But in our case we know at app dev
time what fields we need to be indexed to satisfy our searches. Requirements on what we
need to search on don't change after a particular release, and certainly not at
runtime.
Zach