On Aug 24, 2011, at 8:26 AM, Sanne Grinovero wrote:
This complicates things. First of all, it means that the "subPaths"
property should now be named "includeSubPaths" instead, as opposed to
"excludeSubPaths".
Yes, if 'excludeSubPaths' is provided, then 'subPaths' should be renamed
to 'includeSubPaths', for the sake of cleanliness and symmetry.
Also with such names I would expect the additional
paths to work *in addition to* normal depth.
I think I wasn't exact enough. I would expect 'includeSubPaths' to be
incompatible with both 'depth' and 'excludeSubPaths'. However, I would
expect 'depth' and 'excludeSubPaths' to be compatible, which basically
says: index using the default approach, stop at the max depth, but skip
the paths specified.
Given:
    class C {
        @IndexEmbedded
        private Collection<D> d;
        @Field
        private int foo;
    }
Illegal configuration: can't specify depth and includeSubPaths simultaneously:
    class A {
        @IndexEmbedded(
            includeSubPaths={"d.one", "d.two"}, depth=5
        )
        private C see;
    }
Illegal configuration: specifying both includeSubPaths and excludeSubPaths is nonsense,
since any path not listed in includeSubPaths won't be indexed anyway:
    class A {
        @IndexEmbedded(
            includeSubPaths={"d.one", "d.two"}, excludeSubPaths={"d.three"}
        )
        private C see;
    }
Valid configuration: excludes indexing of d. Maybe D leads to cycles, or expensive
nested joins, and it isn't used when searching index A, so we want to exclude it.
    class A {
        @IndexEmbedded( depth=5, excludeSubPaths={"d"} )
        private C see;
    }
Also, what validly constitutes a path is different for excludeSubPaths. Any point in a
'path' can be a termination point where the user says they don't want indexing to go
any further down that path, and that point could be as deep as a leaf.
'includeSubPaths', by contrast, must be composed of leaf nodes.
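The two validity rules could be sketched roughly like this. It's only an illustration
under my own assumptions: SubPathRules and its methods are hypothetical names, and the
model is reduced to the set of dotted leaf paths reachable from the embedded property.

```java
import java.util.Set;

// Hypothetical helper; not part of Hibernate Search.
class SubPathRules {
    // Every indexable leaf field reachable from the embedded property, as dotted paths.
    private final Set<String> leafPaths;

    SubPathRules(Set<String> leafPaths) {
        this.leafPaths = leafPaths;
    }

    // An include path must name a leaf field exactly.
    boolean isValidInclude(String path) {
        return leafPaths.contains(path);
    }

    // An exclude path may stop at any node along a known path:
    // an intermediate association or a leaf.
    boolean isValidExclude(String path) {
        if (leafPaths.contains(path)) {
            return true;
        }
        String prefix = path + ".";
        return leafPaths.stream().anyMatch(leaf -> leaf.startsWith(prefix));
    }
}
```

So with leaves d.one, d.two, d.three and foo, "d" is a valid exclude path but not a
valid include path, while "d.one" is valid for both.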
So to implement your original suggestion we should have thought of a
mapping algorithm which would use either the _depth_ approach or the
_subPaths_ approach, but you say that in practice you would apply them
both?
In this case if I wanted to use the subPaths strategy only I should
use depth=0 and then add what I want to add? Just checking if we're on
the same page.
No, that wasn't what I meant. I'd expect the annotation processing to basically
look like:
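    IndexEmbedded embeddedConfig = node.getAnnotation(IndexEmbedded.class);
    // Annotation attributes are never null: an unset array attribute is empty, and an
    // unset depth keeps its default (assumed here to be Integer.MAX_VALUE).
    boolean hasIncludes = embeddedConfig.includeSubPaths().length > 0;
    boolean hasExcludes = embeddedConfig.excludeSubPaths().length > 0;
    boolean hasDepth = embeddedConfig.depth() != Integer.MAX_VALUE;
    if (hasIncludes && (hasDepth || hasExcludes)) {
        throw new IllegalArgumentException("Invalid configuration: cannot specify "
            + "includeSubPaths together with depth or excludeSubPaths");
    }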
Hopefully it would be understood that if only includeSubPaths is provided, then the
default depth is irrelevant: the depth is expressed explicitly by each path.
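For anyone who wants to try the check in isolation, here is a self-contained sketch.
It is not the real annotation: IndexEmbeddedSketch is a stand-in I made up with the
proposed attribute names, an unset depth is assumed to be Integer.MAX_VALUE, and the
Strategy enum and owner classes exist only for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in for the proposed annotation; not the Hibernate Search one.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface IndexEmbeddedSketch {
    int depth() default Integer.MAX_VALUE; // assumed marker for "depth not set"
    String[] includeSubPaths() default {};
    String[] excludeSubPaths() default {};
}

class ValidOwner {
    @IndexEmbeddedSketch(depth = 5, excludeSubPaths = {"d"})
    Object see;
}

class IncludeOnlyOwner {
    @IndexEmbeddedSketch(includeSubPaths = {"d.one", "d.two"})
    Object see;
}

class IllegalOwner {
    @IndexEmbeddedSketch(includeSubPaths = {"d.one", "d.two"}, depth = 5)
    Object see;
}

class EmbeddedConfigValidator {
    enum Strategy { DEFAULT_DEPTH, DEPTH_WITH_EXCLUDES, INCLUDE_ONLY }

    // Validates the annotation on the named field and reports which approach it selects.
    static Strategy resolve(Class<?> owner, String fieldName) {
        IndexEmbeddedSketch cfg;
        try {
            cfg = owner.getDeclaredField(fieldName).getAnnotation(IndexEmbeddedSketch.class);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + fieldName, e);
        }
        if (cfg == null) {
            throw new IllegalArgumentException(fieldName + " is not annotated");
        }
        boolean hasIncludes = cfg.includeSubPaths().length > 0;
        boolean hasExcludes = cfg.excludeSubPaths().length > 0;
        boolean hasDepth = cfg.depth() != Integer.MAX_VALUE;
        // includeSubPaths excludes both other options; depth and excludes may combine.
        if (hasIncludes && (hasDepth || hasExcludes)) {
            throw new IllegalArgumentException("Invalid configuration: cannot specify "
                + "includeSubPaths together with depth or excludeSubPaths");
        }
        if (hasIncludes) {
            return Strategy.INCLUDE_ONLY;
        }
        return hasExcludes ? Strategy.DEPTH_WITH_EXCLUDES : Strategy.DEFAULT_DEPTH;
    }
}
```

One caveat with this sketch: because the default is used as the "not set" marker,
someone who explicitly writes depth=Integer.MAX_VALUE next to includeSubPaths would
slip through the check.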
Do you have a great example to support the more complex option? We
have to start somewhere, but the property names should be final and
the meaning should not change if we later want to add the
exclusions.
I think the complex option you thought I was implying was a mixed-bag approach, which
I'm not advocating. My only purpose in suggesting the 'exclude' option
is that if I have 100 properties I want to index for a particular entity, then listing 100
properties explicitly in 'includeSubPaths' could be laborious (and some might think
messy). Those 100 properties could be directly on the entity, or they could be reached
through associated entities. However, if I get those 100 properties via 'depth' instead,
I might end up with 1000 values indexed (mostly waste, and potentially
costly). In that case maybe those 900 other values come from a particular unused path, or
from a path I can prune a bit via 'excludeSubPaths'.
Overall I think the options of default approach, default + excludeSubPaths, or
includeSubPaths (but no default depth or excludes) give users 3 good options for how they
want to go about indexing, and they can choose the least painful approach for their
particular use case. Most cases are going to be simple, with only a very limited subset
of an entity's properties needed for search, and I'd probably go with
'includeSubPaths' most of the time in our particular object model.
Hope that clarifies things?
Zach