Thanks for the feedback Sanne!
On Mon, 28 Feb 2011 19:56:59 +0100, Sanne Grinovero <sanne(a)hibernate.org>
wrote:
I've been reading this focusing on the tests and the Collectors
implementation.
That's the most interesting part. There are a few unrelated cleanup and
testing
related commits which as you say I should probably squash. I probably
should go through
the commits tomorrow and sort things out a little. Maybe we could even
pull in
this initial cut to have a common ground to work from again.
Comments:
1) why is it named SimpleFacetRequest, are you planning for a more
advanced one?
SimpleFacetRequest is probably a bad name. Maybe ValueFacetRequest would
be better.
It is basically just asking for the different field values and their count
for the faceted
field. In contrast RangeFacetRequest groups counts into numeric ranges. As
mentioned
another subclass could be DateRangeFacetRequest
2) I couldn't find a test doing a faceting on more than one
field,
still it looks like the Collector and FieldCacheContainer
are prepared to deal with that.
It depends what you mean. I think there should be a test with multiple
in-depended
facets, eg facet on engine size and color in the car example.
There is also the use case pivoting where you set multiple fields into
relation to
each other. For example you facet on engine size and within that you facet
further on color.
I haven't implemented this usecase yet. Not sure if we should add this
feature right away.
while working on the FieldCache support I also started with similar
Map<Field, containers>, but then I realized that
the number of fields I'd work on is definitely limited, so I removed
all Map lookups (and all puts and foreach's) to support
only a single field per collector instance, so you can specialize the
single instance and eventually chain them up when
more than a field is being requested.
Hmm, that's an interesting idea. Instead of using maps you basically have
one collector per
facet request. This idea is definitely worth exploring. It might even
clean out some of the code.
Do you have any experience what this really brings runtime wise? But I
will definitely explore your idea.
In case you support faceting on multiple facets/fields, how are you
going to specify the sort order?
I'm not suggesting you should, I'd rather remove support for it.
Not sure whether you are talking about pivoting here. In the case you are
using multiple in-dependent
facet requests, each facet result is ordered by itself. I think having a
sort order is important, at least
you should be able to either sort by count or by value
Are you relying on fieldCache only, or do you have an alternative way
too? I'm just wondering if alternatives are possible, as fieldCaches seem
very expensive memory-wise.
Yes, the current implementation is based on field caches. As you say, the
main concern is memory in this cases.
There is an alternative I haven't explored yet working with TermEnum and
doc sets intersections.
Wouldn't it make sense to have a
fullTextSession.createFacetingQuery( LuceneQuery, facetName, targetTypes
);
?
At least people won't need to cast the return type to your special
container.
Not sure if I follow here. Faceting is for me more like filtering. You en-
and disable
facets before running the query. You still need the main result list, but
also a way to access the
facet results.
> Map<String, FacetResult> results =
query.getFacetResults();
> FacetResult facetResult = results.get( "foo" );
> List<Facet> facetList = facetResult.getFacets();
> assertEquals( "Wrong facet count for facet ", 100, facetList.get( 0
> ).getCount() );
Looks great.
Would it be possible (in future maybe) to return managed entities /
selected projection instead of their frequency only?
That's an awesome idea. This feature could set our faceting apart from the
just count based approach
I don't think we should do that for next Alpha, but it would be
great
to have lazily loaded proxies of the contents of each facet, as
usability I'd expect people to "expand" on the facet to show the
actual results,
which are likely going to be the usual entities, or some projection.
One thing I want to add is the ability to pass in the current query into a
Facet and get returned a BooleanQuery
which combines the two.
Nice.
Do you think we need the user to bother for the field type? don't you
have it on the FieldBridge?
I'm wondering, as I'm stuck at this point with my own patches :)
Right. It's on my todo list. In fact I am missing at the moment the meta
data api
we have been talking about to "reflect" on the field configuration. I need
to check what is possible atm.
> Would we keep the programmatic configuration as a public API?
> * I made the FacetRequest classes immutable atm, but this way I have a
> multitude of constructors catering for a whole range of parameters
> (sort order, include zero counts, ...). Any opinions around immutable
> objects vs objects with setters for configuring options after creation.
I usually favour immutable objects only when creating them is (very)
expensive and so you'd like to reuse them in other threads having
different parameters.
Where you thinking of a FacetRequest
as something "define once, run multiple times", like a named query?
right. I think I lean towards mutable request objects in this case. I
wanted to see where thigs
are going with this immutable request object, but I think it is not worth
it in this case
I'd give priority to the performance of the Collector, if you
have to
choose.
Sure
Thanks again,
--Hardy