[hibernate-dev] [HSearch] Faceting feedback

Sanne Grinovero sanne at hibernate.org
Thu Mar 10 12:48:33 EST 2011


2011/3/1 Emmanuel Bernard <emmanuel at hibernate.org>:
> Hi Hardy,
> Nice job.
>
> Here are a few comments in random order:
>
> To avoid the problem of Constructor multiplications and still use immutable objects use a builder to collect the information and create the object out of it. The constructor can even be package private
>
> My initial reaction was that facet would be an awesome declarative feature like analyzers at least for simple use case (annotations + programmatic mapping API). So +1 for the feature. It seems though that some people would want a very dynamic way to configure their facet so keeping a programmatic API seems to make sense as well. I am not a expert in faceting so feel free to correct me. If we are not sure, let's start with the pure declarative approach and expand to the programmatic API later.
>
> In tests, we should try and use the Query DSL instead of raw lucene queries: we try to convert people to this nicer approach and it shows us how well or abd the rest of the API is when integrated.
>
> To be consistent, SimpleFacetRequest should be named DiscreteFacetRequest to be symmetric with RangeFacetRequest.
> Sanne was mentioning the idea of lazy results. Be careful if this laziness means keeping some resources open, we usually don't do that (except when using scroll).

I don't mean to keep the index resources open, you can get all
EntityInfos you need eagerly, build the faceting map, and close the
indexSearcher, but then the map implementation you return can load the
entities contained in the values on demand instead of loading all of
them at once.

Also about the default return type of a faceting query, I'd see it
very close the the standard query:
the list() API on Session returns a List of managed entities, the
faceting API returns a Map of managed entities.

In alternative, as the projection options might be enabled, same as
the list() API returns a list of arrays, the faceting returns a Map
having each key one of the requested facets, and each value a list of
arrays (same as list())

So in my opinion the default behaviour of a faceting Query should be
to return managed entities as usual, just being organized in a Map
structure instead than the usual List.

Cheers,
Sanne

>
> An integration with the QueryDSL would be awesome and a much nicer programmatic API. It's a shame e can't add payload to lucene queries to integrate that further.
>
> FacetRequest request =
> carBuilder
>  .facet()
>    .named("prices")
>    .range() //or should it be range(Integer.class)
>    .onField("price")
>    .from(0).to(1000).excludeLimit()
>    .from(1000).to(1500).excludeLimit()
>    .above(1500)
>    .orderedBy(FIELD_VALUE);
>
> query.enableFacet(request);
>
> something like that.
>
> BTW no relation to facet but, we could change QueryBuilder to QueryBuilder<T> (as in QueryBuilder<Car>). We don't need the info but that might help people to avoid mixing queryBuilders and query results by accident.
>
> I'd put the name in the request object and add remove 'query' in enableQueryFacet.
>
> I am getting a bit worried about these APIs that are used under certain conditions only like getFacetResults(). I wonder if we could do something nicer. I have no real solution, maybe via composition or via the returned object of enableFacet?
>
> That's all for now :)
>
> On 28 févr. 2011, at 16:13, Hardy Ferentschik wrote:
>
>>
>> Hi,
>>
>> I thought it would be great to get some feedback on my faceting work.
>> You can see the latest on my Search fork -
>> https://github.com/hferentschik/hibernate-search/commits/HSEARCH-667
>>
>> Technically I decided to use a simple custom Collector. I abandoned the
>> idea for using bobo browse, since it did
>> not really seem to fit our architecture and I am not sure how well
>> maintained the code is.
>> Within the custom Collector I am using Lucene's FieldCache to cache and
>> collect the count values during facting
>> (as a reminder, faceting for example means that I am searching for all
>> cars or a certain make, but then also want
>> to group the matching cars into their cubic capacity. See also -
>> http://en.wikipedia.org/wiki/Faceted_search)
>> Using the FieldCache is quite memory costly, but there are other ways to
>> implement the faceting itself.
>>
>> At the moment I am mostly interested in the feedback around the public
>> API. The public classes can be found in
>> the package org.hibernate.search.query.facet -
>> https://github.com/hferentschik/hibernate-search/tree/3a9877e2bbc47a8bd6e438dbbab56d93ac9101fd/hibernate-search/src/main/java/org/hibernate/search/query/facet
>>
>> The idea is to write a fulltext query as usual and then add/enable a facet:
>>
>> FacetRequest request = new SimpleFacetRequest( indexFieldName,
>> FacetSortOrder.COUNT_DESC, false );
>> TermQuery term = new TermQuery( new Term( "make", "honda" ) );
>> FullTextQuery query = fullTextSession.createFullTextQuery( term, Car.class
>> );
>> query.enableQueryFacet( "foo", request );
>>
>> Then you run the query. This will enable the facet collector and after the
>> query executed you get access to a map
>> which maps FacetResults to the facet name. Each FacetResult contains a
>> list of Facets which contain the actually
>> field values and counts:
>>
>> Map<String, FacetResult> results = query.getFacetResults();
>> FacetResult facetResult = results.get( "foo" );
>> List<Facet> facetList = facetResult.getFacets();
>> assertEquals( "Wrong facet count for facet ", 100, facetList.get( 0
>> ).getCount() );
>>
>> More actual tests can be found here -
>> https://github.com/hferentschik/hibernate-search/tree/3a9877e2bbc47a8bd6e438dbbab56d93ac9101fd/hibernate-search/src/test/java/org/hibernate/search/test/query/facet
>>
>> At the moment you are able to facet on simple (string) based values or on
>> number ranges (eg price ranges 0 - 100, ...). For that I have created
>> subclasses of
>> FacetRequest - SimpleFacetRequest and RangeFacetRequest (a
>> DateRangeFacetRequest might be interesting as well)
>>
>> Some concrete questions:
>> * Atm, I am only exposing a programmatic API for creating FacetRequests. I
>> guess we want to have annotations for this as well, right?
>>   Would we keep the programmatic configuration as a public API?
>> * I made the FacetRequest classes immutable atm, but this way I have a
>> multitude of constructors catering for a whole range of parameters
>>   (sort order, include zero counts, ...). Any opinions around immutable
>> objects vs objects with setters for configuring options after creation.
>>
>> If course I am interested in any other feedback as well.
>>
>> --Hardy
>>
>>
>>
>>
>> _______________________________________________
>> hibernate-dev mailing list
>> hibernate-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>
>
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>




More information about the hibernate-dev mailing list