On Fri, 12 Aug 2011 00:02:09 +0200, Sanne Grinovero <sanne(a)hibernate.org> wrote:
I just read the document (nice doc! where did you find it?)
It was attached to the original Lucene issue related to faceting.
The commit requirements of this taxonomy index look like a mess, and
it also concerns me that it's totally impossible to remove stuff.
Yeah, there are quite a few rules about when to commit in relation to
the main index writer. Good thing there is Hibernate Search, which can
handle this for the user :-)
Personally I am surprised that they introduced this new taxonomy index.
Funnily enough, the actual indexed Documents also contain category (faceting)
information. Hence also the need for the DocumentBuilder. I am sure that
there are good reasons to introduce this new index, but I am surprised
Yes, generally the architecture supports it (as far as how we linked
all components), but both the backend and the ReaderProvider would
need a custom implementation; while it looks like the ReaderProvider
needs an additional API method, I think we can avoid it on the
I want to expose as little as possible of the underlying Lucene
For power users we might want to offer some way to access the
TaxonomyIndex/Reader directly. Not sure yet.
We will also need to extend the annotation side. Our approach allows
faceting on any un-tokenized field. In the Lucene case we need to know for
which fields we have to create faceting information. We could do this with an
optional parameter to @Field, or we could introduce a new @Faceted (or something
like that) annotation. Obviously Lucene goes a step further with category path
current faceting approach, but we don't have to extend our faceting DSL
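
To make the @Faceted idea a bit more concrete, here is a minimal sketch in plain Java. Everything in it is hypothetical: the @Faceted annotation, its name() attribute, the Car entity and the facetedFieldNames helper are made up for illustration and are not part of Hibernate Search or Lucene. It only shows how a metadata pass could find out for which fields faceting information has to be created.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FacetedSketch {

    // Hypothetical marker annotation; not an existing Hibernate Search API.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.FIELD, ElementType.METHOD})
    @interface Faceted {
        // Optional facet name; an empty string means "use the field name".
        String name() default "";
    }

    // Hypothetical entity used only to exercise the annotation.
    static class Car {
        @Faceted String make;
        @Faceted(name = "cc") int cubicCapacity;
        String description; // not faceted
    }

    // Collects the facet names declared on a class, sorted alphabetically
    // because reflection does not guarantee field order.
    static List<String> facetedFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            Faceted faceted = field.getAnnotation(Faceted.class);
            if (faceted != null) {
                names.add(faceted.name().isEmpty() ? field.getName() : faceted.name());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

An optional parameter on @Field would carry the same information; a separate annotation just keeps the faceting concern apart from the indexing options.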
Also, do you know what kind of data structure TaxonomyWriter and
TaxonomyReader expect? We'll need clustering for that too; hopefully it's
similar to a Map, or reuses the Directory API.
For clustering purposes I think we have to look at CategoryPath and how to
serialize it. It should be just a bunch of strings, but I haven't seen the
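
If a category path really is just a bunch of strings, serializing it could look roughly like the sketch below. It deliberately does not use Lucene's CategoryPath class; the CategoryPathCodec name and the wire format (a component count followed by each component written with writeUTF) are assumptions made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical codec: treats a category path as an ordered list of strings.
class CategoryPathCodec {

    // Writes the number of components, then each component in modified UTF-8.
    static byte[] toBytes(List<String> components) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(components.size());
            for (String component : components) {
                out.writeUTF(component);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the components back in the same order they were written.
    static List<String> fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            List<String> components = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                components.add(in.readUTF());
            }
            return components;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Writing the count first keeps the order of the components and avoids having to pick a delimiter character that might also appear inside a category name.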
It would have been nice to get this stuff into Search 4 as well, but of
course it depends on when the next version of Lucene (either 3.4 or 4) will be
released. A Hibernate Search 4 bundled with Hibernate Core 4 and Lucene 4 would
have been cool, but I don't think the timing will work out :-)
2011/8/11 Hardy Ferentschik <hibernate(a)ferentschik.de>:
> I was just reading the docs for the new Lucene faceting which makes use
> of a new index called taxonomy index. If we are going to use Lucene
> capabilities we have to make sure we can plug this into our current
> Reading the docs, I can see quite a few similarities between our
> terminology and theirs. That's good. However, the Lucene approach takes
> it much further.
> We might get a new candidate for serialization as well - CategoryPath.
> I uploaded the faceting API documentation to our shared dropbox
> directory. Have a look in case you are interested.