Hi Guillaume,
Awesome, great to see you making progress here! I need to take a
closer look, but some quick comments inline.
@Sanne, this reminds me of that pending PR of mine; Any chance you
spend some cycles reviewing and merging it? Guillaume's work is based
on it and it'll also serve as base of my own future work. Thanks!
2016-02-04 16:37 GMT+01:00 Guillaume Smet <guillaume.smet(a)gmail.com>:
Hi,
So the good news: most of the faceting tests now pass with ES.
I rebased the patch onto master.
https://github.com/gsmet/hibernate-search/commits/elastic-facet2
and for now only one commit:
https://github.com/gsmet/hibernate-search/commit/657ecfe684e6b4e952d11274...
It's not a PR yet as it's not feature complete and there are still (a few)
tests failing.
General infrastructure
================
- I introduced a JsonBuilder to make the gson API chainable: I think it's
more readable this way. I haven't updated everything to use it (especially
the Lucene Query -> Json part is not ported yet). Should I continue with
it? Please, please, say yes :).
It's a great example of how subjective readability is, because I don't
find it necessarily better than the previous version, it's just
different :) An actual disadvantage is the higher number of object
allocations due to the builders.
In the end I don't really care about it that much; how the JSON is
created is more a technical detail which we may change at any time. So
if you feel strongly about it, feel free to do it. But I would not
expose this builder and the usage of GSON through the public API so we
actually have the freedom of future changes.
- All the API is now based on JsonObject so you don't have to
play with
String -> Json -> String -> Json manipulation
---> would appreciate feedback on these 2 changes so that I can make
everything consistent
+1 very good that you avoided some forth-and-back cycles, that has
been bugging me, too.
- I introduced a ToElasticSearch class which translates
Lucene/HSearch
objects to Json. I moved the existing query translator there.
- I'm not sure building the exact same document we build for Lucene is the
way to go. I'm wondering if we should have a distinct document builder for
ES. I'll see how it works for facets not being fields.
Yes, keep in mind that the route through creating a Lucene Document
object is just temporary hack to have a PoC of integrating ES. For HS
6 we'll look into refactoring the required pieces and avoid that
intermediary step. Not sure how much sense it makes to spend cycles on
this right now, it'll be larger task for sure.
- I made a few fixes here and there: especially the indexing
didn't support
multiple values per field. I made something complicated, maybe we could
just index a JsonArray even for single values? see
https://github.com/gsmet/hibernate-search/commit/cb88b5790996442e137fab4a...
That's cool. Actually I had prepared that very thing locally, too, but
your impl is even nicer :) I wouldn't use arrays for single values,
the upgrade from single value to array if needed seems nicer.
Btw. that's one of the cases where going through the Lucene document
is limiting us. When having element collections of component types, we
cannot assemble the JSON array in such way that's ensured that each
array element represents one "row" from the collection. This is
because we don't know which of the Lucene fields belong together.
That'll change with the larger re-work in 6.
Also the case of root.components[0].foo, root.components[0].bar,
root.components[1].foo, root.components[1].bar where the array would
have to be at the level of "components" and not at the leave level.
Faceting
=======
- The querying/extraction should be OK
- ES doesn't support range bounds inclusion/exclusion for range faceting so
I used the query faceting to build the range faceting
- I have to work on the mapping as having multiple facets on the same field
is not supported yet: currently I use the indexed field to build the
facets, not specific fields
- I have some additional work to do on indexing related to the above matter
Various issues
===========
- A few tests don't pass:
. ElasticSearch returns facets results even if the field doesn't exist: not
sure we can do anything about that
. A test in the WebShopTest is failing as I don't support yet faceting
based on a non existing field
. CollectionUpdateEventTest is failing: this one is weird
- I had to work around an issue with the EsDateBridge and faceting (marked
as XXX in DocumentBuilderIndexedEntity): Dates are considered to be Long
and for ES, we want them to be Strings. Would appreciate a comment on that.
Couldn't you do the conversion in the ElasticSearchIndexWorkVisitor?
Comments and thoughts appreciated.
--
Guillaume
_______________________________________________
hibernate-dev mailing list
hibernate-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev