[hibernate-dev] [Search] ElasticSearch - Progress on faceting
guillaume.smet at gmail.com
Fri Feb 5 05:33:19 EST 2016
On Thu, Feb 4, 2016 at 7:01 PM, Gunnar Morling <gunnar at hibernate.org> wrote:
> > - I introduced a JsonBuilder to make the gson API chainable: I think it's
> > more readable this way. I haven't updated everything to use it
> > the Lucene Query -> Json part is not ported yet). Should I continue with
> > it? Please, please, say yes :).
> It's a great example of how subjective readability is, because I don't
> find it necessarily better than the previous version, it's just
> different :) An actual disadvantage is the higher number of object
> allocations due to the builders.
> In the end I don't really care about it that much; how the JSON is
> created is more a technical detail which we may change at any time. So
> if you feel strongly about it, feel free to do it. But I would not
> expose this builder and the usage of GSON through the public API so we
> actually have the freedom of future changes.
Maybe readability wasn't the best term. "Writeability" is better as what I
really like about it is the capacity to write the code as I would write the
JSON instead of writing it in the reverse way when I have to nest objects.
> > - I introduced a ToElasticSearch class which translates Lucene/HSearch
> > objects to Json. I moved the existing query translator there.
> > - I'm not sure building the exact same document we build for Lucene is
> > way to go. I'm wondering if we should have a distinct document builder
> > ES. I'll see how it works for facets not being fields.
> Yes, keep in mind that the route through creating a Lucene Document
> object is just temporary hack to have a PoC of integrating ES. For HS
> 6 we'll look into refactoring the required pieces and avoid that
> intermediary step. Not sure how much sense it makes to spend cycles on
> this right now, it'll be larger task for sure.
OK. We're on the same page.
I think we won't be able to support the WebShopTest without reworking that.
We have this:
@Facet(name = "cubicCapacity", forField = "cubicCapacity",
encoding = FacetEncodingType.STRING),
@Facet(name = "cubicCapacity_Numeric", forField =
"cubicCapacity", encoding = FacetEncodingType.LONG)
The original Lucene document would probably allow to do something about it
but we lose a lot of information due to the call to doc =
facetConfig.build( doc ) which applies the Lucene facet magic, removes
interesting information and adds less qualified information.
> > - I had to work around an issue with the EsDateBridge and faceting
> > as XXX in DocumentBuilderIndexedEntity): Dates are considered to be Long
> > and for ES, we want them to be Strings. Would appreciate a comment on
> Couldn't you do the conversion in the ElasticSearchIndexWorkVisitor?
Not really. The preexisting code was failing with a class cast exception.
That's why I had to fix it this way. I think the indexing type (STRING,
LONG) should be determined per backend instead of being global.
Should we create a JIRA issue to centralize all the API challenges related
to ES for 6?
More information about the hibernate-dev