On 3 March 2016 at 15:11, Emmanuel Bernard <emmanuel@hibernate.org> wrote:
> On Thu 2016-03-03 14:19, Sanne Grinovero wrote:
> > Back to my Elasticsearch problem: I think the short-term solution
> > would be that we actually have to check for the index state after
> > having it created, and keep checking in a loop until some short
> > timeout expires.
> > -> https://hibernate.atlassian.net/browse/HSEARCH-2146
> Sounds like a reasonable first approach.
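Right - to make HSEARCH-2146 concrete, what I have in mind is roughly the
loop below. This is only a sketch against the raw _cluster/health REST
endpoint; in practice we'd go through the Jest client, and the index name,
timeout and polling interval are just placeholders:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class IndexStatusPoller {

        // Polls _cluster/health/<index> until the index reaches at least
        // "yellow" status or the given timeout expires.
        public static void waitForIndex(String baseUrl, String indexName, long timeoutMs)
                throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMs;
            while ( System.currentTimeMillis() < deadline ) {
                try {
                    URL url = new URL( baseUrl + "/_cluster/health/" + indexName );
                    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
                    try ( BufferedReader reader = new BufferedReader(
                            new InputStreamReader( connection.getInputStream(), "UTF-8" ) ) ) {
                        String body = reader.readLine();
                        // Crude check; a real implementation would parse the JSON properly
                        if ( body != null && ( body.contains( "\"status\":\"green\"" )
                                || body.contains( "\"status\":\"yellow\"" ) ) ) {
                            return; // index is usable
                        }
                    }
                    finally {
                        connection.disconnect();
                    }
                }
                catch (IOException e) {
                    // index not created yet, or node unreachable: retry until the deadline
                }
                Thread.sleep( 200 ); // polling interval, to be tuned
            }
            throw new RuntimeException( "Index '" + indexName + "' not ready within " + timeoutMs + " ms" );
        }
    }

Alternatively we might be able to lean on the wait_for_status and timeout
parameters of the health API and let Elasticsearch do the waiting
server-side, avoiding the client-side loop entirely.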
> >
> > But I don't think that's the right design to pursue in the longer
> > term, especially as we're not dealing with the fact that the index
> > might "downgrade" its state at any time after we started.
> >
> > I think we need to check the "cluster health" headers regularly and
> > monitor for acceptance of each command we send, keeping track of the
> > cluster's health and probably keeping a persistent queue around for the
> > operations which couldn't be applied yet.
> >
> > Our current design will throw a SearchException to the end user, which
> > I don't think is practical.
> Doesn't it call the error report API?
Not yet, but yes it should:
https://hibernate.atlassian.net/browse/HSEARCH-2151
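For reference, the hook I'm referring to is the existing ErrorHandler
contract; a custom implementation registered via the
hibernate.search.error_handler property would look roughly like this (a
sketch from memory of the 5.x SPI, the logging is obviously just a
placeholder):

    import org.hibernate.search.backend.LuceneWork;
    import org.hibernate.search.exception.ErrorContext;
    import org.hibernate.search.exception.ErrorHandler;

    // Sketch of a custom handler receiving backend failures instead of having
    // them only logged; registered via hibernate.search.error_handler.
    public class CollectingErrorHandler implements ErrorHandler {

        @Override
        public void handle(ErrorContext context) {
            // The works which could not be applied to the index
            for ( LuceneWork failedWork : context.getFailingOperations() ) {
                // Record the entity type and id so they can be re-indexed later
                System.err.println( "Failed to index " + failedWork.getEntityClass()
                        + "#" + failedWork.getId() );
            }
            System.err.println( "Root cause: " + context.getThrowable() );
        }

        @Override
        public void handleException(String errorMsg, Throwable exception) {
            System.err.println( errorMsg + ": " + exception );
        }
    }

The idea behind HSEARCH-2151 is to have the Elasticsearch backend route its
failures through this same hook.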
> The idea of a persistent queue opens up a lot of complexity, so I'm not
> sure that's where we want to go - besides the existing master/slave +
> JMS and whatever alternative we plan on implementing down the road.
> My point is to wonder what the user expectation is when the ES cluster
> goes down:
> 1. have HSearch be magical and keep data up in the air until the cluster
>    comes back up - if it ever does
> 2. have HSearch report on indexing errors so that one can take reindexing
>    actions
> 3. behave like a manual, user-implemented integration and pretend
>    distributed systems always work
> I think 2 is the practical approach.
I liked 1 better :)
But OK, I think we all agree that #2 is the goal for version 5.6.
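To spell out what "take reindexing actions" for option 2 means for a user
today: once the cluster is healthy again it boils down to running the
MassIndexer over the affected types, along these lines (entity type left as
a parameter, options illustrative):

    import org.hibernate.Session;
    import org.hibernate.search.FullTextSession;
    import org.hibernate.search.Search;

    public class ReindexAfterOutage {

        // Rebuilds the index for the given entity type once the cluster is
        // reachable again; purgeAllOnStart drops possibly stale documents first.
        public static void reindex(Session session, Class<?> entityType) throws InterruptedException {
            FullTextSession fullTextSession = Search.getFullTextSession( session );
            fullTextSession.createIndexer( entityType )
                    .purgeAllOnStart( true )
                    .startAndWait();
        }
    }

The interesting part for 5.6 is making sure the error report gives users
enough information to know which types actually need this.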
> > History might have shown that the current approach is fine with
> > Hibernate ORM, but:
> > - requirements and expectations evolve, people might soon expect more
> > - I suspect that with RDBMSs this is less of a need than with the
> >   new crop of dynamic, self-healing distributed systems we're dealing
> >   with.
> On a philosophical note, can a client expect to tolerate a
> schema-changing, temporarily unavailable server system at any time, for
> any length of time? I think that's what you expect HSearch to do, in a way.
> My answer is no, and the degraded mode requiring reindexing is an
> acceptable trade-off.
> But +1 to be more lenient at startup time and "wait a bit more than
> expected".
I don't think Hibernate Search should be able to handle anything you
can throw at it, but it should be able to deal with a "standard" outage
of a couple of servers.
Anyway, in the range from not handling any failure to handling them all,
I think expectations are moving higher than what we're handling today, so I
agree with Gunnar on needing to work on queues sometime soon;
to me pushing this complexity to the end users feels like cheating:
taking the glory of the integration merits but not being willing to
face the actual challenges.
The ErrorHandler is our way of "washing our hands" of it: it's a
good thing that at least we provide a hook, but I don't think it's
enough.
Personally I don't like JMS, as in my experience it's not easy for
Search code to help set up the queues correctly or even validate the
configuration; at least I haven't found a standard way to do so.
But yes, any kind of queue technology could work: maybe we should just
generate Camel events.
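To sketch what I mean by "generate Camel events": the error handler hook
could simply forward the failing works to a Camel endpoint and let users
route them to whatever queueing technology they already run (JMS, Kafka, a
plain file, ...). The endpoint URI and payload format below are made up:

    import org.apache.camel.CamelContext;
    import org.apache.camel.ProducerTemplate;
    import org.apache.camel.impl.DefaultCamelContext;
    import org.hibernate.search.backend.LuceneWork;
    import org.hibernate.search.exception.ErrorContext;
    import org.hibernate.search.exception.ErrorHandler;

    // Sketch: forward failed index operations as Camel messages so that
    // whatever route is attached to the endpoint decides where they end up.
    public class CamelForwardingErrorHandler implements ErrorHandler {

        private final ProducerTemplate producer;

        public CamelForwardingErrorHandler() throws Exception {
            CamelContext camelContext = new DefaultCamelContext();
            camelContext.start();
            this.producer = camelContext.createProducerTemplate();
        }

        @Override
        public void handle(ErrorContext context) {
            for ( LuceneWork failedWork : context.getFailingOperations() ) {
                // A real implementation would use a proper serializable payload
                producer.sendBody( "seda:hsearch-failed-works",
                        failedWork.getEntityClass().getName() + "#" + failedWork.getId() );
            }
        }

        @Override
        public void handleException(String errorMsg, Throwable exception) {
            producer.sendBody( "seda:hsearch-failed-works", errorMsg );
        }
    }

That keeps the queue setup and management outside of Search itself, which is
really the part I'd rather not own.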
> > Not least, I noticed that the Elasticsearch native client actually
> > enters the cluster as a member of it. That's similar to an Infinispan
> > client using zero-weight vs a REST client: Infinispan experts will
> > understand there are significantly different capabilities. I don't mean
> > to remove the JEST client approach, but we might want to study more of
> > the consequences of that, and help me define what we will consider an
> > acceptable trade-off while still using the REST client approach.
> What do you have in mind? As a layman, I'd say using the native Java
> client will be "better" performance-wise and could be the preferred
> approach.
Yes, it will likely have better performance, but it would bind us to a
specific version of Elasticsearch and its included Lucene version.
All of the current work is based on using REST calls over an
alternative client instead.
We might want to eventually revisit this, and by then I hope we'll
have a clearer idea of the benefits or new problems we'd get from being
part of the ES cluster as a participant node.
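For reference, this is roughly what "entering the cluster as a member"
looks like with the 2.x native API (cluster name illustrative); note how it
pulls in a full Elasticsearch node, and with it a specific Lucene version,
which is exactly the coupling we avoid with the REST/Jest approach:

    import org.elasticsearch.client.Client;
    import org.elasticsearch.node.Node;
    import org.elasticsearch.node.NodeBuilder;

    public class NativeClientExample {

        // Joins the cluster as a client-only node (hosts no data, not master
        // eligible) and returns a Client bound to it.
        public static Client joinCluster() {
            Node node = NodeBuilder.nodeBuilder()
                    .clusterName( "my-es-cluster" )
                    .client( true )
                    .data( false )
                    .node(); // builds and starts the node, joining the cluster
            return node.client();
        }
    }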