[hibernate-dev] Connecting to a resource which isn't ready yet (or not ready anymore)

Gunnar Morling gunnar at hibernate.org
Thu Mar 3 09:55:40 EST 2016


Hi,

I think waiting until the index has become available after creation is
fine for the time being. I'd wait and see what practical experience
looks like, i.e. how long it takes in practice to create indexes
with a realistic number of shards and replicas.

Also we discussed creating a separate tool akin to "schema creator",
which users can run whenever they like. So I think we have some
options here.

Regarding the coming and going of the cluster at runtime, I am not so
concerned. Again, I'd wait and see how realistic a problem that is. I
doubt it's a huge problem in practice; otherwise it'd render the
(synchronous) API useless to begin with. Sure, one can argue that
synchronous interfaces are not appropriate for any system-to-system
communication. Actually, I have been arguing that for a long time ;)

But that's exactly why I think it's great to combine the ES backend
with the JMS worker and a persistent queue: Messages can be
reprocessed whenever the cluster is back. That's a tool we already
provide and users can make use of it if they like. Of course we can
provide workers based on different techs (Kafka, AMQP, you name it),
but I don't think another general mechanism is needed. That's one
great advantage of using our integration IMO.

Regarding REST vs. native, I'd also wait for actual experience.
Native is not an option for non-Java apps, so I have a hard time
believing REST will be prohibitively slow. If we do proper bulking it
might be good enough.

--Gunnar


2016-03-03 15:19 GMT+01:00 Sanne Grinovero <sanne at hibernate.org>:
> My question here was triggered by a specific case in Hibernate Search
> but it applies well to ORM's datasources, caches, and very much to OGM
> as well.
>
> When creating an index on Elasticsearch, the index is not
> "instantaneously" ready.
> The REST request creates the definition, but the response will only
> tell whether the request to create it was accepted. Elasticsearch will
> then start the creation process and gradually upgrade the index status
> to "yellow" and finally "green", depending on its ability to quickly
> propagate the needed changes across the cluster; it might even fail
> and end up with status "red".
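The gap Sanne describes between "creation accepted" and "index ready" can be made concrete with a small sketch. The class and method names below are purely illustrative (not Hibernate Search API); the underlying REST endpoints (`PUT /<index>` and `GET /_cluster/health/<index>`) are standard Elasticsearch ones, and the JSON matching is deliberately naive:

```java
// Sketch only: illustrates the difference between a create-index request
// being *accepted* and the index actually being *usable*.
public class IndexCreationSketch {

    // A PUT /<index> response such as {"acknowledged":true} only means the
    // master node accepted the request, not that shards are allocated.
    public static boolean wasAccepted(String createIndexResponseBody) {
        return createIndexResponseBody.replace(" ", "")
                .contains("\"acknowledged\":true");
    }

    // Readiness must be read from the cluster health API instead: "yellow"
    // means the primary shards are allocated, "green" means replicas are too,
    // "red" means at least one primary is not available.
    public static boolean isUsable(String healthStatus) {
        return "yellow".equals(healthStatus) || "green".equals(healthStatus);
    }
}
```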
>
> Our current approach is:
>  - send a request to define the index
>  - start using it
>
> This probably follows our traditional pattern with static
> systems, but it becomes a naive approach in this modern world of
> dynamic services.
> Our approach works *almost* fine on the single nodes which we're
> using for testing, but it's not suited for a real cluster.
>
> Even in our integration tests, it might happen that we didn't give it
> enough time to boot.
>
> Someone else asked me recently how to make Hibernate ORM not fail to
> boot when he's starting VMs containing the database and the
> application in parallel, or in no specific order: sometimes ORM
> would attempt to connect before the RDBMS had started; all he
> needed was to have ORM stall and wait some seconds.
> Often even starting the RDBMS VM first isn't good enough, as the VM
> reports "started" but the DB might still need to finish some
> maintenance tasks. Kubernetes provides hooks to check whether
> services are actually ready, but people seem to expect that
> Hibernate could deal with some basics too.
>
> In that case AFAIR my suggestion was that this could be solved as an
> implementation detail of the datasource / connection pool.
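That "implementation detail of the datasource / connection pool" idea amounts to retrying resource acquisition until a deadline instead of failing on the first attempt. A minimal sketch, with purely illustrative names (this is not an actual Hibernate ORM API):

```java
import java.util.concurrent.Callable;

// Sketch: retry acquiring a resource (e.g. a JDBC connection) until it
// succeeds or a deadline passes, so that booting the app and the database
// in parallel does not fail on the first connection attempt.
public class StartupRetry {

    public static <T> T acquireWithRetry(Callable<T> acquire,
                                         long timeoutMillis,
                                         long retryIntervalMillis)
            throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        Exception last = null;
        while (System.currentTimeMillis() < deadline) {
            try {
                return acquire.call(); // e.g. dataSource.getConnection()
            }
            catch (Exception e) {
                last = e; // resource not up yet: remember failure and wait
                Thread.sleep(retryIntervalMillis);
            }
        }
        // Deadline passed: rethrow the last failure seen.
        throw last != null ? last : new IllegalStateException("timed out");
    }
}
```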
>
> Back to my Elasticsearch problem: I think the short-term solution
> is that we actually have to check the index state after
> creating it, and keep checking in a loop until some short
> timeout expires.
>  -> https://hibernate.atlassian.net/browse/HSEARCH-2146
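The short-term fix sketched above maps onto Elasticsearch's own health API, which can do the waiting server-side: `GET /_cluster/health/<index>?wait_for_status=yellow&timeout=...` blocks until the index reaches at least "yellow" or the timeout expires. The endpoint and its parameters are real Elasticsearch REST API; the surrounding class and its crude response parsing are illustrative only:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch of the HSEARCH-2146 idea: after creating the index, block on the
// cluster health API until the index is at least "yellow" or a short
// timeout expires.
public class WaitForIndex {

    // Builds the health URL; Elasticsearch itself does the waiting and
    // returns early once the requested status is reached, or after the
    // timeout with "timed_out": true in the response body.
    public static String healthUrl(String baseUrl, String indexName,
                                   int timeoutSeconds) {
        return baseUrl + "/_cluster/health/" + indexName
                + "?wait_for_status=yellow&timeout=" + timeoutSeconds + "s";
    }

    public static boolean waitForYellow(String baseUrl, String indexName,
                                        int timeoutSeconds) throws IOException {
        URL url = new URL(healthUrl(baseUrl, indexName, timeoutSeconds));
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        try (InputStream in = connection.getInputStream()) {
            byte[] body = new byte[8192];
            int read = in.read(body);
            String response = new String(body, 0, Math.max(read, 0), "UTF-8");
            // "timed_out":false means the requested status was reached in
            // time; parsing is kept deliberately naive for this sketch.
            return connection.getResponseCode() == 200
                    && response.contains("\"timed_out\":false");
        }
    }
}
```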
>
> But I don't think that's the right design to pursue in the longer
> term, especially as it doesn't deal with the fact that the index
> might "downgrade" its state at any time after we started.
>
> I think we need to check the "cluster health" headers regularly,
> monitor the acceptance of each command we send, keep track of the
> cluster's health, and probably keep a persistent queue around for the
> operations which couldn't be applied yet.
>
> Our current design will throw a SearchException at the end user, which
> I don't think is practical.
>
> History might have shown that the current approach is fine with
> Hibernate ORM, but:
>  - requirements and expectations evolve; people might soon expect more
>  - I suspect that with RDBMSs this is less of a need than with the
> new crop of dynamic, self-healing distributed systems we're dealing
> with.
>
> Not least, I noticed that the Elasticsearch native client actually
> joins the cluster as a member of it. That's similar to an Infinispan
> client using zero-weight vs. a REST client; Infinispan experts will
> understand these imply significantly different capabilities. I don't
> mean to remove the JEST client approach, but we might want to study
> the consequences of that some more; please help me define what we
> will consider an acceptable tradeoff while still using the REST
> client approach.
>
> Thanks for any thoughts,
> Sanne
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev
