[hibernate-dev] Introducing a new API in Hibernate Search to "sign" HTTP requests to Elasticsearch

Yoann Rodiere yoann at hibernate.org
Fri Jun 2 08:50:47 EDT 2017


I made a mistake, Header is an Apache HTTP Client type and FailureListener
depends on one. So we're down to one useful customization, which we should
probably support as an Hibernate Search option by the way:

   - org.elasticsearch.client.RestClientBuilder.setPathPrefix(String)


Yoann Rodière
Hibernate NoORM Team
yoann at hibernate.org

On 2 June 2017 at 14:47, Yoann Rodiere <yoann at hibernate.org> wrote:

> > There's no exposure of HTTP Client in this SPI. Yes, if people need to
> > customize the HTTP client to be used by the returned RestClient
> > instance, they'll naturally depend on that. But this SPI isn't tied to
> > such detail of how RestClient works - if ES folks decided to use
> > OkHttp instead, our SPI contract won't be affected (of course user's
> > implementations will need to change if they were customizing the HTTP
> > client before).
>
> I would feel dishonest arguing this to users... Frankly, there's little
> point in returning a custom RestClient without customizing stuff related to
> Apache HTTP Client. See RestClientBuilder: apart from methods tied to
> Apache HTTP Client, and from options already provided by Hibernate Search,
> you only have access to these:
>
>    - org.elasticsearch.client.RestClientBuilder.
>    setDefaultHeaders(Header[])
>    - org.elasticsearch.client.RestClientBuilder.setFailureListener(
>    FailureListener)
>    - org.elasticsearch.client.RestClientBuilder.setPathPrefix(String)
>
> ... and that's all.
>
> So yes, this SPI doesn't have a *direct* dependency to Apache HTTP
> Client, but any practical use of it will have one. At the end of the day,
> that's what really matters, right?
>
> Yoann Rodière
> Hibernate NoORM Team
> yoann at hibernate.org
>
> On 2 June 2017 at 14:25, Gunnar Morling <gunnar at hibernate.org> wrote:
>
>> 2017-06-02 12:49 GMT+02:00 Yoann Rodiere <yoann at hibernate.org>:
>> >> There's an important difference: one exposes Apache HTTP client in
>> >> HSEARCH's SPI, whereas the other just requires usage of Apache HTTP
>> >> client within one specific implementation. For users it doesn't change
>> >> much, but the latter is cleaner from HSEARCH's perspective.
>> >
>> > RestClient is not an interface, it's an implementation. There's no
>> > interface. So yes, we would just be exposing Apache HTTP Client on top
>> of
>> > RestClient.
>> >
>>
>> Here's what I'd have done:
>>
>>     package org.hibernate.search.elasticsearch.client.spi;
>>
>>     public interface RestClientFactory  {
>>         RestClient buildRestClient(SomeContext ctx);
>>     }
>>
>> There's no exposure of HTTP Client in this SPI. Yes, if people need to
>> customize the HTTP client to be used by the returned RestClient
>> instance, they'll naturally depend on that. But this SPI isn't tied to
>> such detail of how RestClient works - if ES folks decided to use
>> OkHttp instead, our SPI contract won't be affected (of course user's
>> implementations will need to change if they were customizing the HTTP
>> client before).
>>
>> >> I'm not quite following on that. If people are in control of the
>> >> RestClient entirely, they can do whatever they want?
>> >
>> > As mentioned above, exposing RestClient is even worse than just
>> exposing the
>> > Apache HTTP client.
>>
>> I don't think it's worse, it's better actually. It just exposes our
>> direct dependency in the SPI and not any of its details.
>>
>> > So I was suggesting to use a proper abstraction, namely our own
>> interface,
>> > org.hibernate.search.elasticsearch.client.impl.ElasticsearchClient.
>> >
>> > I suppose your remark concerned this paragraph:
>> >
>> >> On top of that, this solution would not allow multiple third-party
>> >> customizations to work well together (for instance AWS authentication
>> >> provided by us or a third party + some performance tweaking by the
>> >> user)...
>> >> which is something the SPI we're planning in the PR could allow.
>> >
>> > If we did allow users to re-implement ElasticsearchClient, yes, one
>> > implementor would be able to do whatever he wants... if he
>> re-implements it.
>> > He wouldn't be able to re-use other extensions.
>> > Take for example the AWS authentication. You're suggestion that we
>> provide
>> > an alternative client allowing to connect to AWS. Fine, we do that, and
>> > users can use it. But what if a users wants AWS authentication and say,
>> > configure a proxy? Then he can't reuse our AWS client, since this
>> client is
>> > just an implementation of ElasticsearchClient, and we don't want to
>> expose
>> > anything related to Apache HTTP Client. So he must implement AWS
>> > authentication. Just to configure a proxy.
>> >
>> > The thing is, there are tons of things a user may want to do with Apache
>> > HTTP Client, and we can't possibly provide access to each and every
>> option
>> > through an abstraction layer.
>>
>> Right, hence I wouldn't bother to do that in the first place. Just let
>> users customize how RestClient is instantiated. Let's them do all they
>> want. We can provide examples which show how to do the AWS signing
>> etc.
>>
>> > So at some point, if we want to allow
>> > configuration (and in the case of an HTTP client, I'm afraid we have
>> to),
>> > we'll have to expose internals. We just have to make sure this is done
>> in a
>> > controlled way (expose as little as possible).
>> >
>> >
>> >
>> > Yoann Rodière
>> > Hibernate NoORM Team
>> > yoann at hibernate.org
>> >
>> > On 2 June 2017 at 12:26, Gunnar Morling <gunnar at hibernate.org> wrote:
>> >>
>> >> 2017-06-02 11:58 GMT+02:00 Yoann Rodiere <yoann at hibernate.org>:
>> >> >> Did you instead consider to just let users provide their custom
>> >> >> instance of org.elasticsearch.client.RestClient? It's still
>> leaking an
>> >> >> implementation detail of Hibernate Search, but at least it's one
>> >> >> indirection less.
>> >> >
>> >> >
>> >> > The only way to add AWS authentication to the RestClient is to use
>> >> > Apache
>> >> > HTTP Client classes, so this solution would still bind users to the
>> >> > implementation details. We'd just shove it under the carpet :)
>> >>
>> >> There's an important difference: one exposes Apache HTTP client in
>> >> HSEARCH's SPI, whereas the other just requires usage of Apache HTTP
>> >> client within one specific implementation. For users it doesn't change
>> >> much, but the latter is cleaner from HSEARCH's perspective.
>> >> >
>> >> > Also, I'd argue this would be an even bigger implementation leak: at
>> >> > least
>> >> > with the current solution with can switch to an alternative
>> >> > Elasticsearch
>> >> > client, as long as we still use Apache HTTP Client. If we expose
>> >> > RestClient,
>> >> > we're stuck with it and whatever underlying technologies it uses.
>> >> >
>> >> > On the other hand, we could ask them to re-implement
>> >> > org.hibernate.search.elasticsearch.client.impl.ElasticsearchClient,
>> >> > which is
>> >> > our own façade over the Elasticsearch client. But I find it a bit
>> much
>> >> > just
>> >> > to tweak some settings or to add a new authentication scheme...
>> >> > On top of that, this solution would not allow multiple third-party
>> >> > customizations to work well together (for instance AWS authentication
>> >> > provided by us or a third party + some performance tweaking by the
>> >> > user)...
>> >> > which is something the SPI we're planning in the PR could allow.
>> >>
>> >> I'm not quite following on that. If people are in control of the
>> >> RestClient entirely, they can do whatever they want?
>> >>
>> >> >
>> >> > Yoann Rodière
>> >> > Hibernate NoORM Team
>> >> > yoann at hibernate.org
>> >> >
>> >> > On 2 June 2017 at 10:30, Sanne Grinovero <sanne at hibernate.org>
>> wrote:
>> >> >>
>> >> >> On Fri, 2 Jun 2017, 08:56 Gunnar Morling, <gunnar at hibernate.org>
>> wrote:
>> >> >>
>> >> >> > Hi,
>> >> >> >
>> >> >> > I find the exposure of an implementation detail (usage of Apache
>> HTTP
>> >> >> > client) of the Elasticsearch client a bit problematic. If they
>> change
>> >> >> > this to another HTTP client, our SPI would break.
>> >> >> >
>> >> >>
>> >> >> Yes the very point of exposing that detail is the reason for this
>> >> >> thread.
>> >> >>
>> >> >> Still, our SPI being guaranteed only for a minor, that gives a lot
>> of
>> >> >> flexibility?
>> >> >>
>> >> >> The Elasticsearch client exposing this itself, I don't expect them
>> to
>> >> >> switch implementation in a micro release to make some bugfix. If
>> they
>> >> >> change it in a major or even minor version, we're ok to not support
>> >> >> that
>> >> >> version until our next minor.
>> >> >>
>> >> >>
>> >> >>
>> >> >> > Did you instead consider to just let users provide their custom
>> >> >> > instance of org.elasticsearch.client.RestClient? It's still
>> leaking
>> >> >> > an
>> >> >> > implementation detail of Hibernate Search, but at least it's one
>> >> >> > indirection less.
>> >> >> >
>> >> >> > People wishing to have a custom RestClient would have to
>> implement a
>> >> >> > few more bits themselves (the logic from
>> >> >> > DefaultElasticsearchClientFactory#customizeHttpClientConfig()),
>> but
>> >> >> > I'd find that acceptable for the sake of a less detail-exposing
>> SPI,
>> >> >> > plus it grants more flexibility in terms of configuring the
>> >> >> > RestClient.
>> >> >> >
>> >> >>
>> >> >>
>> >> >> N.B. The client factory already is a Service so any advanced user
>> >> >> already
>> >> >> can override it.
>> >> >>
>> >> >> We want to make it easier for cloud users. Focusing on AWS now as
>> we've
>> >> >> had
>> >> >> user requests for this - not least our own CI - but I'd expect other
>> >> >> clouds
>> >> >> to have similar features (today or tomorrow). I just don't expect
>> other
>> >> >> use
>> >> >> cases to need this so we might provide them all eventually, but at
>> this
>> >> >> point my goal is to leave an appealing SPI for contributors to join
>> on
>> >> >> that.
>> >> >>
>> >> >> With that I mean:
>> >> >> 1# this might evolve but we need something simple to use for people
>> to
>> >> >> not
>> >> >> get stuck.
>> >> >> 2# I expect integrator implementors to contribute them back
>> >> >> 3# People won't have this low level dependency in their projects for
>> >> >> long
>> >> >>
>> >> >> Having them re-implement the client wouldn't encourage this ;)
>> >> >>
>> >> >> Thanks!
>> >> >> Sanne
>> >> >>
>> >> >>
>> >> >> > --Gunnar
>> >> >> >
>> >> >> >
>> >> >> > 2017-06-01 19:11 GMT+02:00 Sanne Grinovero <sanne at hibernate.org>:
>> >> >> > > Yoann has been working on allowing Hibernate Search users to use
>> >> >> > > Elasticsearch on AWS.
>> >> >> > >
>> >> >> > > Specifically on AWS the Elasticsearch security can be
>> configured to
>> >> >> > > use application identities, which implies the requests need to
>> be
>> >> >> > > signed.
>> >> >> > >
>> >> >> > > A good background read can be found here [1].
>> >> >> > >
>> >> >> > > We planned to allow people to use this but were not planning to
>> >> >> > > include AWS specific libraries as dependencies - but since Yoann
>> >> >> > > implemented an actual AWS signer in the tests I suppose it
>> would be
>> >> >> > > selfish to not ship it..
>> >> >> > >
>> >> >> > > Please see the API proposal on github (with the PR):
>> >> >> > >  - https://github.com/hibernate/hibernate-search/pull/1426
>> >> >> > >
>> >> >> > > Thanks,
>> >> >> > > Sanne
>> >> >> > >
>> >> >> > > [1] -
>> >> >> >
>> >> >> >
>> >> >> > https://aws.amazon.com/blogs/security/how-to-control-access-
>> to-your-amazon-elasticsearch-service-domain/
>> >> >> > > _______________________________________________
>> >> >> > > hibernate-dev mailing list
>> >> >> > > hibernate-dev at lists.jboss.org
>> >> >> > > https://lists.jboss.org/mailman/listinfo/hibernate-dev
>> >> >> >
>> >> >> _______________________________________________
>> >> >> hibernate-dev mailing list
>> >> >> hibernate-dev at lists.jboss.org
>> >> >> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>> >> >
>> >> >
>> >
>> >
>>
>
>


More information about the hibernate-dev mailing list