[hibernate-dev] Introducing a new API in Hibernate Search to "sign" HTTP requests to Elasticsearch

Yoann Rodiere yoann at hibernate.org
Fri Jun 2 09:21:45 EDT 2017


> Anyways, I've had my say :)

And I'm grateful you took the time to do it. I guess we both would have
preferred to agree on the same solution, but... at least we now have
something to process.

Yoann Rodière
Hibernate NoORM Team
yoann at hibernate.org

On 2 June 2017 at 15:11, Gunnar Morling <gunnar at hibernate.org> wrote:

> I reckon you are making fun of me. Anyways, I've had my say :)
>
> 2017-06-02 15:03 GMT+02:00 Yoann Rodiere <yoann at hibernate.org>:
> >> What's the benefit of this catch-up game?
> >
> > Not tainting our SPI with RestClient :)
> >
> > Yoann Rodière
> > Hibernate NoORM Team
> > yoann at hibernate.org
> >
> > On 2 June 2017 at 14:59, Gunnar Morling <gunnar at hibernate.org> wrote:
> >>
> >> Sure, as I said, an implementor of that SPI will very likely deal with
> >> the HTTP client. But our SPI isn't tainted by that.
> >>
> >> I also don't see much value in re-defining all the options from
> >> RestClientBuilder. ElasticsearchHttpClientConfigurer resembles
> >> HttpClientConfigCallback. Would you also re-define
> >> RequestConfigCallback? What if new options get added to
> >> RestClientBuilder? What's the benefit of this catch-up game?
> >>
> >>
> >>
> >>
> >> 2017-06-02 14:47 GMT+02:00 Yoann Rodiere <yoann at hibernate.org>:
> >> >> There's no exposure of HTTP Client in this SPI. Yes, if people need
> to
> >> >> customize the HTTP client to be used by the returned RestClient
> >> >> instance, they'll naturally depend on that. But this SPI isn't tied
> to
> >> >> such detail of how RestClient works - if ES folks decided to use
> >> >> OkHttp instead, our SPI contract won't be affected (of course user's
> >> >> implementations will need to change if they were customizing the HTTP
> >> >> client before).
> >> >
> >> > I would feel dishonest arguing this to users... Frankly, there's
> little
> >> > point in returning a custom RestClient without customizing stuff
> related
> >> > to
> >> > Apache HTTP Client. See RestClientBuilder: apart from methods tied to
> >> > Apache
> >> > HTTP Client, and from options already provided by Hibernate Search,
> you
> >> > only
> >> > have access to these:
> >> >
> >> > org.elasticsearch.client.RestClientBuilder.
> setDefaultHeaders(Header[])
> >> >
> >> > org.elasticsearch.client.RestClientBuilder.setFailureListener(
> FailureListener)
> >> > org.elasticsearch.client.RestClientBuilder.setPathPrefix(String)
> >> >
> >> > ... and that's all.
> >> >
> >> > So yes, this SPI doesn't have a direct dependency to Apache HTTP
> Client,
> >> > but
> >> > any practical use of it will have one. At the end of the day, that's
> >> > what
> >> > really matters, right?
> >> >
> >> > Yoann Rodière
> >> > Hibernate NoORM Team
> >> > yoann at hibernate.org
> >> >
> >> > On 2 June 2017 at 14:25, Gunnar Morling <gunnar at hibernate.org> wrote:
> >> >>
> >> >> 2017-06-02 12:49 GMT+02:00 Yoann Rodiere <yoann at hibernate.org>:
> >> >> >> There's an important difference: one exposes Apache HTTP client in
> >> >> >> HSEARCH's SPI, whereas the other just requires usage of Apache
> HTTP
> >> >> >> client within one specific implementation. For users it doesn't
> >> >> >> change
> >> >> >> much, but the latter is cleaner from HSEARCH's perspective.
> >> >> >
> >> >> > RestClient is not an interface, it's an implementation. There's no
> >> >> > interface. So yes, we would just be exposing Apache HTTP Client on
> >> >> > top
> >> >> > of
> >> >> > RestClient.
> >> >> >
> >> >>
> >> >> Here's what I'd have done:
> >> >>
> >> >>     package org.hibernate.search.elasticsearch.client.spi;
> >> >>
> >> >>     public interface RestClientFactory  {
> >> >>         RestClient buildRestClient(SomeContext ctx);
> >> >>     }
> >> >>
> >> >> There's no exposure of HTTP Client in this SPI. Yes, if people need
> to
> >> >> customize the HTTP client to be used by the returned RestClient
> >> >> instance, they'll naturally depend on that. But this SPI isn't tied
> to
> >> >> such detail of how RestClient works - if ES folks decided to use
> >> >> OkHttp instead, our SPI contract won't be affected (of course user's
> >> >> implementations will need to change if they were customizing the HTTP
> >> >> client before).
> >> >>
> >> >> >> I'm not quite following on that. If people are in control of the
> >> >> >> RestClient entirely, they can do whatever they want?
> >> >> >
> >> >> > As mentioned above, exposing RestClient is even worse than just
> >> >> > exposing
> >> >> > the
> >> >> > Apache HTTP client.
> >> >>
> >> >> I don't think it's worse, it's better actually. It just exposes our
> >> >> direct dependency in the SPI and not any of its details.
> >> >>
> >> >> > So I was suggesting to use a proper abstraction, namely our own
> >> >> > interface,
> >> >> > org.hibernate.search.elasticsearch.client.impl.
> ElasticsearchClient.
> >> >> >
> >> >> > I suppose your remark concerned this paragraph:
> >> >> >
> >> >> >> On top of that, this solution would not allow multiple third-party
> >> >> >> customizations to work well together (for instance AWS
> >> >> >> authentication
> >> >> >> provided by us or a third party + some performance tweaking by the
> >> >> >> user)...
> >> >> >> which is something the SPI we're planning in the PR could allow.
> >> >> >
> >> >> > If we did allow users to re-implement ElasticsearchClient, yes, one
> >> >> > implementor would be able to do whatever he wants... if he
> >> >> > re-implements
> >> >> > it.
> >> >> > He wouldn't be able to re-use other extensions.
> >> >> > Take for example the AWS authentication. You're suggestion that we
> >> >> > provide
> >> >> > an alternative client allowing to connect to AWS. Fine, we do that,
> >> >> > and
> >> >> > users can use it. But what if a users wants AWS authentication and
> >> >> > say,
> >> >> > configure a proxy? Then he can't reuse our AWS client, since this
> >> >> > client
> >> >> > is
> >> >> > just an implementation of ElasticsearchClient, and we don't want to
> >> >> > expose
> >> >> > anything related to Apache HTTP Client. So he must implement AWS
> >> >> > authentication. Just to configure a proxy.
> >> >> >
> >> >> > The thing is, there are tons of things a user may want to do with
> >> >> > Apache
> >> >> > HTTP Client, and we can't possibly provide access to each and every
> >> >> > option
> >> >> > through an abstraction layer.
> >> >>
> >> >> Right, hence I wouldn't bother to do that in the first place. Just
> let
> >> >> users customize how RestClient is instantiated. Let's them do all
> they
> >> >> want. We can provide examples which show how to do the AWS signing
> >> >> etc.
> >> >>
> >> >> > So at some point, if we want to allow
> >> >> > configuration (and in the case of an HTTP client, I'm afraid we
> have
> >> >> > to),
> >> >> > we'll have to expose internals. We just have to make sure this is
> >> >> > done
> >> >> > in a
> >> >> > controlled way (expose as little as possible).
> >> >> >
> >> >> >
> >> >> >
> >> >> > Yoann Rodière
> >> >> > Hibernate NoORM Team
> >> >> > yoann at hibernate.org
> >> >> >
> >> >> > On 2 June 2017 at 12:26, Gunnar Morling <gunnar at hibernate.org>
> wrote:
> >> >> >>
> >> >> >> 2017-06-02 11:58 GMT+02:00 Yoann Rodiere <yoann at hibernate.org>:
> >> >> >> >> Did you instead consider to just let users provide their custom
> >> >> >> >> instance of org.elasticsearch.client.RestClient? It's still
> >> >> >> >> leaking
> >> >> >> >> an
> >> >> >> >> implementation detail of Hibernate Search, but at least it's
> one
> >> >> >> >> indirection less.
> >> >> >> >
> >> >> >> >
> >> >> >> > The only way to add AWS authentication to the RestClient is to
> use
> >> >> >> > Apache
> >> >> >> > HTTP Client classes, so this solution would still bind users to
> >> >> >> > the
> >> >> >> > implementation details. We'd just shove it under the carpet :)
> >> >> >>
> >> >> >> There's an important difference: one exposes Apache HTTP client in
> >> >> >> HSEARCH's SPI, whereas the other just requires usage of Apache
> HTTP
> >> >> >> client within one specific implementation. For users it doesn't
> >> >> >> change
> >> >> >> much, but the latter is cleaner from HSEARCH's perspective.
> >> >> >> >
> >> >> >> > Also, I'd argue this would be an even bigger implementation
> leak:
> >> >> >> > at
> >> >> >> > least
> >> >> >> > with the current solution with can switch to an alternative
> >> >> >> > Elasticsearch
> >> >> >> > client, as long as we still use Apache HTTP Client. If we expose
> >> >> >> > RestClient,
> >> >> >> > we're stuck with it and whatever underlying technologies it
> uses.
> >> >> >> >
> >> >> >> > On the other hand, we could ask them to re-implement
> >> >> >> >
> >> >> >> > org.hibernate.search.elasticsearch.client.impl.
> ElasticsearchClient,
> >> >> >> > which is
> >> >> >> > our own façade over the Elasticsearch client. But I find it a
> bit
> >> >> >> > much
> >> >> >> > just
> >> >> >> > to tweak some settings or to add a new authentication scheme...
> >> >> >> > On top of that, this solution would not allow multiple
> third-party
> >> >> >> > customizations to work well together (for instance AWS
> >> >> >> > authentication
> >> >> >> > provided by us or a third party + some performance tweaking by
> the
> >> >> >> > user)...
> >> >> >> > which is something the SPI we're planning in the PR could allow.
> >> >> >>
> >> >> >> I'm not quite following on that. If people are in control of the
> >> >> >> RestClient entirely, they can do whatever they want?
> >> >> >>
> >> >> >> >
> >> >> >> > Yoann Rodière
> >> >> >> > Hibernate NoORM Team
> >> >> >> > yoann at hibernate.org
> >> >> >> >
> >> >> >> > On 2 June 2017 at 10:30, Sanne Grinovero <sanne at hibernate.org>
> >> >> >> > wrote:
> >> >> >> >>
> >> >> >> >> On Fri, 2 Jun 2017, 08:56 Gunnar Morling, <
> gunnar at hibernate.org>
> >> >> >> >> wrote:
> >> >> >> >>
> >> >> >> >> > Hi,
> >> >> >> >> >
> >> >> >> >> > I find the exposure of an implementation detail (usage of
> >> >> >> >> > Apache
> >> >> >> >> > HTTP
> >> >> >> >> > client) of the Elasticsearch client a bit problematic. If
> they
> >> >> >> >> > change
> >> >> >> >> > this to another HTTP client, our SPI would break.
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >> Yes the very point of exposing that detail is the reason for
> this
> >> >> >> >> thread.
> >> >> >> >>
> >> >> >> >> Still, our SPI being guaranteed only for a minor, that gives a
> >> >> >> >> lot
> >> >> >> >> of
> >> >> >> >> flexibility?
> >> >> >> >>
> >> >> >> >> The Elasticsearch client exposing this itself, I don't expect
> >> >> >> >> them
> >> >> >> >> to
> >> >> >> >> switch implementation in a micro release to make some bugfix.
> If
> >> >> >> >> they
> >> >> >> >> change it in a major or even minor version, we're ok to not
> >> >> >> >> support
> >> >> >> >> that
> >> >> >> >> version until our next minor.
> >> >> >> >>
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> > Did you instead consider to just let users provide their
> custom
> >> >> >> >> > instance of org.elasticsearch.client.RestClient? It's still
> >> >> >> >> > leaking
> >> >> >> >> > an
> >> >> >> >> > implementation detail of Hibernate Search, but at least it's
> >> >> >> >> > one
> >> >> >> >> > indirection less.
> >> >> >> >> >
> >> >> >> >> > People wishing to have a custom RestClient would have to
> >> >> >> >> > implement
> >> >> >> >> > a
> >> >> >> >> > few more bits themselves (the logic from
> >> >> >> >> > DefaultElasticsearchClientFactory#customizeHttpClientConfig(
> )),
> >> >> >> >> > but
> >> >> >> >> > I'd find that acceptable for the sake of a less
> detail-exposing
> >> >> >> >> > SPI,
> >> >> >> >> > plus it grants more flexibility in terms of configuring the
> >> >> >> >> > RestClient.
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> N.B. The client factory already is a Service so any advanced
> user
> >> >> >> >> already
> >> >> >> >> can override it.
> >> >> >> >>
> >> >> >> >> We want to make it easier for cloud users. Focusing on AWS now
> as
> >> >> >> >> we've
> >> >> >> >> had
> >> >> >> >> user requests for this - not least our own CI - but I'd expect
> >> >> >> >> other
> >> >> >> >> clouds
> >> >> >> >> to have similar features (today or tomorrow). I just don't
> expect
> >> >> >> >> other
> >> >> >> >> use
> >> >> >> >> cases to need this so we might provide them all eventually, but
> >> >> >> >> at
> >> >> >> >> this
> >> >> >> >> point my goal is to leave an appealing SPI for contributors to
> >> >> >> >> join
> >> >> >> >> on
> >> >> >> >> that.
> >> >> >> >>
> >> >> >> >> With that I mean:
> >> >> >> >> 1# this might evolve but we need something simple to use for
> >> >> >> >> people
> >> >> >> >> to
> >> >> >> >> not
> >> >> >> >> get stuck.
> >> >> >> >> 2# I expect integrator implementors to contribute them back
> >> >> >> >> 3# People won't have this low level dependency in their
> projects
> >> >> >> >> for
> >> >> >> >> long
> >> >> >> >>
> >> >> >> >> Having them re-implement the client wouldn't encourage this ;)
> >> >> >> >>
> >> >> >> >> Thanks!
> >> >> >> >> Sanne
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> > --Gunnar
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > 2017-06-01 19:11 GMT+02:00 Sanne Grinovero
> >> >> >> >> > <sanne at hibernate.org>:
> >> >> >> >> > > Yoann has been working on allowing Hibernate Search users
> to
> >> >> >> >> > > use
> >> >> >> >> > > Elasticsearch on AWS.
> >> >> >> >> > >
> >> >> >> >> > > Specifically on AWS the Elasticsearch security can be
> >> >> >> >> > > configured
> >> >> >> >> > > to
> >> >> >> >> > > use application identities, which implies the requests need
> >> >> >> >> > > to
> >> >> >> >> > > be
> >> >> >> >> > > signed.
> >> >> >> >> > >
> >> >> >> >> > > A good background read can be found here [1].
> >> >> >> >> > >
> >> >> >> >> > > We planned to allow people to use this but were not
> planning
> >> >> >> >> > > to
> >> >> >> >> > > include AWS specific libraries as dependencies - but since
> >> >> >> >> > > Yoann
> >> >> >> >> > > implemented an actual AWS signer in the tests I suppose it
> >> >> >> >> > > would
> >> >> >> >> > > be
> >> >> >> >> > > selfish to not ship it..
> >> >> >> >> > >
> >> >> >> >> > > Please see the API proposal on github (with the PR):
> >> >> >> >> > >  - https://github.com/hibernate/hibernate-search/pull/1426
> >> >> >> >> > >
> >> >> >> >> > > Thanks,
> >> >> >> >> > > Sanne
> >> >> >> >> > >
> >> >> >> >> > > [1] -
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > https://aws.amazon.com/blogs/security/how-to-control-
> access-to-your-amazon-elasticsearch-service-domain/
> >> >> >> >> > > _______________________________________________
> >> >> >> >> > > hibernate-dev mailing list
> >> >> >> >> > > hibernate-dev at lists.jboss.org
> >> >> >> >> > > https://lists.jboss.org/mailman/listinfo/hibernate-dev
> >> >> >> >> >
> >> >> >> >> _______________________________________________
> >> >> >> >> hibernate-dev mailing list
> >> >> >> >> hibernate-dev at lists.jboss.org
> >> >> >> >> https://lists.jboss.org/mailman/listinfo/hibernate-dev
> >> >> >> >
> >> >> >> >
> >> >> >
> >> >> >
> >> >
> >> >
> >
> >
>


More information about the hibernate-dev mailing list