[Hawkular-dev] need discussion around how we are going to do domain support

Lucas Ponce lponce at redhat.com
Mon Dec 4 07:26:38 EST 2017


On Mon, Dec 4, 2017 at 12:44 PM, John Mazzitelli <mazz at redhat.com> wrote:

>
>
> ----- Original Message -----
> > > > Spent time trying to figure out how to support WildFly domain
> > > > mode with the new stuff. It's not going well. Next week I will
> > > > need to have some discussions on how we want to do this.
> > > >
> > > > The issues with domain mode (I'll try to be short):
> > > >
> > > > 1) Host controllers do not emit JMX metrics for slave servers -
> > > > this is simply not implemented in WildFly. We need to go to the
> > > > slave servers for their own JMX metrics.
> > > > 2) Slave servers do not have management interfaces available
> > > > remotely (we can't connect to slaves over the remote management
> > > > interface, so the agent can't get any inventory from slaves - it
> > > > has to get all inventory from the host controller).
> > > >
> > > > This means our agent in the host controller needs to collect all
> > > > inventory for both the master host controller and the slave
> > > > servers.
> > > > But we need the slave servers to have a metrics endpoint because
> > > > it's only in the slaves where we can get JMX metrics (remember,
> > > > the host controller can't give us that; we must scrape the slave
> > > > servers for the JMX metrics).
> > > >
> > > > If we do not have our agent in the slave server (why bother if we
> > > > can't connect to it over the DMR API to get inventory?), we still
> > > > need to somehow get the P8s JMX Exporter installed in the slaves.
> > > > But if we just use the raw jmx exporter agent, how do we tell our
> > > > h-services server/P8s server about the new endpoint that needs to
> > > > be scraped? And how do we get the jmx exporter yml config
> > > > installed there?
> > > >
> > > > So I suppose we should put our agent in a kind of "metrics only"
> > > > mode in all slave servers so it can expose the JMX exporter, pull
> > > > down the correct jmx exporter yml from the h-services server, and
> > > > have it tell the server to add the scrape endpoint to P8s.
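As a rough sketch only, such a "metrics only" slave configuration might
look like the metrics-exporter section shown further down this thread,
bound to loopback, with inventory collection switched off; the inventory
key below is hypothetical, not an existing agent option:

metrics-exporter:
  enabled: true
  host: 127.0.0.1
  port: ${hawkular.agent.metrics.port:9779}
  config-dir: ${jboss.server.config.dir}
  config-file: WF10
# hypothetical flag - assumes the agent can disable DMR/inventory collection
inventory:
  enabled: false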
> > > >
> > > > But because we aren't getting inventory from slave servers, how
> > > > can our agent tell the server to add the scrape endpoint? Our
> > > > current implementation only adds scrape endpoints to our P8s
> > > > server when new agents go into inventory. Workaround: either have
> > > > the agent store a small inventory (just the agent resource
> > > > itself), which triggers the new scrape endpoint addition, or add
> > > > an inventory REST endpoint for the agent to call to add the new
> > > > scrape endpoint manually.
> > > >
> > > > OK, assume we have all this. How do we link the JMX metrics
> > > > collected from one feed (the agent in the slave server) to the
> > > > inventory from another feed (the agent in the host controller)?
> > > > Right now, it is assumed the inventory metadata from feed "A" is
> > > > matched with metrics from the same feed. Now that we have to
> > > > break that link (feed A's inventory refers to feed B's metrics),
> > > > we need to figure out how to fix this.
> > > >
> > > > There are other ancillary issues - like how do I get the correct
> > > > metadata defined for the host controller so it can match
> > > > resources/metrics from the slaves? I assume that will be
> > > > "implementation details."
> > > >
> > > > I'm sure this sounds like gibberish, but that's how convoluted
> > > > supporting domain mode is going to be.
> > > >
> > >
> > > Mazz,
> > >
> > > What about promoting the host controller agent to be the one
> > > responsible for everything?
> > >
> > > - The host controller agent will be responsible for collecting info
> > > about the domain (slave servers) and writing it into the inventory
> > > (Domain, Server, etc.).
> > > - Each slave server can have a "metrics-only" endpoint (perhaps the
> > > same agent, perhaps a *-domain.jar if we want to simplify things).
> > > - The host controller agent can proxy the slave metrics endpoints,
> > > so we can control endpoints like /metrics-domain/<slave>/metrics,
> > > and that is what inventory uses to create the endpoint (see the
> > > sketch below).
> > > - The host controller will expose, in a proxy fashion, the metrics
> > > endpoint for each of the slaves.
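As an illustration of that path scheme (the slave names and port here are
made up, not taken from a real setup):

# hypothetical URL layout on the host controller agent's exporter port
http://hostcontroller:9779/metrics                            # host controller's own metrics
http://hostcontroller:9779/metrics-domain/server-one/metrics  # proxied from slave "server-one"
http://hostcontroller:9779/metrics-domain/server-two/metrics  # proxied from slave "server-two"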
> > >
> > > This approach concentrates the complexity in the host controller,
> > > but it avoids exceptions and corner cases in the rest of the
> > > system; the benefit I see is that from MiQ and Hawkular Services we
> > > can maintain a uniform way of working.
> > >
> > > So, as you point out, the host controller needs to know the
> > > topology of the domain, and each slave should have a local /metrics
> > > which can be proxied from the host controller's main endpoint.
> > > In this case, both host controllers and slaves will share the same
> > > feed, because they are using the same agent, and that won't break
> > > any design use case.
> > >
> > > Also, I think this proxy has minimal technical complexity, as the
> > > host controller should have all the info needed to expose it.
> >
> >
> > This has some promise. I just need to think about how to configure
> > this proxy and implement it without having to rip apart too much of
> > the internals of the agent code.
> >
> > Thinking out loud:
> >
> > We add a "proxies" section to the agent config's metrics-exporter:
> >
> > metrics-exporter:
> >   enabled: true
> >   host: ${hawkular.agent.metrics.host:127.0.0.1}
> >   port: ${hawkular.agent.metrics.port:9779}
> >   config-dir: ${jboss.server.config.dir}
> >   config-file: WF10
> >   # HERE IS THE NEW SECTION
> >   proxies:
> >   - path: slave1/metrics
> >     host: 127.0.0.1
> >     port: 9780
> >   - path: slave2/metrics
> >     host: 127.0.0.1
> >     port: 9781
> >   - path: slave3/metrics
> >     host: 127.0.0.1
> >     port: 9782
> >
> > We would have to enhance our jmx exporter wrapper code so it supports
> > those extra endpoints (it would simply pass requests/responses
> > through to those other endpoints).
> >
> > The slave agent configs would have their jmx exporter sections the
> > same as always, binding to localhost:
> >
> > metrics-exporter:
> >   enabled: true
> >   host: 127.0.0.1
> >   port: ${hawkular.agent.metrics.port:9779}
> >   config-dir: ${jboss.server.config.dir}
> >   config-file: WF10
> >
> > The nice thing here is they only need to expose the metric data on
> > the loopback address, 127.0.0.1, since the host controller agent will
> > be on the same box - no need to open this up to the wider network.
> >
> > The bad thing is the proxies section requires you to know how many
> > slaves there are and what their ports are going to be. There is no
> > way to know that ahead of time (this is all encoded in the host
> > controller's config, and who knows what the user wants to use - you
> > can start host controllers with the --host-config option to specify a
> > custom .xml). And besides, new slaves can be added to the host
> > controller at any time.
> >
> > So it seems there needs to be some kind of auto-discovery of proxies,
> > not a fixed configuration of them. Maybe we have a section that tells
> > the agent to check for metric endpoints to be proxied?
> >
> >   proxies:
> >   - host: 127.0.0.1
> >     port-range: 9780-9880
> >
> > But if we have auto-discovery like that, what is the proxy /path that
> > we need to tell Prometheus to scrape? We need to distinguish it
> > somehow so P knows: "to get slave #1, I scrape
> > host:9779/slave1/metrics; to scrape slave #2 it's
> > host:9779/slave2/metrics; for the host controller it's just
> > host:9779/metrics"??
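For illustration, the Prometheus scrape configuration for that
distinguishing-path scheme could look roughly like this (job names, slave
paths, and the host name are assumptions, not the real configuration):

scrape_configs:
- job_name: host-controller        # the host controller's own metrics
  metrics_path: /metrics
  static_configs:
  - targets: ['hostcontroller.example.com:9779']
- job_name: domain-slave1          # slave #1, proxied by the host controller agent
  metrics_path: /slave1/metrics
  static_configs:
  - targets: ['hostcontroller.example.com:9779']
- job_name: domain-slave2          # slave #2, proxied by the host controller agent
  metrics_path: /slave2/metrics
  static_configs:
  - targets: ['hostcontroller.example.com:9779']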
> >
> > Maybe we tell the host controller agent to talk to the slave agent
> > somehow? Over remote JMX is an interesting idea (the agent already
> > has its own JMX MBean; as long as we can talk to it over remote JMX,
> > we can have the host agent ask the slave agent for information about
> > its metrics endpoint: "what is the host/port of your /metrics
> > endpoint?"). We could possibly then associate it with the name of the
> > slave and use that as the proxy path.
> >
> > We could then add some kind of "flag" or metadata on the "Domain
> > WildFly Server" resource to say, "this resource has an agent whose
> > metrics endpoint we want to proxy - go ask this agent for its
> > details". So when the host agent runs its normal inventory scan, we
> > can add special code to say, "when you hit a resource that has this
> > special proxy metadata, you need to proxy that resource's agent's
> > metrics endpoint." Because the Domain WildFly Server is a DMR
> > resource, there is no associated JMX server information with it. We
> > would need to configure that somehow:
> >
> >   - name: Domain WildFly Server
> >     resource-name-template: "%-"
> >     path: "/server=*"
> >     parents:
> >     - Domain Host
> >     proxy-metrics: true
> >
> > Oh, BTW, because the agent's remote JMX client is the Jolokia client,
> > we need to install a Jolokia -javaagent in the slave server along
> > with our agent. We'd need to add some additional information there so
> > the host agent knows all the details needed to connect to the JMX
> > server where the resource is hosted. How does it know the Jolokia URL
> > of the slave server?
>
>
> Maybe we forget all this remote JMX talk from the host agent to the
> slave agent and just use the file system? We know the slaves are on the
> same box as the host controller - so perhaps we tell each slave, "write
> some information the host agent needs so it can proxy your metrics
> endpoint". This has the added benefit of providing that auto-discovery
> I said we'd need. The host controller simply proxies whatever it
> detects on the filesystem (maybe each slave writes a file whose name is
> the name of the slave, so all the files are unique, and in each file is
> information like the host and port where the /metrics endpoint is). Any
> file the host agent finds, it will proxy. The agent therefore would
> just need to be told, "look in this directory - as you see slaves
> writing their proxy files, start proxying their info."
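A minimal sketch of what each slave could drop on disk for the host agent
to find - the directory, file name, and keys here are all hypothetical:

# e.g. ${jboss.domain.temp.dir}/metrics-proxy/server-one.yml (hypothetical location)
host: 127.0.0.1
port: 9780
path: /metrics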
>
>
Using the filesystem to share info between slaves sounds OK for this use
case (domain/tmp, or similar).


> The question then becomes: how does the agent tell the h-server, "let
> P know that it needs to scrape this endpoint"? I think we are back to
> "have the agent manually tell the h-server to tell P to register a
> metrics endpoint" - which is what is provided by this commit in my
> local branch:
> https://github.com/jmazzitelli/hawkular-commons/commit/ae8d316486be7ba738f63b666c99a4f5a2e61f60
>
>
Why not just update the agent with the config of the new slaves?

Today, when an agent is added/modified (it is the same use case in
practical terms), the P8s scrape endpoints are generated.

So, if we know we have a modification, just add the slave endpoints into
the agent config and let the inventory re-create them.





