[Hawkular-dev] Low-impact clients and never dropping any events

Fri Feb 13 11:49:36 EST 2015

Hi Randall,

Answers inline.

On 13/02/2015 16:12, Randall Hauch wrote:
> Forgive my ignorance, but I’m new to the list and I didn’t see anything in the archives about $subject, detailed below. Lately I’ve been very interested in several topics ancillary to monitoring, so I’m quite intrigued by the planned direction and approach.
>
> How do clients/systems/services that are to be monitored actually send their monitorable information? What is the granularity of this information: is it already summarized or somewhat aggregated in the client, or is it very low-level and fine-grained events? What is the impact on the client of adding this extra overhead?

There are different options for sending:

1. External collectors
A collector running as an independent process queries the monitored 
system, which exposes runtime information in some way. The collector 
then sends that information to Hawkular.
Examples: rhq agent, collectd, jmxtrans

2. Embedded collectors
Same as above, except that the collector runs in the same process as the 
monitored system.
Examples: Wildfly monitor, embedded-jmxtrans, codahale metrics (if 
configured with a reporter other than JMX)

3. Custom
Any solution which sends information to Hawkular without resorting to a 
collector.

Granularity is not enforced: at different points in time, you could send 
the values of a counter or send a locally computed derivative for the 
last minute.
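
To make the two granularities concrete, here is a minimal sketch. The 
sample values and the rate_per_second helper are made up for 
illustration; they are not part of any Hawkular API.

```python
def rate_per_second(prev_sample, curr_sample):
    """Average per-second rate between two (timestamp_seconds, counter) samples."""
    (t0, v0), (t1, v1) = prev_sample, curr_sample
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (v1 - v0) / (t1 - t0)


# Option 1: send the raw counter values exactly as they were sampled.
raw_samples = [(0, 1200), (60, 1500)]

# Option 2: send a derivative computed locally over the last minute.
derived = rate_per_second(raw_samples[0], raw_samples[1])
```

Either form could be sent; the second simply moves part of the 
computation to the client.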

The impact of sending information through an external or embedded 
collector should be pretty low. Most existing solutions do some sort 
of batching. With a custom sender, it all depends on how it's designed, 
obviously.

>
> Do you have an estimate or goal for how much volume of incoming data can be handled without impacting on clients? What, if anything, does a client submission wait for on the back-end?

Hawkular metrics is designed to be horizontally scalable, so the volume 
of data you can absorb should depend on the number of machines you can 
throw at it.

Most collectors buffer data to be sent and operate in separate threads. 
So if Hawkular ingests metrics more slowly, the buffers fill up and the 
collectors consume more memory. Other than that, it should have limited 
impact on your service.
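
As a rough illustration of that pattern (not the code of any particular 
collector), a buffered reporter could look like the sketch below. The 
BufferedReporter class, its capacity parameter and the send callable are 
all hypothetical names. The buffer is bounded, so memory use is capped 
and new points are dropped once it is full.

```python
import queue
import threading


class BufferedReporter:
    """Hypothetical collector: buffers points and sends them from a worker thread."""

    def __init__(self, send, capacity=1000):
        self._send = send                      # assumed callable that ships one point
        self._buffer = queue.Queue(maxsize=capacity)
        self._lock = threading.Lock()
        self.dropped = 0
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def record(self, name, timestamp, value):
        """Called from application threads; returns False if the point was dropped."""
        try:
            self._buffer.put_nowait((name, timestamp, value))
            return True
        except queue.Full:
            with self._lock:
                self.dropped += 1
            return False

    def _drain(self):
        # Runs in the worker thread until close() enqueues the None sentinel.
        while True:
            item = self._buffer.get()
            if item is None:
                break
            self._send(item)

    def close(self):
        """Send whatever is still buffered, then stop the worker."""
        self._buffer.put(None)
        self._worker.join()
```

The application thread only pays for a non-blocking queue insert; a slow 
back-end shows up as a fuller buffer and, eventually, a growing dropped 
count.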

>
> Also, how do you plan to ensure that, no matter what happens to the Hawkular system or anything it depends upon, no client information is ever lost or dropped?

Usually collectors will drop data once buffers are full. If you want to 
make sure no data is lost, then you need to build a custom sender. 
Hawkular metrics has an HTTP interface so the response code should tell 
you if a metric was successfully persisted.
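
A custom sender along those lines might keep each point until the server 
has confirmed it. This is only a sketch: ReliableSender and the post 
callable are hypothetical, and post is assumed to perform the HTTP 
request and return the status code (I am deliberately not showing a real 
endpoint or payload here).

```python
from collections import deque


class ReliableSender:
    """Hypothetical sender that discards a point only after a 2xx response."""

    def __init__(self, post):
        self._post = post            # assumed: callable(point) -> HTTP status code
        self._pending = deque()

    def enqueue(self, point):
        self._pending.append(point)

    @property
    def pending(self):
        return len(self._pending)

    def flush(self):
        """Send pending points in order; stop at the first failure.

        Returns the number of points confirmed during this call.
        """
        confirmed = 0
        while self._pending:
            point = self._pending[0]
            try:
                status = self._post(point)
            except OSError:
                break                # network error: keep the point for later
            if not 200 <= status < 300:
                break                # server did not confirm persistence
            self._pending.popleft()
            confirmed += 1
        return confirmed
```

Calling flush() periodically retries whatever is still pending. Note that 
the pending queue here lives in memory, so surviving a crash of the 
sending process would additionally require storing it durably.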

>
> Finally, is the plan to make Hawkular embeddable (excluding the stuff that has to be embedded in monitored clients/systems/services), or only a separate turn-key (i.e., install-and-run-and-use) system?

Hawkular metrics comes in two forms:
* a Java library (metrics-core)
* a Java EE web application (built on top of the library)

metrics-core can be embedded in any sort of JVM application but it 
expects to find a Cassandra cluster somewhere.

I hope this helps. Feel free to ask for details.

And welcome to Hawkular!

Thomas