Hi
I wanted to discuss a proposal for recording some of the metric data captured by Hawkular
APM in Hawkular Metrics.
For those not familiar with Hawkular APM, it captures an end-to-end trace instance (think
of it as a distributed call stack) for each invocation of an application. This trace can
include information about the communications between services, but can also include
details about internal components used within the services (e.g. EJBs, database calls).
The first point is that if we were to record duration metrics for every 'span' captured
(i.e. the scope within which a task is performed), for each invocation of an application,
it would result in a large amount of data, much of which may be of no interest. So we need
to find a way for end users/developers to express which key points within an application
they do want recorded as metrics.
The proposal is to allow the application/services to define a tag/property on the spans of
interest, e.g. 'kpi', that would indicate to the server that the duration value
for the span should be stored in H-Metrics. The value for the tag should define the
name/description of the KPI.
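
To make the idea concrete, here is a rough sketch of the server-side selection, assuming
a simplified span model. The Span class, the KpiFilter name and the kpiDurations method
are hypothetical and only for illustration; they are not the Hawkular APM or H-Metrics
APIs, and 'kpi' is just the example tag name from above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class KpiFilter {
    // Example tag key from the proposal; the final name is still open.
    static final String KPI_TAG = "kpi";

    /** Simplified stand-in for a captured span (not the real APM model). */
    static final class Span {
        final String operation;
        final long durationMicros;
        final Map<String, String> tags;

        Span(String operation, long durationMicros, Map<String, String> tags) {
            this.operation = operation;
            this.durationMicros = durationMicros;
            this.tags = tags;
        }
    }

    /**
     * Collects durations only for spans that carry the KPI tag.
     * The tag value is used as the KPI (metric) name; untagged spans are skipped.
     */
    static Map<String, List<Long>> kpiDurations(List<Span> trace) {
        Map<String, List<Long>> result = new LinkedHashMap<>();
        for (Span span : trace) {
            String kpiName = span.tags.get(KPI_TAG);
            if (kpiName == null || kpiName.isEmpty()) {
                continue; // not marked as a KPI, so nothing is recorded
            }
            result.computeIfAbsent(kpiName, k -> new ArrayList<>())
                  .add(span.durationMicros);
        }
        return result;
    }
}
```

In this sketch, a trace with one span tagged kpi=checkout and one untagged database span
would yield a single entry named "checkout"; the returned durations would then be written
to H-Metrics by whatever mechanism we settle on.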
If considered a suitable solution, then we can also propose it as a standard tag in the
OpenTracing standard.
There are a couple of metrics that we could record by default: the first being the trace
instance completion time, and the second possibly being the individual service response
times (although this could potentially also be governed by the 'kpi' tag).
Thoughts?
Regards
Gary