[Hawkular-dev] Tag concept in metrics and alerts
Jay Shaughnessy
jshaughn at redhat.com
Wed Jun 24 10:33:43 EDT 2015
On 6/24/2015 9:13 AM, Gary Brown wrote:
>
> ----- Original Message -----
>> In alert context, the incoming data should match
>> org.hawkular.alerts.api.model.data
>> https://github.com/hawkular/hawkular-alerts/tree/master/hawkular-alerts-api/src/main/java/org/hawkular/alerts/api/model/data
>> This data is carried and stored in the evaluation details for an
>> alert. But "Tag" in alerts is used as a way to put an "alias" on
>> definitions and is not related to specific incoming data.
> Sorry when I referred to 'tag', I meant the tag in the metrics event, which is a map of name/value pairs.
>
> So when the alerts engine is analysing a metric, is the tag information on the metric stored in the alert's condition evaluation data?
As you mentioned earlier, Tags are labels that can be attached ad-hoc to
Trigger definitions. They can then be used to filter queries, like,
"give me all of the unresolved alerts for triggers labeled XYZ". That
mechanism is for use by the clients, to be used in their own way.
When a trigger fires an alert it does so based on one or more sets of
true condition evaluations. One evaluation per condition on the
trigger. One set of condition evaluations for each repetition of the
conditions, based on dampening. By default, just one set of conditions
because with default dampening a trigger fires as soon as the conditions
match. Those evaluations are performed on data coming in. An alert
stores with it all of the evaluations and makes them available to the
actions (notifiers). In 0.2.0 it is only a toString of the
information, but in 0.3.0 it will be a JSON representation.
Every datum has a dataId and some sort of value information. This is
likely where you would get your Tx Ids, and yes, it is captured and made
available. So perhaps this is what you are looking for.
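
To illustrate the flow described above, here is a minimal sketch of a trigger that fires only after a number of true evaluations and hands the matching data to the alert. The class names (Datum, DampenedTrigger), the "greater than" comparison, and the rule that a false evaluation resets the count are assumptions made for illustration; this is not the hawkular-alerts API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical types for illustration; not the hawkular-alerts API.
final class Datum {
    final String dataId;
    final double value;

    Datum(String dataId, double value) {
        this.dataId = dataId;
        this.value = value;
    }
}

final class DampenedTrigger {
    private final String dataId;
    private final double threshold;
    private final int requiredMatches; // 1 = fire as soon as the condition matches
    private final List<Datum> matched = new ArrayList<>();

    DampenedTrigger(String dataId, double threshold, int requiredMatches) {
        this.dataId = dataId;
        this.threshold = threshold;
        this.requiredMatches = requiredMatches;
    }

    // Returns the data behind the alert when the trigger fires, else an empty list.
    List<Datum> evaluate(Datum datum) {
        if (!datum.dataId.equals(dataId)) {
            return Collections.emptyList(); // not the data this trigger watches
        }
        if (datum.value > threshold) {
            matched.add(datum);
        } else {
            matched.clear(); // assumption: a false evaluation resets the count
        }
        if (matched.size() < requiredMatches) {
            return Collections.emptyList();
        }
        List<Datum> evaluations = new ArrayList<>(matched);
        matched.clear(); // start counting again after firing
        return evaluations;
    }
}
```

With requiredMatches set to 1 the trigger fires on the first matching datum, which corresponds to the default dampening mentioned above.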
>
>> I have some questions that may help us refine your requirement:
>>
>> - The incoming data you need is something like a pair of (BT id, value) =
>> (bt1, 1000), (bt2, 1002) ...
>>
>> - Do you want to apply condition matching for a specific BT id?
> No, the condition wouldn't refer to the BT id - the id is only included for future reference to enable apps to link back to the business transaction that caused the alert.
>
>> - Or perhaps you have more complex incoming data?
>>
>> Can you elaborate with an example?
> Sure - so the business transaction information may be used to derive a latency metric, describing the time delay between one service sending a message and the recipient receiving the message. The metric will include the BT id associated with the latency. The alerts engine will then apply a condition to determine whether the latency is greater than a permitted value, and if so create an alert.
>
> We then need the resulting Alert to directly or indirectly contain the BT id, so that when a user investigates the latency problem, they can navigate back to the business transaction that caused the problem to see if there is anything of interest.
>
> Now in reality this would not happen, as latency would probably be aggregated and only if the aggregate value hit a threshold would an alert be triggered - but it provides a potential example of the end to end flow of business txn data collection -> metrics -> alerts and then having the necessary info in the alert to trace back.
So the issue we may have here is that it seems you may want some
supplemental contextual info on the data being sent in. If I understand
the above, you would likely define a Threshold condition. The dataId
(like a metricId) is known in advance, say "latencyMetric". The value
is the Double, and would be the actual latency. But you also have
runtimeInfo, the BTID, not known in advance, but something you want to
store with the datum. Is that correct?
So, perhaps an optional free-format string that could be supplied?
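
A rough sketch of what that could look like: a datum that carries an optional free-format context string next to its dataId and value, with the alert keeping a reference to the datum so the context can be read back later. All names here (ContextDatum, ThresholdCondition, ThresholdAlert, the context field) are hypothetical and only illustrate the idea; none of them exist in hawkular-alerts.

```java
import java.util.Optional;

// Hypothetical types for illustration; not the hawkular-alerts API.
final class ContextDatum {
    final String dataId;   // known in advance, e.g. "latencyMetric"
    final double value;    // the measured latency
    final String context;  // optional runtime info such as a BT id; may be null

    ContextDatum(String dataId, double value, String context) {
        this.dataId = dataId;
        this.value = value;
        this.context = context;
    }
}

final class ThresholdAlert {
    final ContextDatum cause; // the datum whose evaluation was true

    ThresholdAlert(ContextDatum cause) {
        this.cause = cause;
    }
}

final class ThresholdCondition {
    private final String dataId;
    private final double max;

    ThresholdCondition(String dataId, double max) {
        this.dataId = dataId;
        this.max = max;
    }

    // The condition looks only at dataId and value; the context is carried along untouched.
    Optional<ThresholdAlert> evaluate(ContextDatum datum) {
        boolean matches = datum.dataId.equals(dataId) && datum.value > max;
        return matches ? Optional.of(new ThresholdAlert(datum)) : Optional.empty();
    }
}
```

A client that receives the alert could then read cause.context to find the business transaction, without the condition itself ever referring to the BT id.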
I have another question: would it make any sense for the threshold
matching to be performed completely outside of Alerts? Meaning, let BTM
determine if there is a situation completely on its own, and then just
hook into alerting for the use of trigger dampening, life cycle, action
invocations, etc? This is essentially the ExternalCondition hook I'm
currently working on. It allows an external system to do the
condition evaluation in any way it sees fit. The data sent to alerts
is assumed to already describe a "true" evaluation.
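
As a sketch of that division of labour, the example below has the external side decide that something is wrong and the alerting side accept every event it receives as a true evaluation and invoke an action. The names (ExternalEvent, ExternalTrigger, receive) are hypothetical and do not reflect the actual ExternalCondition work; dampening and life cycle handling are left out to keep it short.

```java
import java.util.function.Consumer;

// Hypothetical types for illustration; not the hawkular-alerts API.
final class ExternalEvent {
    final String dataId;      // identifies which external condition this belongs to
    final String description; // free-form detail supplied by the external system

    ExternalEvent(String dataId, String description) {
        this.dataId = dataId;
        this.description = description;
    }
}

final class ExternalTrigger {
    private final String dataId;
    private final Consumer<ExternalEvent> action;

    ExternalTrigger(String dataId, Consumer<ExternalEvent> action) {
        this.dataId = dataId;
        this.action = action;
    }

    // No threshold check here: the sender has already done the evaluation,
    // so any event for this dataId is treated as "true".
    boolean receive(ExternalEvent event) {
        if (!event.dataId.equals(dataId)) {
            return false; // event belongs to some other condition
        }
        action.accept(event);
        return true;
    }
}
```

In this arrangement the external system would perform its own latency check and call receive only when it has found a problem, passing whatever detail it wants in the description.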