[Hawkular-dev] Tag concept in metrics and alerts
Thomas Heute
theute at redhat.com
Wed Jun 24 11:35:00 EDT 2015
On 06/24/2015 04:54 PM, Gary Brown wrote:
>
>
> ----- Original Message -----
>>
>>
>> On 6/24/2015 9:13 AM, Gary Brown wrote:
>>>
>>> ----- Original Message -----
>>>> In alert context, the incoming data should match
>>>> org.hawkular.alerts.api.model.data
>>>> https://github.com/hawkular/hawkular-alerts/tree/master/hawkular-alerts-api/src/main/java/org/hawkular/alerts/api/model/data
>>>> This data is carried and stored in the evaluation details for an
>>>> alert. "Tag" in alerts, however, is a way to put an "alias" on
>>>> definitions and is not related to specific incoming data.
>>> Sorry, when I referred to 'tag', I meant the tag in the metrics event, which
>>> is a map of name/value pairs.
>>>
>>> So when the alerts engine is analysing a metric, is the tag information on
>>> the metric stored in the alert's condition evaluation data?
>>
>> As you mentioned earlier, Tags are labels that can be attached ad-hoc to
>> Trigger definitions. They can then be used to filter queries, like,
>> "give me all of the unresolved alerts for triggers labeled XYZ". That
>> mechanism is for use by the clients, to be used in their own way.
>
> Yes, this is where the use of 'tag' in alerts and metrics differs, so it might be good if Tag in Alerts could be renamed Label.
Do we need single-word labelling at all, or key/value pairs only?
(BTW, to make it more confusing, OpenShift/Kubernetes use the term "label"
for key=value pairs.)
Thomas
>
>>
>> When a trigger fires an alert it does so based on one or more sets of
>> true condition evaluations. One evaluation per condition on the
>> trigger. One set of condition evaluations for each repetition of the
>> conditions, based on dampening. By default, just one set of conditions
>> because with default dampening a trigger fires as soon as the conditions
>> match. Those evaluations are performed on data coming in. An alert
>> stores with it all of the evaluations and makes that available to the
>> actions (notifiers). In 0.2.0 it is only a toString of the
>> information. But in 0.3.0 it will be a json representation.
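
A minimal sketch of the firing behaviour described above, written in Python purely for illustration. It is not the Hawkular Alerts API: `Trigger`, `Evaluation`, `required_sets` and the reset-on-false rule are all hypothetical names and assumptions. It shows one evaluation per condition, one evaluation set per repetition, and an alert that carries every set that led to it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Evaluation:
    """One condition evaluated against one datum (hypothetical type)."""
    data_id: str
    value: float
    matched: bool


@dataclass
class Trigger:
    # data_id -> predicate; one condition per data_id in this sketch
    conditions: Dict[str, Callable[[float], bool]]
    # Assumed dampening rule: this many consecutive all-true sets must be
    # seen before firing. 1 means "fire as soon as the conditions match".
    required_sets: int = 1
    _pending: List[List[Evaluation]] = field(default_factory=list)

    def evaluate(self, data: Dict[str, float]) -> Optional[List[List[Evaluation]]]:
        """Evaluate every condition once.

        Returns the stored evaluation sets when an alert fires, else None.
        """
        eval_set = [
            Evaluation(data_id, data[data_id], cond(data[data_id]))
            for data_id, cond in self.conditions.items()
        ]
        if all(e.matched for e in eval_set):
            self._pending.append(eval_set)
        else:
            self._pending.clear()  # assumption: a false set resets the run
        if len(self._pending) >= self.required_sets:
            fired, self._pending = self._pending, []
            return fired  # the alert keeps all sets that led to it
        return None


# Default dampening: a single matching set fires immediately.
default_trigger = Trigger({"latencyMetric": lambda v: v > 1000.0})
immediate = default_trigger.evaluate({"latencyMetric": 1002.0})

# Dampened: three consecutive matching sets are needed.
dampened = Trigger({"latencyMetric": lambda v: v > 1000.0}, required_sets=3)
results = [dampened.evaluate({"latencyMetric": v}) for v in (1001.0, 1002.0, 1003.0)]
```

With the default of one set, the alert holds a single evaluation set; with `required_sets=3`, nothing fires until the third consecutive match, and the alert then holds all three sets.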
>>
>> Every datum has a dataId and some sort of value information. This is
>> likely where you would get your Tx Ids, and yes, it is captured and made
>> available. So perhaps this is what you are looking for.
>>
>
> Yes, it's possible - I need to try out the integration with metrics and alerts soon to make sure.
>
>>>
>>>> I have some questions that may help us refine your requirement:
>>>>
>>>> - The incoming data you need is something like a pair of (BT id, value) =
>>>> (bt1, 1000), (bt2, 1002) ...
>>>>
>>>> - Do you want to apply condition matching for a specific BT id?
>>> No, the condition wouldn't refer to the BT id - the id is only included for
>>> future reference to enable apps to link back to the business transaction
>>> that caused the alert.
>>>
>>>> - Or perhaps you have more complex incoming data?
>>>>
>>>> Can you elaborate with an example?
>>> Sure - so the business transaction information may be used to derive a
>>> latency metric, describing the time delay between one service sending a
>>> message and the recipient receiving the message. The metric will include
>>> the BT id associated with the latency. The alerts engine will then apply a
>>> condition to determine whether the latency is greater than a permitted
>>> value, and if so create an alert.
>>>
>>> We then need the resulting Alert to directly or indirectly contain the BT
>>> id, so that when a user investigates the latency problem, they could
>>> navigate back to the business transaction that caused the problem to see
>>> if there is anything of interest.
>>>
>>> Now in reality this would not happen, as latency would probably be
>>> aggregated and only if the aggregate value hit a threshold would an alert
>>> be triggered - but it provides a potential example of the end to end flow
>>> of business txn data collection -> metrics -> alerts and then having the
>>> necessary info in the alert to trace back.
>>
>> So the issue we may have here is that it seems you may want some
>> supplemental contextual info on the data being sent in. If I understand
>> the above, you would likely define a Threshold condition. The dataId
>> (like a metricId) is known in advance, say "latencyMetric". The value
>> is the Double, and would be the actual latency. But you also have
>> runtimeInfo, the BTID, not known in advance, but something you want to
>> store with the datum. Is that correct?
>
> Yes that is correct.
>
>>
>> So, perhaps an optional free-format string that could be supplied?
>>
>
> It may be better if it were a map. Then the information from the 'tags' on metrics, which are name/value pairs, could simply be copied over.
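
To make the map idea concrete, here is a rough sketch in Python, for illustration only. Nothing in it is the real Hawkular Alerts or Metrics API: `Datum`, `Alert`, `datum_from_metric`, `check_threshold` and the `btId` key are hypothetical. It assumes the datum carries an optional context map copied from the metric's tags, and that a threshold condition copies that map into the alert so the BT id can be traced back later.

```python
from dataclasses import dataclass, field
from typing import Dict, Mapping, Optional


@dataclass(frozen=True)
class Datum:
    """Incoming data: an id known in advance, a value, and optional context."""
    data_id: str
    value: float
    context: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Alert:
    """Keeps the evaluated value and the datum's context for later lookup."""
    data_id: str
    value: float
    threshold: float
    context: Dict[str, str]


def datum_from_metric(metric_id: str, value: float, tags: Mapping[str, str]) -> Datum:
    # The metric's name/value tags are copied over unchanged as context.
    return Datum(metric_id, value, dict(tags))


def check_threshold(datum: Datum, threshold: float) -> Optional[Alert]:
    # The condition looks only at the value; the context is just carried along.
    if datum.value > threshold:
        return Alert(datum.data_id, datum.value, threshold, dict(datum.context))
    return None


# The BT id is not known in advance; it arrives with each datum.
slow = datum_from_metric("latencyMetric", 1002.0, {"btId": "bt2"})
alert = check_threshold(slow, 1000.0)
```

The condition never refers to the BT id, which matches the requirement discussed earlier in the thread: the id is present only so that a client can link the alert back to the originating business transaction.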
>
>> I have another question, would it make any sense for the threshold
>> matching to be performed completely outside of Alerts? Meaning, let BTM
>> determine if there is a situation completely on its own, and then just
>> hook into alerting for the use of trigger dampening, life cycle, action
>> invocations, etc? This is basically the ExternalCondition hooks I'm
>> currently working on. It allows an external system to basically do the
>> condition evaluation, in any way it sees fit. The data sent to alerts
>> is assumed to already describe a "true" evaluation.
>
> This sounds like it might be useful for evaluating events.
>
> Regards
> Gary
>
>>
>>> _______________________________________________
>>> hawkular-dev mailing list
>>> hawkular-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/hawkular-dev