[Hawkular-dev] HOSA and conversion from prometheus to hawkular metrics

Jay Shaughnessy jshaughn at redhat.com
Wed Feb 1 10:15:48 EST 2017


Mazz, this makes sense to me.  Our decision to use unique ids (well, 
id+type) is going to lead to this sort of thing.  The ids are going to 
basically be large concatenations of the tags that identify the data.  
Then, additionally, we're going to have to tag the metrics with the same 
name/value pairs that are present in the id.  Are you also tagging the 
metrics with the Prometheus labels?
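
For example (a hypothetical illustration, using the default ID format 
from Mazz's mail below), one resulting metric definition would carry 
the same name/value pair in both its id and its tags:

    id:   jvm_memory_pool_bytes_committed{pool=Code Cache}
    tags: pool=Code Cache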

On 2/1/2017 9:38 AM, John Mazzitelli wrote:
> For the past several days I've been working on an enhancement to HOSA that came in from the community (in fact, I would consider it a bug). I'm about ready to merge the PR [1] for this and do a HOSA 1.1.0.Final release, so I wanted to post this to announce it and to see if there is any feedback.
>
> Today, HOSA collects metrics from any Prometheus endpoint that you declare - for example:
>
>     metrics:
>     - name: go_memstats_sys_bytes
>     - name: process_max_fds
>     - name: process_open_fds
>
> But if a Prometheus metric has labels, Prometheus itself considers each unique combination of labels to be an individual time series. This is different from how Hawkular Metrics works - each Hawkular Metrics metric ID (even if its metric definition or its datapoints have tags) identifies a single time series. We need to account for this difference. For example, if our agent is configured with:
>
>     metrics:
>     - name: jvm_memory_pool_bytes_committed
>
> And the Prometheus endpoint emits that metric with a label called "pool" like this:
>
>     jvm_memory_pool_bytes_committed{pool="Code Cache",} 2.7787264E7
>     jvm_memory_pool_bytes_committed{pool="PS Eden Space",} 2.3068672E7
>
> then to Prometheus this is actually 2 time series (the number of bytes committed per pool type), not 1. Even though the metric name is the same (what Prometheus calls a "metric family name"), there are two unique combinations of labels - one with "Code Cache" and one with "PS Eden Space" - so they are 2 distinct time series.
>
> Today, the agent creates only a single Hawkular-Metrics metric in this case, with each datapoint tagged with the Prometheus labels. But we don't want to aggregate them like that, since we lose the granularity that the Prometheus endpoint gives us (that is, the number of bytes committed in each pool type). We might be able to get that granularity back through datapoint tag queries in Hawkular-Metrics, but I don't know how well (if at all) those are supported, how efficient such queries would be even if supported, or how efficient storage would be if we tag every data point with these labels (I'm not sure that is the general purpose of tags in H-Metrics). But regardless, the fact that these really are different time series should (IMO) be represented as different time series (via metric definitions/metric IDs) in Hawkular-Metrics.
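>
> To illustrate with the datapoints above - today both samples land under one metric ID and are distinguishable only by their datapoint tags. Roughly (a sketch of the layout, not actual Hawkular-Metrics output):
>
>     jvm_memory_pool_bytes_committed
>         2.7787264E7    {pool="Code Cache"}
>         2.3068672E7    {pool="PS Eden Space"}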
>
> To support labeled Prometheus endpoint data like this, the agent needs to split this one named metric into N Hawkular-Metrics metrics (where N is the number of unique label combinations for that named metric). So even though the agent is configured with the single metric "jvm_memory_pool_bytes_committed", we actually need to create two Hawkular-Metrics metric definitions (with two different and unique metric IDs, obviously).
>
> The PR [1] that is ready to go does this. By default it will create multiple metric definitions/metric IDs in the form "metric-family-name{labelName1=labelValue1,labelName2=labelValue2,...}". If you want a different form, you can declare an "id" and put "${labelName}" placeholders in it (such as "${oneLabelName}_my_own_metric_name_${theOtherLabelName}" or whatever); see the sketch after the list below. But I suspect the default format will be what most people want, and thus nothing needs to be done. In the above example, two metric definitions with the following IDs are created:
>
> 1. jvm_memory_pool_bytes_committed{pool=Code Cache}
> 2. jvm_memory_pool_bytes_committed{pool=PS Eden Space}
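>
> If you do want custom IDs, a config along these lines (a hypothetical sketch using the "${labelName}" placeholder syntax described above; the "id" value itself is made up):
>
>     metrics:
>     - name: jvm_memory_pool_bytes_committed
>       id: ${pool}_committed_bytes
>
> would instead yield the IDs "Code Cache_committed_bytes" and "PS Eden Space_committed_bytes".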
>
> --John Mazz
>
> [1] https://github.com/hawkular/hawkular-openshift-agent/pull/117
>
