[Hawkular-dev] HOSA and conversion from prometheus to hawkular metrics

John Mazzitelli mazz at redhat.com
Wed Feb 1 10:25:16 EST 2017


> Are you also tagging the Prometheus metrics with the labels?

Yes, that is what was originally being done, and that is still in there.

----- Original Message -----
> 
> Mazz, this makes sense to me. Our decision to use unique ids (well +type) is
> going to lead to this sort of thing. The ids are going to basically be large
> concatenations of the tags that identify the data. Then, additionally we're
> going to have to tag the metrics with the same name/value pairs that are
> present in the id. Are you also tagging the Prometheus metrics with the
> labels?
> 
> On 2/1/2017 9:38 AM, John Mazzitelli wrote:
> 
> The past several days I've been working on an enhancement to HOSA that came
> in from the community (in fact, I would consider it a bug). I'm about ready
> to merge the PR [1] for this and do a HOSA 1.1.0.Final release. I wanted to
> post this to announce it and see if there is any feedback, too.
> 
> Today, HOSA collects metrics from any Prometheus endpoint that you declare,
> for example:
> 
>    metrics:
>    - name: go_memstats_sys_bytes
>    - name: process_max_fds
>    - name: process_open_fds
> 
> But if a Prometheus metric has labels, Prometheus itself considers each
> unique combination of labels on that metric to be an individual time
> series. This is different from how Hawkular Metrics works: each Hawkular
> Metrics metric ID (even if its metric definition or its datapoints have
> tags) is a single time series. We need to account for this difference. For
> example, if our agent is configured with:
> 
>    metrics:
>    - name: jvm_memory_pool_bytes_committed
> 
> And the Prometheus endpoint emits that metric with a label called "pool" like
> this:
> 
>    jvm_memory_pool_bytes_committed{pool="Code Cache",} 2.7787264E7
>    jvm_memory_pool_bytes_committed{pool="PS Eden Space",} 2.3068672E7
> 
> then to Prometheus this is actually 2 time series (the number of bytes
> committed per pool type), not 1. Even though the metric name is the same
> (what Prometheus calls a "metric family name"), there are two unique
> combinations of labels - one with "Code Cache" and one with "PS Eden Space"
> - so they are 2 distinct time series.
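
To illustrate the point above, here is a minimal Python sketch (not the agent's actual Go code) that parses sample lines like the two shown and groups them by metric family name plus label set. The names `parse_sample` and `group_by_series` are hypothetical, and the parser is deliberately simplified: it assumes no escaped quotes in label values, no trailing timestamps, and no comment lines.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
# Simplified on purpose: no escapes, no timestamps, no "# HELP"/"# TYPE" lines.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_sample(line):
    """Return (family_name, labels_dict, value) for one sample line."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized sample line: %r" % line)
    name, label_text, value = match.groups()
    labels = dict(LABEL_RE.findall(label_text or ""))
    return name, labels, float(value)


def group_by_series(lines):
    """Group values by (family name, sorted label pairs), one key per series."""
    series = {}
    for line in lines:
        name, labels, value = parse_sample(line)
        key = (name, tuple(sorted(labels.items())))
        series.setdefault(key, []).append(value)
    return series


lines = [
    'jvm_memory_pool_bytes_committed{pool="Code Cache",} 2.7787264E7',
    'jvm_memory_pool_bytes_committed{pool="PS Eden Space",} 2.3068672E7',
]
series = group_by_series(lines)
```

With the two lines from the example, `series` ends up with two keys that share the same family name but differ in their `pool` label, which is the "2 time series, not 1" situation described above.
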
> 
> Today, the agent creates only a single Hawkular-Metrics metric in this case
> and tags each datapoint with the Prometheus labels that apply to it. But we
> don't want to aggregate them like that, since we lose the granularity that
> the Prometheus endpoint gives us (that is, the number of bytes committed in
> each pool type). We might be able to get that granularity back through
> datapoint tag queries in Hawkular-Metrics, but I don't know how well (if at
> all) that is supported, how efficient such queries would be, or how
> efficiently these metrics would be stored if we tag every datapoint with
> these labels (I'm not sure that is the general purpose of tags in
> H-Metrics). Regardless, the fact that these really are different time
> series should (IMO) be represented as different time series (via metric
> definitions/metric IDs) in Hawkular-Metrics.
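
The trade-off above can be sketched with a toy in-memory model in Python. This is not the Hawkular-Metrics API or its storage format. The dictionary layout, the `series_for` helper, and the timestamp `0` are all assumptions made for illustration. Only the metric name, label values, and the two sample values come from the example.

```python
# Toy model of today's behavior: ONE metric ID whose datapoints each carry
# the Prometheus labels as tags. The timestamp 0 is a placeholder.
single_metric = {
    "id": "jvm_memory_pool_bytes_committed",
    "datapoints": [
        {"timestamp": 0, "value": 2.7787264e7, "tags": {"pool": "Code Cache"}},
        {"timestamp": 0, "value": 2.3068672e7, "tags": {"pool": "PS Eden Space"}},
    ],
}


def series_for(metric, tag_name, tag_value):
    """Recover one per-label series by scanning every datapoint's tags."""
    return [
        (dp["timestamp"], dp["value"])
        for dp in metric["datapoints"]
        if dp["tags"].get(tag_name) == tag_value
    ]


code_cache = series_for(single_metric, "pool", "Code Cache")
```

In this model, getting back the per-pool values means filtering every datapoint by tag, and every datapoint has to store its tags. With one metric ID per label combination, the separation is already part of the metric's identity.
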
> 
> To support labeled Prometheus endpoint data like this, the agent needs to
> split this one named metric into N Hawkular-Metrics metrics (where N is the
> number of unique label combinations for that named metric). So even though
> the agent is configured with the one metric
> "jvm_memory_pool_bytes_committed", we actually need to create two
> Hawkular-Metrics metric definitions (with two different, unique metric
> IDs).
> 
> The PR [1] that is ready to go does this. By default it creates multiple
> metric definitions/metric IDs in the form
> "metric-family-name{labelName1=labelValue1,labelName2=labelValue2,...}".
> If you want a different form, you can define an "id" and put
> "${labelName}" tokens in the ID you declare (such as
> "${oneLabelName}_my_own_metric_name_${theOtherLabelName}" or whatever). But
> I suspect the default format will be what most people want, so usually
> nothing needs to be done. In the above example, two metric definitions with
> the following IDs are created:
> 
> 1. jvm_memory_pool_bytes_committed{pool=Code Cache}
> 2. jvm_memory_pool_bytes_committed{pool=PS Eden Space}
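
The ID scheme described above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the code in the PR. The function names are hypothetical. Labels are emitted in the order given (the thread does not say how the agent orders them), a metric without labels is assumed to keep its plain family name, and unknown `${...}` tokens are assumed to be left untouched. The returned definition also carries the labels as tags, as confirmed at the top of this message.

```python
import re

# Matches ${labelName} tokens in a user-declared ID.
PLACEHOLDER_RE = re.compile(r"\$\{([^}]+)\}")


def default_id(family, labels):
    """Build 'family{name1=value1,name2=value2}' (plain family if no labels)."""
    if not labels:
        return family  # assumption: unlabeled metrics keep the plain name
    pairs = ",".join("%s=%s" % (name, value) for name, value in labels.items())
    return "%s{%s}" % (family, pairs)


def custom_id(template, labels):
    """Expand ${labelName} tokens; unknown tokens stay as-is (assumption)."""
    return PLACEHOLDER_RE.sub(
        lambda m: labels.get(m.group(1), m.group(0)), template
    )


def metric_definition(family, labels, id_template=None):
    """Return a toy metric definition: a unique ID plus the labels as tags."""
    if id_template is None:
        metric_id = default_id(family, labels)
    else:
        metric_id = custom_id(id_template, labels)
    return {"id": metric_id, "tags": dict(labels)}


family = "jvm_memory_pool_bytes_committed"
definitions = [
    metric_definition(family, {"pool": "Code Cache"}),
    metric_definition(family, {"pool": "PS Eden Space"}),
]
```

For the two label sets in the example, `definitions` holds the two IDs listed above. Passing a template such as `"${pool}_my_own_metric_name"` as `id_template` would produce IDs in that custom form instead.
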
> 
> --John Mazz
> 
> [1] https://github.com/hawkular/hawkular-openshift-agent/pull/117
> _______________________________________________
> hawkular-dev mailing list hawkular-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hawkular-dev

