[Hawkular-dev] HOSA and conversion from prometheus to hawkular metrics

Joel Takvorian jtakvori at redhat.com
Fri Feb 17 04:26:05 EST 2017


About the first point, I just answered here:
https://github.com/hawkular/hawkular-openshift-agent/issues/126
In short, there shouldn't be any problem with curly braces in Grafana.

On Fri, Feb 17, 2017 at 8:44 AM, Joel Takvorian <jtakvori at redhat.com> wrote:

> For the curly braces in Grafana, I'm going to investigate.
>
> For your second point, I'm trying to put myself in the shoes of an ops
> person: if I want to create a dashboard that shows a labelled metric (in
> terms of Prometheus labels), I'd like to see all its variants in the same
> chart to be able to compare them and see where they converge or diverge.
> And maybe compare them in all pods of a given container name. That would be
> queries with tags:
>
> Query tags:
> - container_name: something
> - family_name (or "metric_base_name", or whatever name we give to that
> tag): what_i_ate
>
> I can't be 100% sure that it's going to be used, as people do what they
> want in Grafana. But it seems interesting to me. The question is: what's
> the cost of adding a tag? I believe metric tags are relatively cheap in
> terms of storage. So, having both "metric_name" (what_i_ate{food=Banana})
> and "family_name" (what_i_ate) would solve all our issues, no?
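To illustrate the tag query described above, here is a minimal in-memory sketch. It does not call the Hawkular Metrics API; the `definitions` list, the pod names, the `query_by_tags` helper, and the `family_name` tag are all assumptions made up for this example.

```python
# Hypothetical metric definitions, each with a full ID and a set of tags.
definitions = [
    {"id": "pod/aaa/custom/what_i_ate{food=Banana}",
     "tags": {"container_name": "something",
              "family_name": "what_i_ate",
              "metric_name": "what_i_ate{food=Banana}"}},
    {"id": "pod/bbb/custom/what_i_ate{food=Apple}",
     "tags": {"container_name": "something",
              "family_name": "what_i_ate",
              "metric_name": "what_i_ate{food=Apple}"}},
    {"id": "pod/aaa/custom/process_open_fds",
     "tags": {"container_name": "something",
              "family_name": "process_open_fds",
              "metric_name": "process_open_fds"}},
    {"id": "pod/ccc/custom/what_i_ate{food=Apple}",
     "tags": {"container_name": "other",
              "family_name": "what_i_ate",
              "metric_name": "what_i_ate{food=Apple}"}},
]


def query_by_tags(defs, **wanted):
    """Return the IDs of definitions whose tags match every wanted key/value."""
    return [d["id"] for d in defs
            if all(d["tags"].get(key) == value for key, value in wanted.items())]


# All variants of what_i_ate, across all pods of container "something".
matches = query_by_tags(definitions,
                        container_name="something",
                        family_name="what_i_ate")
print(matches)
```

With only the full "metric_name" tag, the same selection would need one query per label combination, which is the case the extra tag is meant to avoid.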
>
>
>
> On Thu, Feb 16, 2017 at 6:23 PM, John Mazzitelli <mazz at redhat.com> wrote:
>
>> I need to resurrect this thread now that some others have had experience
>> with what we have - specifically, what Thomas reported in this issue:
>> https://github.com/hawkular/hawkular-openshift-agent/issues/126
>>
>> It has to do with Prometheus metrics and how HOSA names and tags them in
>> H-Metrics.
>>
>> Just some quick background first:
>>
>> Prometheus metrics have two parts - a "family name" (like
>> "http_response_count") and labels (like "method"). This means you can have
>> N metrics in Prometheus with the same metric family name but each with
>> different label values (like "http_response_count{method=GET}" and
>> "http_response_count{method=POST}"). Each unique combination of family
>> name plus label values represents a different set of time series data (so
>> http_response_count{method=GET} is one set of time series data and
>> http_response_count{method=POST} is another set of time series data).
>>
>> H-Metrics doesn't really have this concept of metric family. H-Metrics
>> has metric definitions each with unique names (or "metric IDs") and a set
>> of tags (h-metrics uses the name "tags" rather than "labels"). In
>> H-Metrics, you cannot have N metrics with the same name (ID). You must have
>> unique IDs to represent different sets of time series data.
>>
>> OK, with that quick intro, two things:
>>
>> =====
>>
>> 1) Metrics coming from Prometheus by default will be stored in H-Metrics
>> with metric IDs like:
>>
>>    metric_family_name{label_name1=value1,label_name2=value2}
>>
>> Basically, HOSA stores the H-Metric ID so it looks identical to the
>> metric data coming from Prometheus endpoints (name with labels
>> comma-separated and enclosed within curly braces).
>>
>> But Grafana might have issues with the curly braces. However, the
>> original opinion when this was first implemented in HOSA was that just
>> using underscores in H-Metrics IDs, for example:
>>
>>    metric_family_name_label_name1_value1_label_name2_value2
>>
>> would make querying from H-Metrics more difficult (it all looks like one
>> big name and it is hard to distinguish the labels in the name).
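To make the readability concern concrete, here is a small sketch comparing the two ID styles. The helpers `braces_id` and `underscore_id` and the label names are hypothetical, written only for this example; they are not HOSA code.

```python
def braces_id(family, labels):
    """Curly-brace style: family{name1=value1,name2=value2}."""
    body = ",".join(f"{name}={value}" for name, value in labels.items())
    return f"{family}{{{body}}}" if labels else family


def underscore_id(family, labels):
    """Underscore style: family_name1_value1_name2_value2."""
    parts = [family]
    for name, value in labels.items():
        parts += [name, value]
    return "_".join(parts)


family = "http_response_count"
# Two different (made-up) label sets that contain underscores themselves.
labels_a = {"code_class": "5xx"}
labels_b = {"code": "class_5xx"}

print(braces_id(family, labels_a))
print(braces_id(family, labels_b))
print(underscore_id(family, labels_a))
print(underscore_id(family, labels_b))
```

In this sketch the two curly-brace IDs stay distinct, while both underscore IDs come out as `http_response_count_code_class_5xx`, so the label boundaries can no longer be recovered from the name.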
>>
>> QUESTION #1a: Does Grafana really have an issue with displaying metrics
>> whose names have curly braces - {} - and commas in them?
>> QUESTION #1b: If so, what should the default metric ID look like when we
>> have Prometheus labels like this, if not by using curly braces and commas?
>>
>> =====
>>
>> 2) These Prometheus metrics don't look right in the current OpenShift UI.
>> If we have two Prometheus metrics stored in H-Metrics with the IDs:
>>
>> what_i_ate{food=Banana}
>> what_i_ate{food=Apple}
>>
>> what you see in the OpenShift UI console is two metric graphs each with
>> the same metric name "what_i_ate" - you don't know which ones they are.
>>
>> Why? Application metrics like these are now shown in the OpenShift UI and
>> it works fine even for Prometheus metrics UNLESS the Prometheus metrics had
>> labels (like the example above with Prometheus labels food=Apple or
>> food=Banana). This is because when we tag these metrics in H-Metrics, one
>> tag we add to the metric definition is "metric_name" and for Prometheus the
>> value of this tag is the METRIC FAMILY name. This is what Joel was asking
>> for (see the last messages in this thread). But the OS UI console uses this
>> metric_name tag for the label of the graph (the full, real ID of the metric
>> is ugly to make sure it's unique within the cluster - e.g.
>> "pod/3e4553ew-34553d-345433-123a/custom/what_i_ate{food=Banana}" - so we
>> don't really want to show that to a user).
>>
>> QUESTION #2a: Should I switch back and make metric_name be the last part
>> of the actual metric ID (not Prometheus family name) like
>> "what_i_ate{food=Banana}" so the OS UI console works? Or do we fix the OS
>> UI console to parse the full metric ID and only show the last part (after
>> the "/custom/" part) thus leaving the "metric_name" tag in H-Metrics as the
>> Prometheus metric family name and making querying easier (à la Joel's
>> suggestion)?
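The second option in QUESTION #2a (show only the part of the ID after "/custom/") could look roughly like the sketch below. The function name `display_name` and the fallback behavior for IDs without the marker are assumptions for illustration; this is not how the OpenShift console is implemented.

```python
def display_name(metric_id, marker="/custom/"):
    """Return the text after the first marker, or the whole ID if it is absent."""
    _, sep, tail = metric_id.partition(marker)
    # partition() returns an empty separator when the marker is not found.
    return tail if sep else metric_id


full_id = "pod/3e4553ew-34553d-345433-123a/custom/what_i_ate{food=Banana}"
print(display_name(full_id))
```

Splitting on the first occurrence rather than the last is a deliberate choice here: the pod prefix comes first, and a label value could in principle contain the marker text.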
>>
>> QUESTION #2b: Is the metric family name a useful thing to have as an
>> H-Metrics tag in the first place? If so, I will have to get HOSA to create a
>> new tag "base_metric_name" if "metric_name" is to be fixed to get the OS UI
>> to work. But is having the Prometheus metric family name even a useful
>> thing? Joel seemed to think so; I would like to make sure it is a useful
>> thing before I go and implement this change.
>>
>> ----- Forwarded Message -----
>> From: "John Mazzitelli" <mazz at redhat.com>
>> To: "Discussions around Hawkular development" <
>> hawkular-dev at lists.jboss.org>
>> Sent: Wednesday, February 1, 2017 11:47:18 AM
>> Subject: Re: [Hawkular-dev] HOSA and conversion from prometheus to
>> hawkular metrics
>>
>> https://github.com/hawkular/hawkular-openshift-agent/blob/master/deploy/openshift/hawkular-openshift-agent-configmap.yaml#L20
>>
>> :D
>>
>> That's already there - the ${METRIC:name} resolves to the name of the
>> metric (not the new ID) and our default config puts that tag on every
>> metric.
>>
>>
>> ----- Original Message -----
>> >
>> > +1, if that is not being done I think it would be good. Actually, it's
>> > probably a good "best practice" as it makes it easier to slice and dice
>> > the data.
>> >
>> > On 2/1/2017 10:35 AM, Joel Takvorian wrote:
>> >
>> >
>> >
>> > +1
>> >
>> > Conversion based on labels seems more sane.
>> >
>> > I wonder if a new tag that recalls the prometheus metric name would be
>> > useful; ex. "baseName=jvm_memory_pool_bytes_committed", to retrieve all
>> > metrics of that family. Just an idea.
>> >
>> > On Wed, Feb 1, 2017 at 4:25 PM, John Mazzitelli < mazz at redhat.com > wrote:
>> >
>> >
>> > > Are you also tagging the Prometheus metrics with the labels?
>> >
>> > Yes, that is what was originally being done, and that is still in there.
>> >
>> > ----- Original Message -----
>> > >
>> > > Mazz, this makes sense to me. Our decision to use unique ids (well
>> > > +type) is going to lead to this sort of thing. The ids are going to
>> > > basically be large concatenations of the tags that identify the data.
>> > > Then, additionally we're going to have to tag the metrics with the same
>> > > name/value pairs that are present in the id. Are you also tagging the
>> > > Prometheus metrics with the labels?
>> > >
>> > > On 2/1/2017 9:38 AM, John Mazzitelli wrote:
>> > >
>> > >
>> > >
>> > > The past several days I've been working on an enhancement to HOSA that
>> > > came in from the community (in fact, I would consider it a bug). I'm
>> > > about ready to merge the PR [1] for this and do a HOSA 1.1.0.Final
>> > > release. I wanted to post this to announce it and see if there is any
>> > > feedback, too.
>> > >
>> > > Today, HOSA collects metrics from any Prometheus endpoint which you
>> > > declare - example:
>> > >
>> > > metrics:
>> > > - name: go_memstats_sys_bytes
>> > > - name: process_max_fds
>> > > - name: process_open_fds
>> > >
>> > > But if a Prometheus metric has labels, Prometheus itself considers each
>> > > metric with a unique combination of labels as an individual time series
>> > > metric. This is different from how Hawkular Metrics works - each
>> > > Hawkular Metrics metric ID (even if its metric definition or its
>> > > datapoints have tags) is a single time series metric. We need to account
>> > > for this difference. For example, if our agent is configured with:
>> > >
>> > > metrics:
>> > > - name: jvm_memory_pool_bytes_committed
>> > >
>> > > And the Prometheus endpoint emits that metric with a label called "pool"
>> > > like this:
>> > >
>> > > jvm_memory_pool_bytes_committed{pool="Code Cache",} 2.7787264E7
>> > > jvm_memory_pool_bytes_committed{pool="PS Eden Space",} 2.3068672E7
>> > >
>> > > then to Prometheus this is actually 2 time series metrics (the number of
>> > > bytes committed per pool type), not 1. Even though the metric name is the
>> > > same (what Prometheus calls a "metric family name"), there are two unique
>> > > combinations of labels - one with "Code Cache" and one with "PS Eden
>> > > Space" - so they are 2 distinct sets of time series data.
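As a rough sketch of why those two lines count as two series, the snippet below parses sample lines and keys each series by family name plus label set. It is a simplified parser written for this example (no timestamps, no comment lines, no unescaping of label values), not the agent's actual code; `parse_sample` and both regular expressions are assumptions.

```python
import re

# Simplified patterns for "family{name="value",...} number" sample lines.
LINE_RE = re.compile(
    r'^(?P<family>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (family name, labels dict, float value)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("family"), labels, float(match.group("value"))


text = """\
jvm_memory_pool_bytes_committed{pool="Code Cache",} 2.7787264E7
jvm_memory_pool_bytes_committed{pool="PS Eden Space",} 2.3068672E7
"""

# One entry per unique (family name, label set) combination.
series = {}
for line in text.splitlines():
    family, labels, value = parse_sample(line)
    series[(family, tuple(sorted(labels.items())))] = value

print(len(series))  # 2: same family name, two label combinations
```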
>> > >
>> > > Today, the agent only creates a single Hawkular-Metric in this case, with
>> > > each datapoint tagged with the Prometheus labels that apply to it. But we
>> > > don't want to aggregate them like that since we lose the granularity that
>> > > the Prometheus endpoint gives us (that is, the number of bytes committed
>> > > in each pool type). We might be able to get that granularity back through
>> > > datapoint tag queries in Hawkular-Metrics, but I don't know how well (if
>> > > at all) that is supported, how efficient such queries would be, or how
>> > > efficient storage would be if we tag every data point with these labels
>> > > (I'm not sure that is the general purpose of tags in H-Metrics).
>> > > Regardless, since these really are different time series metrics, they
>> > > should (IMO) be represented as different time series metrics (via metric
>> > > definitions/metric IDs) in Hawkular-Metrics.
>> > >
>> > > To support labeled Prometheus endpoint data like this, the agent needs to
>> > > split this one named metric into N Hawkular-Metrics metrics (where N is
>> > > the number of unique label combinations for that named metric). So even
>> > > though the agent is configured with the one metric
>> > > "jvm_memory_pool_bytes_committed" we need to actually create two
>> > > Hawkular-Metrics metric definitions (with two different and unique metric
>> > > IDs, obviously).
>> > >
>> > > The PR [1] that is ready to go does this. By default it will create
>> > > multiple metric definitions/metric IDs in the form
>> > > "metric-family-name{labelName1=labelValue1,labelName2=labelValue2,...}"
>> > > unless you want a different form, in which case you can define an "id"
>> > > and put "${labelName}" in the ID you declare (such as
>> > > "${oneLabelName}_my_own_metric_name_${theOtherLabelName}" or whatever).
>> > > But I suspect the default format will be what most people want and thus
>> > > nothing needs to be done. In the above example, two metric definitions
>> > > with the following IDs are created:
>> > >
>> > > 1. jvm_memory_pool_bytes_committed{pool=Code Cache}
>> > > 2. jvm_memory_pool_bytes_committed{pool=PS Eden Space}
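The default form and the "${labelName}" form described above can be sketched as follows. This is an illustration only, not the code from the PR: `default_id` and `custom_id` are hypothetical helpers, labels are emitted in dictionary order, and leaving unknown tokens untouched is an assumption about behavior the thread does not specify.

```python
import re


def default_id(family, labels):
    """Default form: family{labelName1=labelValue1,labelName2=labelValue2}."""
    body = ",".join(f"{name}={value}" for name, value in labels.items())
    return f"{family}{{{body}}}" if labels else family


def custom_id(template, labels):
    """Replace each ${labelName} token with that label's value.

    Tokens that do not name a known label are left as-is (an assumption).
    """
    return re.sub(r"\$\{(\w+)\}",
                  lambda m: labels.get(m.group(1), m.group(0)),
                  template)


family = "jvm_memory_pool_bytes_committed"
for labels in ({"pool": "Code Cache"}, {"pool": "PS Eden Space"}):
    print(default_id(family, labels))
    print(custom_id("${pool}_bytes_committed", labels))
```

The first `print` in each iteration reproduces the two IDs listed above; the template `${pool}_bytes_committed` is a made-up example of a user-declared "id".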
>> > >
>> > > --John Mazz
>> > >
>> > > [1] https://github.com/hawkular/hawkular-openshift-agent/pull/117