[Hawkular-dev] HOSA and conversion from prometheus to hawkular metrics

Jay Shaughnessy jshaughn at redhat.com
Fri Feb 17 16:13:35 EST 2017


+1.  It seems to me that underlying metric IDs are something we just 
want to hide as an implementation detail. Querying for a "family name" 
and narrowing by other tags gives you a useful set of time series.
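As a sketch of what "query by family name, narrow by tags" could look like against the H-Metrics REST API (the endpoint path and the comma-separated `name:value` tag-filter syntax are my understanding of that API, so treat them as assumptions):

```python
# Build an H-Metrics gauge query URL that filters by metric-definition
# tags, so callers never see the underlying metric IDs.
from urllib.parse import urlencode

def family_query_url(base_url, tags):
    """Return a gauge query URL filtering by the given tag name/value pairs."""
    # Assumed H-Metrics filter syntax: comma-separated name:value pairs.
    tag_expr = ",".join(f"{k}:{v}" for k, v in sorted(tags.items()))
    return f"{base_url}/hawkular/metrics/gauges?{urlencode({'tags': tag_expr})}"

url = family_query_url("http://metrics.example.com",
                       {"family_name": "what_i_ate",
                        "container_name": "something"})
print(url)
```

The tag names (`family_name`, `container_name`) are the ones proposed in this thread, not an existing schema.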

On 2/17/2017 2:44 AM, Joel Takvorian wrote:
> For the curly braces in Grafana, I'm going to investigate.
>
> For your second point, I'm trying to put myself in the shoes of an ops 
> person: if I want to create a dashboard that shows a labelled metric 
> (in terms of Prometheus labels), I'd like to see all its avatars in 
> the same chart to be able to compare them, and see where they converge 
> or diverge. And maybe compare them across all pods of a given container name. 
> That would be queries with tags:
>
> Query tags:
> - container_name: something
> - family_name (or "metric_base_name", or whatever name we give to that 
> tag): what_i_ate
>
> I can't be 100% sure that it's going to be used, as people do what 
> they want in Grafana. But it seems interesting to me. The question is: 
> what's the cost of adding a tag? I believe metric tags are relatively 
> cheap in terms of storage. So, having both "metric_name" 
> (what_i_ate{food=Banana}) and "family_name" (what_i_ate) would solve 
> all our issues, no?
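Concretely, what's proposed here could look like this on one metric definition (illustrative JSON only; these tag names come from this thread's proposal, not an existing schema):

```json
{
  "id": "what_i_ate{food=Banana}",
  "tags": {
    "metric_name": "what_i_ate{food=Banana}",
    "family_name": "what_i_ate",
    "food": "Banana",
    "container_name": "something"
  }
}
```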
>
>
>
> On Thu, Feb 16, 2017 at 6:23 PM, John Mazzitelli <mazz at redhat.com> wrote:
>
>     I need to resurrect this thread now that some others have had
>     experience with what we have - specifically, what Thomas reported
>     in this issue:
>     https://github.com/hawkular/hawkular-openshift-agent/issues/126
>
>     It has to do with Prometheus metrics and how HOSA names and tags
>     them in H-Metrics.
>
>     Just some quick background first:
>
>     Prometheus metrics have two parts - a "family name" (like
>     "http_response_count") and labels (like "method"). This means you
>     can have N metrics in Prometheus with the same metric family name
>     but each with different label values (like
>     "http_response_count{method=GET}" and
>     "http_response_count{method=POST}"). Each unique combination of
>     family name plus label values represents a different set of time
>     series data (so http_response_count{method=GET} is one set of time
>     series data and http_response_count{method=POST} is another set of
>     time series data).
>
>     H-Metrics doesn't really have this concept of metric family.
>     H-Metrics has metric definitions each with unique names (or
>     "metric IDs") and a set of tags (h-metrics uses the name "tags"
>     rather than "labels"). In H-Metrics, you cannot have N metrics
>     with the same name (ID). You must have unique IDs to represent
>     different sets of time series data.
>
>     OK, with that quick intro, two things:
>
>     =====
>
>     1) Metrics coming from Prometheus by default will be stored in
>     H-Metrics with metric IDs like:
>
>        metric_family_name{label_name1=value1,label_name2=value2}
>
>     Basically, HOSA stores the H-Metric ID so it looks identical to
>     the metric data coming from Prometheus endpoints (name with labels
>     comma-separated and enclosed within curly braces).
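The default ID scheme described above can be sketched like this (illustrative only; HOSA's actual Go implementation may order or escape labels differently):

```python
# Render a metric family name plus its labels into the default
# H-Metrics ID form: family{label1=value1,label2=value2,...},
# matching the Prometheus text exposition shape.
def default_metric_id(family, labels):
    if not labels:
        return family
    rendered = ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    return f"{family}{{{rendered}}}"

print(default_metric_id("what_i_ate", {"food": "Banana"}))
# -> what_i_ate{food=Banana}
```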
>
>     But Grafana might have issues with the curly braces. However, the
>     original opinion when this was first implemented in HOSA was that
>     just using underscores in H-Metrics IDs, for example:
>
>        metric_family_name_label_name1_value1_label_name2_value2
>
>     would make querying from H-Metrics more difficult (it all looks
>     like one big name and it is hard to distinguish the labels in the
>     name).
>
>     QUESTION #1a: Does Grafana really have an issue with displaying
>     metrics whose names have curly braces - {} - and commas in them?
>     QUESTION #1b: If so, what should the default metric ID look like
>     when we have Prometheus labels like this, if not by using curly
>     braces and commas?
>
>     =====
>
>     2) These Prometheus metrics don't look right in the current
>     OpenShift UI. If we have two Prometheus metrics stored in
>     H-Metrics with the IDs:
>
>     what_i_ate{food=Banana}
>     what_i_ate{food=Apple}
>
>     what you see in the OpenShift UI console is two metric graphs each
>     with the same metric name "what_i_ate" - you can't tell which is
>     which.
>
>     Why? Application metrics like these are now shown in the OpenShift
>     UI and it works fine even for Prometheus metrics UNLESS the
>     Prometheus metrics had labels (like the example above with
>     Prometheus labels food=Apple or food=Banana). This is because when
>     we tag these metrics in H-Metrics, one tag we add to the metric
>     definition is "metric_name" and for Prometheus the value of this
>     tag is the METRIC FAMILY name. This is what Joel was asking for
>     (see the last messages in this thread). But the OS UI console uses
>     this metric_name tag for the label of the graph (the full, real ID
>     of the metric is ugly to make sure it's unique within the cluster -
>     e.g.
>     "pod/3e4553ew-34553d-345433-123a/custom/what_i_ate{food=Banana}" -
>     so we don't really want to show that to a user).
>
>     QUESTION #2a: Should I switch back and make metric_name be the
>     last part of the actual metric ID (not Prometheus family name)
>     like "what_i_ate{food=Banana}" so the OS UI console works? Or do
>     we fix the OS UI console to parse the full metric ID and only show
>     the last part (after the "/custom/" part) thus leaving
>     "metric_name" tag in H-Metrics be the Prometheus metric family
>     name and make querying easier (a la Joel's suggestion)?
>
>     QUESTION #2b: Is having metric family name a useful thing to have
>     as a H-Metric tag in the first place? If so, I will have to get
>     HOSA to create a new tag "base_metric_name" if "metric_name" is to
>     be fixed to get the OS UI to work. But does having the Prometheus
>     metric family name even a useful thing? Joel seemed to think so; I
>     would like to make sure it is a useful thing before I go and
>     implement this change.
>
>     ----- Forwarded Message -----
>     From: "John Mazzitelli" <mazz at redhat.com>
>     To: "Discussions around Hawkular development"
>     <hawkular-dev at lists.jboss.org>
>     Sent: Wednesday, February 1, 2017 11:47:18 AM
>     Subject: Re: [Hawkular-dev] HOSA and conversion from prometheus to
>     hawkular metrics
>
>     https://github.com/hawkular/hawkular-openshift-agent/blob/master/deploy/openshift/hawkular-openshift-agent-configmap.yaml#L20
>
>     :D
>
>     That's already there - the ${METRIC:name} resolves to the name of
>     the metric (not the new ID) and our default config puts that tag
>     on every metric.
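For context, the default config referenced above applies that tag along these lines (paraphrased from memory; the exact surrounding keys may differ, and the linked configmap is the authoritative version):

```yaml
collector:
  tags:
    metric_name: ${METRIC:name}
```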
>
>
>     ----- Original Message -----
>     >
>     > +1, if that is not being done I think it would be good. Actually,
>     > it's probably a good "best practice" as it makes it easier to
>     > slice and dice the data.
>     >
>     > On 2/1/2017 10:35 AM, Joel Takvorian wrote:
>     >
>     >
>     >
>     > +1
>     >
>     > Conversion based on labels seems more sane.
>     >
>     > I wonder if a new tag that recalls the Prometheus metric name
>     > would be useful; e.g. "baseName=jvm_memory_pool_bytes_committed",
>     > to retrieve all metrics of that family. Just an idea.
>     >
>     > On Wed, Feb 1, 2017 at 4:25 PM, John Mazzitelli
>     > <mazz at redhat.com> wrote:
>     >
>     >
>     > > Are you also tagging the Prometheus metrics with the labels?
>     >
>     > Yes, that is what was originally being done, and that is still
>     in there.
>     >
>     > ----- Original Message -----
>     > >
>     > > Mazz, this makes sense to me. Our decision to use unique ids
>     (well +type)
>     > > is
>     > > going to lead to this sort of thing. The ids are going to
>     basically be
>     > > large
>     > > concatenations of the tags that identify the data. Then,
>     additionally we're
>     > > going to have to tag the metrics with the same name/value
>     pairs that are
>     > > present in the id. Are you also tagging the Prometheus metrics
>     with the
>     > > labels?
>     > >
>     > > On 2/1/2017 9:38 AM, John Mazzitelli wrote:
>     > >
>     > >
>     > >
>     > > The past several days I've been working on an enhancement to
>     HOSA that came
>     > > in from the community (in fact, I would consider it a bug).
>     I'm about ready
>     > > to merge the PR [1] for this and do a HOSA 1.1.0.Final
>     release. I wanted to
>     > > post this to announce it and see if there is any feedback, too.
>     > >
>     > > Today, HOSA collects metrics from any Prometheus endpoint
>     which you declare
>     > > -
>     > > example:
>     > >
>     > > metrics:
>     > > - name: go_memstats_sys_bytes
>     > > - name: process_max_fds
>     > > - name: process_open_fds
>     > >
>     > > But if a Prometheus metric has labels, Prometheus itself
>     considers each
>     > > metric with a unique combination of labels as an individual
>     time series
>     > > metric. This is different from how Hawkular Metrics works -
>     each Hawkular
>     > > Metric metric ID (even if its metric definition or its
>     datapoints have
>     > > tags)
>     > > is a single time series metric. We need to account for this
>     difference. For
>     > > example, if our agent is configured with:
>     > >
>     > > metrics:
>     > > - name: jvm_memory_pool_bytes_committed
>     > >
>     > > And the Prometheus endpoint emits that metric with a label
>     called "pool"
>     > > like
>     > > this:
>     > >
>     > > jvm_memory_pool_bytes_committed{pool="Code Cache",} 2.7787264E7
>     > > jvm_memory_pool_bytes_committed{pool="PS Eden Space",} 2.3068672E7
>     > >
>     > > then to Prometheus this is actually 2 time series metrics (the
>     number of
>     > > bytes committed per pool type), not 1. Even though the metric
>     name is the
>     > > same (what Prometheus calls a "metric family name"), there are
>     two unique
>     > > combinations of labels - one with "Code Cache" and one with
>     "PS Eden Space"
>     > > - so they are 2 distinct sets of time series data.
>     > >
>     > > Today, the agent only creates a single Hawkular-Metric in this
>     case, with
>     > > each datapoint tagged with its Prometheus
>     > > labels. But we don't want to aggregate them like that since we
>     lose the
>     > > granularity that the Prometheus endpoint gives us (that is,
>     the number of
>     > > bytes committed in each pool type). I will say I think we
>     might be able to
>     > > get that granularity back through datapoint tag queries in
>     Hawkular-Metrics
>     > > but I don't know how well (if at all) that is supported and
>     how efficient
>     > > such queries would be even if supported, and how efficient
>     storage of these
>     > > metrics would be if we tag every data point with these labels
>     (not sure if
>     > > that is the general purpose of tags in H-Metrics). But,
>     regardless, the
>     > > fact
>     > > that these really are different time series metrics should
>     (IMO) be
>     > > represented as different time series metrics (via metric
>     definitions/metric
>     > > IDs) in Hawkular-Metrics.
>     > >
>     > > To support labeled Prometheus endpoint data like this, the
>     agent needs to
>     > > split this one named metric into N Hawkular-Metrics metrics
>     (where N is the
>     > > number of unique label combinations for that named metric). So
>     even though
>     > > the agent is configured with the one metric
>     > > "jvm_memory_pool_bytes_committed" we need to actually create two
>     > > Hawkular-Metric metric definitions (with two different and
>     unique metric
>     > > IDs
>     > > obviously).
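The split described above can be sketched like this (illustrative Python, not HOSA's actual Go code): group one scrape's samples by unique label combination, producing one H-Metrics ID per combination.

```python
# Split one configured Prometheus metric family into N per-label-set
# series, each keyed by a unique H-Metrics metric ID.
from collections import defaultdict

def split_by_labels(family, samples):
    """samples: list of (labels_dict, value) pairs from one scrape."""
    series = defaultdict(list)
    for labels, value in samples:
        rendered = ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
        metric_id = f"{family}{{{rendered}}}" if labels else family
        series[metric_id].append(value)
    return dict(series)

scrape = [({"pool": "Code Cache"}, 2.7787264e7),
          ({"pool": "PS Eden Space"}, 2.3068672e7)]
ids = split_by_labels("jvm_memory_pool_bytes_committed", scrape)
print(sorted(ids))
# -> ['jvm_memory_pool_bytes_committed{pool=Code Cache}',
#     'jvm_memory_pool_bytes_committed{pool=PS Eden Space}']
```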
>     > >
>     > > The PR [1] that is ready to go does this. By default it will
>     create
>     > > multiple
>     > > metric definitions/metric IDs in the form
>     > >
>     "metric-family-name{labelName1=labelValue1,labelName2=labelValue2,...}"
>     > > unless you want a different form in which case you can define
>     an "id" and
>     > > put in "${labelName}" in the ID you declare (such as
>     > > "${oneLabelName}_my_own_metric_name_${theOtherLabelName}" or
>     whatever). But
>     > > I suspect the default format will be what most people want and
>     thus nothing
>     > > needs to be done. In the above example, two metric definitions
>     with the
>     > > following IDs are created:
>     > >
>     > > 1. jvm_memory_pool_bytes_committed{pool=Code Cache}
>     > > 2. jvm_memory_pool_bytes_committed{pool=PS Eden Space}
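The optional "id" template mentioned above could behave roughly like Python's `string.Template`, which happens to use the same `${...}` placeholder syntax (a sketch only; HOSA's real substitution rules may differ):

```python
# Expand a user-declared ID template by substituting ${labelName}
# placeholders with the corresponding Prometheus label values.
from string import Template

def expand_id(template, labels):
    return Template(template).substitute(labels)

print(expand_id("${pool}_my_own_metric_name", {"pool": "Code_Cache"}))
# -> Code_Cache_my_own_metric_name
```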
>     > >
>     > > --John Mazz
>     > >
>     > > [1]
>     https://github.com/hawkular/hawkular-openshift-agent/pull/117
>     _______________________________________________
>     hawkular-dev mailing list
>     hawkular-dev at lists.jboss.org
>     https://lists.jboss.org/mailman/listinfo/hawkular-dev
>
>
>
>
