[Hawkular-dev] Availability metrics: aggregate stats series

Jay Shaughnessy jshaughn at redhat.com
Tue Aug 30 12:14:27 EDT 2016


Certainly you can't base overall application health on some random 
aggregate avail.  It's an indicator, like so many other things, of where 
problems could lie.  I don't think there is anything wrong with a 
percentage as a quick indicator; from there you'd drill down as needed.  
As Joel says, it also depends on what you choose to aggregate.  When 
your URL response times are degraded, likely firing alerts, you then 
want to understand why.  Aggregate avails could help answer that 
question.  There are always examples of how to misuse tools: a hammer can 
easily break your thumb, but that doesn't mean hammers are bad.
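
To make the ratio idea from the quoted thread concrete, here is a minimal 
Python sketch of aggregating a chosen group of availability series into 
an "up" ratio, then collapsing that ratio to up/down with "all of" or 
"any of".  The data model and the names up_ratio and to_binary are 
assumptions for illustration only; this is not the Hawkular REST API.

```python
from typing import Dict, List

# Hypothetical data model: metric id -> list of samples, True meaning "up".
# All series are assumed to be aligned on the same timestamps.
Series = Dict[str, List[bool]]


def up_ratio(series: Series, ids: List[str]) -> List[float]:
    """Fraction of the selected metrics that are up at each timestamp."""
    selected = [series[i] for i in ids]
    return [sum(samples) / len(samples) for samples in zip(*selected)]


def to_binary(ratios: List[float], method: str) -> List[bool]:
    """Collapse ratios to up/down: "all_of" needs 1.0, "any_of" needs > 0."""
    if method == "all_of":
        return [r == 1.0 for r in ratios]
    if method == "any_of":
        return [r > 0.0 for r in ratios]
    raise ValueError(f"unknown aggregation method: {method}")


series = {
    "frontend-1": [True, True, False],
    "frontend-2": [True, False, False],
    "backend-1": [True, True, True],
}
# Aggregate only the front-end group, not everything.
frontend = up_ratio(series, ["frontend-1", "frontend-2"])
print(frontend)                       # [1.0, 0.5, 0.0]
print(to_binary(frontend, "all_of"))  # [True, False, False]
print(to_binary(frontend, "any_of"))  # [True, True, False]
```

The ratio keeps the full information, and either binary view can be 
derived from it on the display side.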

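The MySQL scenario quoted below shows why the drill-down matters.  As a 
rough sketch, assuming a hypothetical snapshot of four replica sets with 
one primary and one replica each, two outages with the same number of 
instances down give the same ratio but very different situations:

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot: replica set name -> (primary_up, replica_up).
ReplicaSets = Dict[str, Tuple[bool, bool]]


def up_ratio(sets: ReplicaSets) -> float:
    """Fraction of all instances that are up."""
    states = [up for pair in sets.values() for up in pair]
    return sum(states) / len(states)


def sets_fully_down(sets: ReplicaSets) -> List[str]:
    """Names of replica sets with neither primary nor replica up."""
    return [name for name, pair in sets.items() if not any(pair)]


# One whole set lost: the primary and its replica are both down.
one_set_lost = {"a": (False, False), "b": (True, True),
                "c": (True, True), "d": (True, True)}
# Two replicas lost in different sets: every primary is still up.
two_replicas_lost = {"a": (True, False), "b": (True, False),
                     "c": (True, True), "d": (True, True)}

print(up_ratio(one_set_lost), sets_fully_down(one_set_lost))            # 0.75 ['a']
print(up_ratio(two_replicas_lost), sets_fully_down(two_replicas_lost))  # 0.75 []
```

The overall ratio flags that something is wrong; a per-set aggregate is 
what tells the two cases apart.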

On 8/30/2016 11:45 AM, Joel Takvorian wrote:
> Just a clarification, because I'm not sure I was clear on that: the idea 
> is to mix series based on a list of ids, or tags. Not *everything*.
>
> On Tue, Aug 30, 2016 at 5:39 PM, Joel Takvorian <jtakvori at redhat.com> 
> wrote:
>
>     I agree that you won't want to mix everything, but you can still
>     adopt some groupings that are meaningful, for instance group all
>     front-end servers into a front-end availability series, and all
>     back-ends into another series.
>
>     Moreover, once you get all the availability as a ratio, it's easy to
>     map it to a binary availability if that's what you're looking for.
>     The REST API will provide the data, then it's up to you to display
>     what is most relevant. I think ratio datapoints are
>     easy-to-use, yet complete, information.
>
>     Joel
>
>     On Tue, Aug 30, 2016 at 5:16 PM, Michael Burman
>     <miburman at redhat.com> wrote:
>
>         Hi,
>
>         So say I have 8 MySQLs: 4 primaries, 4 replicas. One primary is
>         down and the replica of that set is down as well. I request
>         the availability of my datastore and I get 75% UP. If I had two
>         replicas down instead, I would also get 75% UP. There's a huge
>         difference between these scenarios.
>
>         I'm not a fan of percents for that simple reason. Is my
>         service up? Yes, it's 99% up, only all the front-end servers
>         are down.. ugh.
>
>           -  Micke
>
>         ----- Original Message -----
>         From: "John Sanda" <jsanda at redhat.com>
>         To: "Discussions around Hawkular development"
>         <hawkular-dev at lists.jboss.org>
>         Sent: Tuesday, August 30, 2016 4:11:07 PM
>         Subject: Re: [Hawkular-dev] Availability metrics: aggregate
>         stats series
>
>         I like the idea of aggregated availabilities, but I don’t know
>         that it can easily be simplified to up/down. Let’s say we have
>         3 Cassandra nodes deployed with replication_factor = 1.  If
>         one node goes down we are at 66% availability.
>
>         > On Aug 29, 2016, at 3:24 AM, Joel Takvorian
>         > <jtakvori at redhat.com> wrote:
>         >
>         > Hello all,
>         >
>         > I'm still aiming to add some features to the grafana plugin.
>         > I've started to integrate availabilities, but now I'm facing a
>         > problem when it comes to showing aggregated availabilities; for
>         > example, think about an OpenShift pod that is scaled up to
>         > several instances.
>         >
>         > Since availability is basically "up" or "down" (or, to
>         > simplify with the other states such as "unknown", say it's
>         > either "up" or "non-up"), I propose to add this new feature:
>         > availability stats with aggregation. The call would be
>         > parameterized with an aggregation method, which would be
>         > either "all of" or "any of": with "all of" we consider that
>         > the aggregated series is UP when all its parts are UP.
>         >
>         > It would require a new endpoint, since the
>         > AvailabilityHandler currently only exposes stats queries with
>         > a metric id as query parameter - not suitable for multiple
>         > metrics.
>         >
>         > Any objection or remark for this feature?
>         >
>         > Joel
>         > _______________________________________________
>         > hawkular-dev mailing list
>         > hawkular-dev at lists.jboss.org
>         > https://lists.jboss.org/mailman/listinfo/hawkular-dev
