Certainly you can't base overall application health on some random aggregate avail.  It's an indicator, like so many other things, of where problems could lie.  I don't think there is anything wrong with a percentage as a quick indicator; from there you'd drill down as needed.  As Joel says, it also depends on what you choose to aggregate.  When your URL response times are degraded, likely firing alerts, you then want to understand why.  Aggregate avails could help answer that question. There are always examples of how to misuse tools: a hammer can easily break your thumb, but that doesn't mean hammers are bad.


On 8/30/2016 11:45 AM, Joel Takvorian wrote:
Just a clarification, because I'm not sure I was clear on that: the idea is to mix series based on a list of ids or tags, not *everything*.

On Tue, Aug 30, 2016 at 5:39 PM, Joel Takvorian <jtakvori@redhat.com> wrote:
I agree that you won't want to mix everything, but you can still adopt some groupings that are meaningful, for instance group all front-end servers into a front-end availability series, and all back-ends into another series.

Moreover, once you get all the availability as a ratio, it's easy to map it to a binary availability if that's what you're looking for. The REST API will provide the data; then it's up to you to display what is most relevant. I think ratio datapoints are easy-to-use, yet complete, information.
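
As a rough illustration of the point above, the sketch below computes one ratio datapoint per group and then maps it to a binary state. The `servers` data, the `tier` tag name, and the helpers `ratio_by_tag` and `to_binary` are made up for the example; none of this is the actual Hawkular REST API.

```python
from collections import defaultdict

# Hypothetical snapshot of availability states for one time bucket.
servers = [
    {"id": "fe-1", "tier": "front-end", "state": "up"},
    {"id": "fe-2", "tier": "front-end", "state": "down"},
    {"id": "be-1", "tier": "back-end", "state": "up"},
    {"id": "be-2", "tier": "back-end", "state": "up"},
]


def ratio_by_tag(items, tag):
    """Return {tag value: fraction of items in the 'up' state}."""
    up = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        total[item[tag]] += 1
        if item["state"] == "up":
            up[item[tag]] += 1
    return {key: up[key] / total[key] for key in total}


def to_binary(ratio, mode="all"):
    """Map a ratio to 'up'/'down': 'all' needs every part up, 'any' needs at least one."""
    if mode == "all":
        return "up" if ratio == 1.0 else "down"
    if mode == "any":
        return "up" if ratio > 0.0 else "down"
    raise ValueError(f"unknown mode: {mode}")


ratios = ratio_by_tag(servers, "tier")
print(ratios)  # {'front-end': 0.5, 'back-end': 1.0}
print({tier: to_binary(r, "all") for tier, r in ratios.items()})
```

The ratio keeps the full information, and the binary view is derived from it on the client side only when needed.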

Joel

On Tue, Aug 30, 2016 at 5:16 PM, Michael Burman <miburman@redhat.com> wrote:
Hi,

So say I have 8 MySQLs: 4 primaries, 4 replicas. One primary is down and the replica of that set is down as well. I request Availability of my datastore and I get 75% UP. If I had two replicas down instead, I would also get 75% UP. There's a huge difference between these scenarios.

I'm not a fan of percents for that simple reason. Is my service up? Yes, it's 99% up, only all the front-end servers are down.. ugh.
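
To make the scenario concrete, here is a small sketch. It assumes the 8 instances form 4 primary/replica sets and that a set can still serve data as long as one of its two members is up; that failover rule and all names (`make_cluster`, `up_ratio`, `sets_serving`) are assumptions for illustration only.

```python
def make_cluster(down):
    """Build 4 primary/replica sets; names listed in `down` are marked down."""
    return {
        f"set{i}": {
            role: ("down" if f"set{i}-{role}" in down else "up")
            for role in ("primary", "replica")
        }
        for i in range(1, 5)
    }


def up_ratio(cluster):
    """Fraction of all instances that are up, ignoring which set they belong to."""
    states = [s for members in cluster.values() for s in members.values()]
    return states.count("up") / len(states)


def sets_serving(cluster):
    """Assumption: a set can serve data while at least one member is up."""
    return {name: "up" in members.values() for name, members in cluster.items()}


# Scenario A: a primary and its own replica are both down.
a = make_cluster({"set1-primary", "set1-replica"})
# Scenario B: two replicas from different sets are down.
b = make_cluster({"set1-replica", "set2-replica"})

print(up_ratio(a), all(sets_serving(a).values()))  # 0.75 False
print(up_ratio(b), all(sets_serving(b).values()))  # 0.75 True
```

The flat ratio is identical in both scenarios, while the per-set view shows that only scenario A loses a whole set.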

  -  Micke

----- Original Message -----
From: "John Sanda" <jsanda@redhat.com>
To: "Discussions around Hawkular development" <hawkular-dev@lists.jboss.org>
Sent: Tuesday, August 30, 2016 4:11:07 PM
Subject: Re: [Hawkular-dev] Availability metrics: aggregate stats series

I like the idea of aggregated availabilities, but I don’t know that it can easily be simplified to up/down. Let’s say we have 3 Cassandra nodes deployed with replication_factor = 1.  If one node goes down we are at 66% availability.

> On Aug 29, 2016, at 3:24 AM, Joel Takvorian <jtakvori@redhat.com> wrote:
>
> Hello all,
>
> I'm still aiming to add some features to the Grafana plugin. I've started to integrate availabilities, but now I'm facing a problem when it comes to showing aggregated availabilities; for example, think about an OpenShift pod that is scaled up to several instances.
>
> Since availability is basically "up" or "down" (or, to simplify with the other states such as "unknown", say it's either "up" or "non-up"), I propose to add this new feature: availability stats with aggregation. The call would be parameterized with an aggregation method, which would be either "all of" or "any of": with "all of" we consider that the aggregated series is UP when all its parts are UP; with "any of", it is UP when at least one of its parts is UP.
>
> It would require a new endpoint, since the AvailabilityHandler currently only exposes stats queries with the metric id as a query parameter, which is not suitable for multiple metrics.
>
> Any objection or remark for this feature?
>
> Joel
> _______________________________________________
> hawkular-dev mailing list
> hawkular-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hawkular-dev
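
For reference, the "all of" / "any of" aggregation proposed in the original message could look roughly like the sketch below. It assumes the series are already aligned on the same time buckets; the `series` data and the `aggregate` function are hypothetical and do not reflect any existing Hawkular endpoint.

```python
# Hypothetical input: one list of states per metric id, aligned on the same time buckets.
series = {
    "pod-1": ["up", "up", "down", "unknown"],
    "pod-2": ["up", "down", "down", "up"],
    "pod-3": ["up", "up", "down", "up"],
}


def aggregate(series_by_id, method):
    """Combine aligned availability series into one series of 'up'/'down'.

    method is 'all' (up only when every part is up) or 'any'
    (up when at least one part is up). Any state other than 'up',
    including 'unknown', counts as non-up.
    """
    if method not in ("all", "any"):
        raise ValueError(f"unknown aggregation method: {method}")
    combine = all if method == "all" else any
    result = []
    # zip(*...) walks the series bucket by bucket.
    for bucket in zip(*series_by_id.values()):
        result.append("up" if combine(state == "up" for state in bucket) else "down")
    return result


print(aggregate(series, "all"))  # ['up', 'down', 'down', 'down']
print(aggregate(series, "any"))  # ['up', 'up', 'down', 'up']
```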

