One of the contentions I have with PR-568 is that it introduces more
failure points for data before it reaches Alerts. Publishing all data
directly is a simple proposition: data comes in, is persisted to Cassandra,
and at the same time is sent via JMS. The PR introduces multiple additional
failure points, and a failure inside Metrics will go unnoticed. For example,
what if the filtering mechanism suddenly crashes, what then? What if the
filtered data does not match what Alerts expects; say Alerts requested data
for a metric id, but Metrics lost track of that request and stops reporting
data for that metric id.
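
To make the contrast concrete, here is a minimal sketch of the
publish-everything path (invented names, plain JMS 2.0 and Java 8 APIs;
an illustration, not the actual Metrics code):

    import javax.jms.JMSContext;
    import javax.jms.Topic;

    // Stand-ins for the real Metrics types.
    class DataPoint implements java.io.Serializable {
        String metricId;
        long timestamp;
        double value;
    }

    interface MetricsService {
        void persist(DataPoint dataPoint); // writes to Cassandra
    }

    public class PublishEverythingIngester {

        private final MetricsService metricsService;
        private final JMSContext jmsContext;
        private final Topic metricDataTopic;

        public PublishEverythingIngester(MetricsService metricsService,
                JMSContext jmsContext, Topic metricDataTopic) {
            this.metricsService = metricsService;
            this.jmsContext = jmsContext;
            this.metricDataTopic = metricDataTopic;
        }

        public void ingest(DataPoint dataPoint) {
            // Two independent writes and nothing else in the path; the only
            // failure points are the Cassandra write and the JMS send.
            metricsService.persist(dataPoint);
            jmsContext.createProducer().send(metricDataTopic, dataPoint);
        }
    }

Every consumer on the topic sees every data point; there is no filtering
layer whose crash or stale subscription state could silently drop data.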
Going back to Randall's replies: for PR-568 to be a viable alternative to
what is done today, we would need to design a lot of additional features
just to reach the level of delivery confidence and guarantees that we
already have today (without the PR).
Thank you,
Stefan Negrea
On Wed, Aug 17, 2016 at 4:45 PM, Jay Shaughnessy <jshaughn(a)redhat.com> wrote:
+1. Although Randall is right that there is definitely a chance of
inconsistency between what is persisted and what is processed by
alerting, I think it's acceptable for our purposes. In general, users
have historically accepted that server downtime can result in missed
alerts. Moreover, almost all of the alerting scenarios involve behavior
over time.
On 8/17/2016 5:44 AM, Michael Burman wrote:
> Hi,
>
> Storing to Cassandra and publishing to JMS is not atomic, since Cassandra
> does not provide transactions, and especially not 2PC. They are two
> different writes and can always result in inconsistency, no matter which
> secondary transport protocol we use. Also, is Alerts even capable of
> handling all the possible crash scenarios? And do we even care about such
> a small window of potential data loss to the alerting engine in the case
> of a crash (which will take down both metrics & alerts on that node)?
> We don't provide strict consistency with the default metrics settings
> either, defaulting to a single-node acknowledgement in Cassandra. There
> are multiple theoretical scenarios in a multi-node setup where we could
> lose data or end up with inconsistencies.
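>
> By "one node acknowledges" I mean consistency level ONE, roughly like
> this with the DataStax Java driver (a sketch, not our actual
> configuration code):
>
>     import com.datastax.driver.core.Cluster;
>     import com.datastax.driver.core.ConsistencyLevel;
>     import com.datastax.driver.core.QueryOptions;
>
>     public class DefaultConsistency {
>         public static Cluster connect(String host) {
>             // A write succeeds once a single replica acknowledges it, so
>             // other replicas can lag behind or miss a write on failure.
>             return Cluster.builder()
>                     .addContactPoint(host)
>                     .withQueryOptions(new QueryOptions()
>                             .setConsistencyLevel(ConsistencyLevel.ONE))
>                     .build();
>         }
>     }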
>
> I think these are acceptable for our use case, however. Even if we lost
> one "node down" data point, that same situation would probably persist
> for the next data point -> the alert still triggers; and if you lose one
> metric data point from a bucket, the calculated averages, percentiles,
> etc. suffer only a minor loss of precision. Not to mention that almost
> everything in monitoring is already discrete information sampled at a
> certain point in time rather than a continuous real-valued signal, so
> precision is lost before the data even arrives to us.
>
> For those reasons I'd say these "problems" are more academic, without
> any real-world implications in this domain.
>
> - Micke
>
> ----- Original Message -----
> From: "Randall Hauch" <rhauch(a)redhat.com>
> To: "Discussions around Hawkular development" <
hawkular-dev(a)lists.jboss.org>
> Sent: Tuesday, August 16, 2016 7:42:56 PM
> Subject: Re: [Hawkular-dev] metrics on the bus
>
> I agree that the distributed system is probably more fault tolerant when
> using JMS than putting everything into a single app and forgoing JMS.
>
> BTW, does metrics write data to Cassandra and publish to JMS atomically?
> If not, that’s also a window for failure that might result in data loss.
> Something to consider if Hawkular requires complete consistency and can’t
> afford data loss.
>
>> On Aug 16, 2016, at 11:08 AM, John Sanda <jsanda(a)redhat.com> wrote:
>>
>> With the JMS solution we have in place right now, data points are
>> published after they have been persisted in Cassandra. We can certainly
>> keep that same behavior.
>>
>>> On Aug 16, 2016, at 11:49 AM, Randall Hauch <rhauch(a)redhat.com> wrote:
>>>
>>> Sorry, I’ve been lurking. One thing to consider is how each approach
>>> handles failures. For example, what happens if the system crashes after
>>> a data point is processed by metrics but before alerts picks it up?
>>> Will the system become inconsistent, or will some events be lost before
>>> alerts sees them?
>>>
>>> Really, in order for the system to be completely fault tolerant, each
>>> component has to be completely atomic. Components that use “dual writes”
>>> (e.g., write to one system, then write to another outside of a larger
>>> transaction) will always be subject to losing data/events during a very
>>> inopportune failure. Not only that, a system comprised of multiple
>>> components that individually are safe might still be subject to losing
>>> data/events.
>>>
>>> I hope this is helpful.
>>>
>>> Randall
>>>
>>>> On Aug 16, 2016, at 10:25 AM, John Sanda <jsanda(a)redhat.com> wrote:
>>>>
>>>> I considered clustering before making the suggestion.
>>>> MetricDataListener listens on a JMS topic for data points. When it
>>>> receives data points, it passes them to AlertsEngine, which in turn
>>>> writes them into a distributed ISPN cache. Those data points then get
>>>> processed via a cache entry listener in AlertsEngineImpl. If I
>>>> understand this data flow correctly, then I think it will work just as
>>>> well, if not better, in a single WAR. Rather than getting notifications
>>>> from a JMS topic, MetricDataListener can receive notifications from an
>>>> Observable that pushes data points as they are received in client
>>>> requests. Metrics will also subscribe to that same Observable so that
>>>> it can persist the data points. The fact that alerts is using a
>>>> distributed cache works to our advantage here because it provides a
>>>> mechanism for distributing data across nodes.
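>>>>
>>>> Roughly, the alerts side of that flow might look like this (a sketch
>>>> with invented names, assuming Infinispan's cache listener API;
>>>> AlertsEngineImpl has the real implementation):
>>>>
>>>>     import org.infinispan.notifications.Listener;
>>>>     import org.infinispan.notifications.cachelistener.annotation.CacheEntryCreated;
>>>>     import org.infinispan.notifications.cachelistener.event.CacheEntryCreatedEvent;
>>>>
>>>>     // DataPoint stands in for the actual metrics data point type.
>>>>     @Listener
>>>>     public class DataPointCacheListener {
>>>>
>>>>         // Putting a data point into the distributed cache spreads it
>>>>         // across nodes; this listener fires once the entry is created.
>>>>         @CacheEntryCreated
>>>>         public void onDataPoint(
>>>>                 CacheEntryCreatedEvent<String, DataPoint> event) {
>>>>             if (!event.isPre()) {
>>>>                 process(event.getValue());
>>>>             }
>>>>         }
>>>>
>>>>         private void process(DataPoint dataPoint) {
>>>>             // hand the data point off to trigger evaluation
>>>>         }
>>>>     }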
>>>>
>>>>> On Aug 16, 2016, at 3:29 AM, Lucas Ponce <lponce(a)redhat.com> wrote:
>>>>>
>>>>> This is a big point.
>>>>>
>>>>> I can see pros and cons to it.
>>>>>
>>>>> The first thing that comes to mind is that metrics is stateless by
>>>>> nature while alerts is stateful.
>>>>>
>>>>> So a first coupling would work for a single node, but our troubles can
>>>>> start when we want to scale, as the design in clustered scenarios is
>>>>> completely different and a single .war won't help, IMO.
>>>>>
>>>>> I don't think our current design is bad; in the context of
>>>>> HAWKULAR-1102, and working on an on-demand publishing draft, we are
>>>>> addressing the business issues that triggered this discussion.
>>>>>
>>>>> But I would like to hold this topic for a future architecture
>>>>> face-to-face meeting, to discuss it from all angles as we did in
>>>>> Madrid.
>>>>>
>>>>> (Counting on having a face-to-face meeting in a reasonable timeframe,
>>>>> of course.)
>>>>>
>>>>> Lucas
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: "John Sanda" <jsanda(a)redhat.com>
>>>>>> To: "Discussions around Hawkular development" <hawkular-dev(a)lists.jboss.org>
>>>>>> Sent: Monday, August 15, 2016 16:45:28
>>>>>> Subject: Re: [Hawkular-dev] metrics on the bus
>>>>>>
>>>>>> We use JMS in large part because metrics and alerts are in separate
>>>>>> WARs (I realize JMS is used for other purposes, but I am speaking
>>>>>> strictly about this scenario). Why not deploy metrics and alerts in
>>>>>> the same WAR and bypass JMS altogether? As data points are ingested,
>>>>>> we broadcast them using an Rx subject to which both metrics and
>>>>>> alerts subscribe. We could do this in a way that still keeps metrics
>>>>>> and alerts decoupled as they are today. We would also have the added
>>>>>> benefit of having a stand-alone deployment for metrics and alerts.
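>>>>>>
>>>>>> A minimal sketch of what I have in mind, assuming RxJava 1.x
>>>>>> (DataPoint, MetricsService, and AlertsEngine are stand-ins for the
>>>>>> real types):
>>>>>>
>>>>>>     import rx.subjects.PublishSubject;
>>>>>>     import rx.subjects.Subject;
>>>>>>
>>>>>>     public class DataPointBroadcaster {
>>>>>>
>>>>>>         // One hot stream, two subscribers, no broker in between.
>>>>>>         private final Subject<DataPoint, DataPoint> dataPoints =
>>>>>>                 PublishSubject.<DataPoint>create().toSerialized();
>>>>>>
>>>>>>         public DataPointBroadcaster(MetricsService metrics,
>>>>>>                 AlertsEngine alerts) {
>>>>>>             dataPoints.subscribe(metrics::persist); // to Cassandra
>>>>>>             dataPoints.subscribe(alerts::sendData); // to triggers
>>>>>>         }
>>>>>>
>>>>>>         // Called from the REST layer as client requests arrive.
>>>>>>         public void onDataPoint(DataPoint dataPoint) {
>>>>>>             dataPoints.onNext(dataPoint);
>>>>>>         }
>>>>>>     }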
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Aug 10, 2016, at 9:37 AM, Jay Shaughnessy <jshaughn(a)redhat.com> wrote:
>>>>>>
>>>>>>
>>>>>> Yes, in fact I should have made it more clear that this whole
>>>>>> discussion is bounded by H Metrics and H Alerting in the H Services
>>>>>> context, so limiting this to HS/Bus integration code is what we'd
>>>>>> want to do.
>>>>>>
>>>>>> On 8/10/2016 4:06 AM, Heiko W.Rupp wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Someone remind me please.
>>>>>>
>>>>>> That bus-sender in hawkular-metrics is not an internal detail of
>>>>>> metrics, but rather a sort of 'external add-on'?
>>>>>>
>>>>>> If so, the logic to filter (or create many subscriptions) could go
>>>>>> into it and would not touch the core metrics. Metrics would (as it
>>>>>> does today) forward all new data points into this sender and the
>>>>>> sender can then decide how to proceed.
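>>>>>>
>>>>>> A rough sketch of what such a sender could look like (invented
>>>>>> names; DataPoint stands in for the metrics data point type):
>>>>>>
>>>>>>     import java.util.Set;
>>>>>>     import java.util.concurrent.ConcurrentHashMap;
>>>>>>     import javax.jms.JMSContext;
>>>>>>     import javax.jms.Topic;
>>>>>>
>>>>>>     public class FilteringBusSender {
>>>>>>
>>>>>>         // Metric ids that alerting asked for; updated as
>>>>>>         // subscriptions come and go.
>>>>>>         private final Set<String> subscribedIds =
>>>>>>                 ConcurrentHashMap.newKeySet();
>>>>>>
>>>>>>         private final JMSContext jmsContext;
>>>>>>         private final Topic metricDataTopic;
>>>>>>
>>>>>>         public FilteringBusSender(JMSContext jmsContext,
>>>>>>                 Topic metricDataTopic) {
>>>>>>             this.jmsContext = jmsContext;
>>>>>>             this.metricDataTopic = metricDataTopic;
>>>>>>         }
>>>>>>
>>>>>>         public void subscribe(String metricId) {
>>>>>>             subscribedIds.add(metricId);
>>>>>>         }
>>>>>>
>>>>>>         public void unsubscribe(String metricId) {
>>>>>>             subscribedIds.remove(metricId);
>>>>>>         }
>>>>>>
>>>>>>         // Metrics forwards every new data point here, as it does
>>>>>>         // today; the sender decides what actually goes on the bus.
>>>>>>         public void onDataPoint(DataPoint dataPoint) {
>>>>>>             if (subscribedIds.contains(dataPoint.metricId)) {
>>>>>>                 jmsContext.createProducer()
>>>>>>                         .send(metricDataTopic, dataPoint);
>>>>>>             }
>>>>>>         }
>>>>>>     }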
>>>>>>
_______________________________________________
hawkular-dev mailing list
hawkular-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hawkular-dev