[aerogear-dev] Simplify the metrics for sanity

Summers Pittman supittma at redhat.com
Tue May 30 08:06:21 EDT 2017


On Wed, May 24, 2017 at 4:30 AM, Matthias Wessendorf <matzew at apache.org>
wrote:

> Hi,
>
> our current metrics processing has a problem. It's complicated (lots of
> CDI events and two different JMS messaging approaches...) and slow
> (JPQL/JDBC), and it consumes a lot of memory and processing time. This
> leads to bugs (incorrect stats) and eventually to downtime caused by the
> heavy processing.
>
> I'd like to dramatically simplify our metrics processing... to something
> like:
> Success -> could connect to 3rd party, to deliver tokens
> Failure -> something went wrong when talking to 3rd party service.
>
>
> Right now we do have metrics on push delivery:
> Pending -> the submission to the 3rd party provider is in flight
> Success -> we were able to connect, and could deliver *something*
> Failure -> something obvious, like invalid certificate (APNs), no
> connection to 3rd party possible, etc
>
> Besides that, we also count the targeted devices. I don't think that has
> much value. For instance, if APNs rejects some tokens, we do not track
> those; we just show how many tokens our DB found, nothing more. We don't
> show anything of real interest. We could improve this (see below), but I
> doubt that the current implementation can handle it well.
>
> Also, on Android/FCM the numbers are even worse. We do, internally,
> leverage their topics, so we usually end up sending exactly one push to
> FCM, regardless of how many Android device-tokens we have in the DB. The
> counter says 1 (one), because the server did target one topic (not n
> devices).
>
> So, for now, I'd like to dramatically simplify the code, and go with the
> above Success/Failure solution.
>
> However, I honestly think in the long run, we should get something
> pluggable, that allows us to process the metrics independently, outside of
> the UPS code base. I think my previous Kafka mail is addressing this
> partially: The actual response and details about the push job should be
> logged to some Kafka system, and an independent process should be able to
> process those.
>
> This will give us much more freedom and flexibility. Perhaps, in the
> future, we will also want different stats, and something like
> Prometheus/Grafana:
> https://prometheus.io/docs/visualization/grafana/
>
> A more flexible system, with independent metrics 'calculation' processing
> will help us here.
>
> Any thoughts?
>
>
What if we remove the current metrics UI and replace it with webhooks
that emit events?  That lets us add events easily, somewhat simplifies
debugging, and gives integrators a lot more control over, and hooks into,
our process.  We could even turn the current metrics into a microservice
project as an example.  (Doubly so once we get Keycloak broken out and
properly integrated.)
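
As a minimal sketch of the webhook idea, assuming a simple Success/Failure
event per push job: every name below (PushEvent, DeliveryOutcome,
WebhookEmitter, the JSON field names) is hypothetical and does not exist in
the UPS code base.  The transport is passed in as a function so the sketch
stays independent of any particular HTTP client; a real integration would
POST the JSON body to each subscriber URL.

```java
import java.net.URI;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// Hypothetical outcome type mirroring the proposed Success/Failure metrics.
enum DeliveryOutcome { SUCCESS, FAILURE }

// Hypothetical event describing the result of handing one push job to a
// 3rd party provider (APNs, FCM, ...).
final class PushEvent {
    final String pushJobId;
    final String variant;
    final DeliveryOutcome outcome;
    final String detail;

    PushEvent(String pushJobId, String variant, DeliveryOutcome outcome, String detail) {
        this.pushJobId = pushJobId;
        this.variant = variant;
        this.outcome = outcome;
        this.detail = detail;
    }

    // Minimal JSON rendering; only backslashes and double quotes are escaped.
    String toJson() {
        return "{\"pushJobId\":\"" + escape(pushJobId) + "\","
             + "\"variant\":\"" + escape(variant) + "\","
             + "\"outcome\":\"" + outcome.name() + "\","
             + "\"detail\":\"" + escape(detail) + "\"}";
    }

    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}

// Hypothetical emitter: fans each event out to every subscribed webhook URL.
final class WebhookEmitter {
    private final List<URI> subscribers = new CopyOnWriteArrayList<>();
    private final BiConsumer<URI, String> transport;

    // The transport receives (endpoint, JSON body); it is an assumption of
    // this sketch that it throws a RuntimeException when delivery fails.
    WebhookEmitter(BiConsumer<URI, String> transport) {
        this.transport = transport;
    }

    void subscribe(URI endpoint) {
        subscribers.add(endpoint);
    }

    // Returns how many subscribers accepted the event. A failing subscriber
    // is skipped so that it cannot block delivery to the others.
    int emit(PushEvent event) {
        String body = event.toJson();
        int delivered = 0;
        for (URI endpoint : subscribers) {
            try {
                transport.accept(endpoint, body);
                delivered++;
            } catch (RuntimeException e) {
                // Ignored in this sketch; a real system would log or retry.
            }
        }
        return delivered;
    }
}
```

With something like this, the existing metrics could become just another
subscriber that counts SUCCESS and FAILURE events outside of UPS.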


> -Matthias
>
>
>
> --
> Matthias Wessendorf
>
> blog: http://matthiaswessendorf.wordpress.com/
> twitter: http://twitter.com/mwessendorfa
>