[aerogear-dev] Simplify the metrics for sanity

Matthias Wessendorf matzew at apache.org
Wed May 31 10:55:16 EDT 2017


+9001

On Wed, May 31, 2017 at 11:46 AM, Summers Pittman <supittma at redhat.com>
wrote:

>
>
> On May 30, 2017 3:23 PM, "Matthias Wessendorf" <matzew at apache.org> wrote:
>
>
>
> On Tue, May 30, 2017 at 2:06 PM, Summers Pittman <supittma at redhat.com>
> wrote:
>
>>
>>
>> On Wed, May 24, 2017 at 4:30 AM, Matthias Wessendorf <matzew at apache.org>
>> wrote:
>>
>>> Hi,
>>>
>>> we have a problem with our current metrics processing. It's complicated
>>> (lots of CDI events and two different JMS messaging approaches...), slow
>>> (JPQL/JDBC), and it consumes a lot of memory and processing time. This
>>> leads to bugs (incorrect stats) and eventually causes downtime due to
>>> heavy processing.
>>>
>>> I'd like to dramatically simplify our metrics processing... to something
>>> like:
>>> Success -> could connect to 3rd party, to deliver tokens
>>> Failure -> something went wrong when talking to 3rd party service.
>>>
>>>
>>> Right now we do have metrics on push delivery:
>>> Pending -> the submission to the 3rd party provider is in flight
>>> Success -> we were able to connect, and could deliver *something*
>>> Failure -> something obvious, like invalid certificate (APNs), no
>>> connection to 3rd party possible, etc
>>>
>>> Besides that, we also count the targeted devices. I don't think that adds
>>> much value. For instance, if APNs rejects some tokens, we do not track
>>> those; we just show how many tokens our DB found, nothing more. We don't
>>> show anything of real interest. We could improve this (see below), but I
>>> doubt that the current implementation is able to handle this well.
>>>
>>> Also, on Android/FCM the numbers are even worse. We do, internally,
>>> leverage their topics, so we usually end up sending exactly one push to
>>> FCM, regardless of how many Android device-tokens we have in the DB. The
>>> counter says 1 (one), because the server did target one topic (not n
>>> devices).
>>>
>>> So, for now, I'd like to dramatically simplify the code, and go with the
>>> above Success/Failure solution.
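
A minimal sketch of what the reduced Success/Failure bookkeeping could look
like. This is not the UPS implementation; the class name SimplePushMetrics,
the Outcome enum, and the method names are all hypothetical and only
illustrate keeping two counters instead of Pending/Success/Failure states
plus per-device counts.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch: two counters, one per outcome of talking to a 3rd party provider. */
public class SimplePushMetrics {

    /** The only two outcomes proposed in the mail. */
    public enum Outcome { SUCCESS, FAILURE }

    // LongAdder keeps increments cheap when many sender threads report concurrently.
    private final LongAdder success = new LongAdder();
    private final LongAdder failure = new LongAdder();

    /** Record one delivery attempt to a 3rd party provider (APNs, FCM, ...). */
    public void record(Outcome outcome) {
        if (outcome == Outcome.SUCCESS) {
            success.increment();
        } else {
            failure.increment();
        }
    }

    public long successCount() { return success.sum(); }

    public long failureCount() { return failure.sum(); }

    public static void main(String[] args) {
        SimplePushMetrics metrics = new SimplePushMetrics();
        metrics.record(Outcome.SUCCESS); // could connect and hand over the tokens
        metrics.record(Outcome.FAILURE); // e.g. invalid certificate, no connection
        System.out.println("success=" + metrics.successCount()
                + " failure=" + metrics.failureCount());
    }
}
```

The counters say nothing about how many devices were reached, which matches
the point above that the current device count is not meaningful anyway.
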
>>>
>>> However, I honestly think in the long run, we should get something
>>> pluggable, that allows us to process the metrics independently, outside of
>>> the UPS code base. I think my previous Kafka mail is addressing this
>>> partially: The actual response and details about the push job should be
>>> logged to some Kafka system, and an independent process should be able to
>>> process those.
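
A rough sketch of the "log the push job details, process them elsewhere"
idea. To stay self-contained it uses an in-memory BlockingQueue as a
stand-in for a Kafka topic, so no Kafka client API is shown here. The names
MetricsPipelineSketch, PushJobEvent, publish and countFailuresByProvider
are hypothetical, as are the event fields.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: the sender only logs events; a separate consumer computes stats. */
public class MetricsPipelineSketch {

    /** Assumed shape of one logged response from a 3rd party provider. */
    public static final class PushJobEvent {
        final String provider;   // e.g. "APNs" or "FCM"
        final boolean delivered; // true if the provider accepted the request
        final String details;    // free-form response details

        public PushJobEvent(String provider, boolean delivered, String details) {
            this.provider = provider;
            this.delivered = delivered;
            this.details = details;
        }
    }

    // Stand-in for a Kafka topic (assumption): append-only from the sender's view.
    private final BlockingQueue<PushJobEvent> topic = new LinkedBlockingQueue<>();

    /** Sender side: log the event and move on, no aggregation in the push path. */
    public void publish(PushJobEvent event) {
        topic.add(event);
    }

    /** Consumer side: drain the events logged so far and count failures per provider. */
    public Map<String, Integer> countFailuresByProvider() {
        List<PushJobEvent> batch = new ArrayList<>();
        topic.drainTo(batch);
        Map<String, Integer> failures = new HashMap<>();
        for (PushJobEvent event : batch) {
            if (!event.delivered) {
                failures.merge(event.provider, 1, Integer::sum);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        MetricsPipelineSketch pipeline = new MetricsPipelineSketch();
        pipeline.publish(new PushJobEvent("APNs", false, "invalid certificate"));
        pipeline.publish(new PushJobEvent("FCM", true, "topic accepted"));
        System.out.println(pipeline.countFailuresByProvider());
    }
}
```

Because the sender only appends events, a different statistic later means
writing a new consumer rather than touching the push delivery code.
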
>>>
>>> This will give us much more freedom and flexibility. Perhaps, in the
>>> future, we will also want some different stats, and something like
>>> Prometheus/Grafana:
>>> https://prometheus.io/docs/visualization/grafana/
>>>
>>> A more flexible system, with independent metrics 'calculation'
>>> processing will help us here.
>>>
>>> Any thoughts?
>>>
>>>
>> What if we remove the current metrics UI
>>
>
> For sanity, we are also simplifying the UI:
> https://issues.jboss.org/browse/AGPUSH-2090
>
>
>> and replace them with webhooks that emit events?
>>
>
> In the long run, I am open to anything else. I think I mainly care about
> the actual push delivery and the events that we will be submitting to a
> centralized data hub/pipeline, such as Kafka.
>
> From there, a consumer process (written in whatever language) can offer
> webhooks, etc.
>
>
>>   It lets us add events easily, somewhat simplifies debugging, and gives
>> integrators a lot more control and hooks into our process.  We can even
>> turn the current metrics into a microservice project as an example.
>>  (Doubly so when we get Keycloak broken out and properly integrated)
>>
>
> The overall idea is to break the server into a more modular system:
> * push-sender.war
> * metrics-processor.war (or jar)
> * device-registration.war
> * UI process
>
> I think a decoupled Keycloak would also be key to this, or what do you
> mean?
>
> What I meant was it is easier to secure services with a decoupled
> keycloak.
>
>
>
>
>>
>>
>>> -Matthias
>>>
>>>
>>>
>>> --
>>> Matthias Wessendorf
>>>
>>> blog: http://matthiaswessendorf.wordpress.com/
>>> twitter: http://twitter.com/mwessendorf
>>>
>>> _______________________________________________
>>> aerogear-dev mailing list
>>> aerogear-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/aerogear-dev
>>>
>>
>



-- 
Matthias Wessendorf

blog: http://matthiaswessendorf.wordpress.com/
twitter: http://twitter.com/mwessendorf