[aerogear-dev] Notification Delivery metrics and processing with Kafka

Matthias Wessendorf matzew at apache.org
Tue May 16 07:58:25 EDT 2017


Hi,

with the new APNs HTTP/2 API, and our usage of Pushy, we are able to get
much more fine-grained knowledge of whether Apple accepted (for further
processing) or rejected a message, on a per-device_token level!

For instance, if we have a push with 5000 targeted devices, we are now able
to say that, for instance, 5 tokens failed, while APNs was happy to accept
the push request for the other 4995 devices. (Note: this does NOT mean the
notifications actually arrived at the devices, just that Apple accepted them
for further processing.)

Now this, for APNs, gives us much more flexibility in handling our metrics!

In our code, we read the APNs response for *each* token here:
https://github.com/aerogear/aerogear-unifiedpush-server/blob/20831d96196663349c96da6b5fe11aef65cacf59/push/sender/src/main/java/org/jboss/aerogear/unifiedpush/message/sender/apns/PushyApnsSender.java#L130-L147

So here we could simply send the result, on a per-token basis, to a (Kafka)
topic, like:

...
if (pushNotificationResponse.isAccepted()) {
  logger.trace("Push notification for '{}' (payload={})", deviceToken,
      pushNotificationResponse.getPushNotification().getPayload());

  producer.send(jobID, "Success"); // sends to the "push_messages" topic
} else {
  final String rejectReason = pushNotificationResponse.getRejectionReason();
  logger.trace("Push Message has been rejected with reason: {}", rejectReason);

  producer.send(jobID, "Rejected"); // sends to the "push_messages" topic
  ...
}
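
As an aside, the producer.send(jobID, value) call above is just shorthand
for a hypothetical wrapper; a minimal sketch of what could sit behind it,
assuming a plain KafkaProducer with String serializers and the jobID as the
record key (the broker address is of course just an example):

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

final Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // example broker address
props.put("key.serializer", StringSerializer.class.getName());
props.put("value.serializer", StringSerializer.class.getName());

final Producer<String, String> producer = new KafkaProducer<>(props);

// keyed by jobID, so all results of one push job land in the same partition
producer.send(new ProducerRecord<>("push_messages", jobID, "Success"));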

Now, this sends everything to one topic, and we could use, somewhere, the
Kafka Streams API to perform some processing of the source and calculate
some stats on it, like:

KStreamBuilder builder = new KStreamBuilder();

// read from the topic that contains all messages, for all jobs
final KStream<String, String> source = builder.stream("push_messages");


// some simple processing: group by key, apply a predicate,
// and send to three "analytic" topics:

final KTable<String, Long> successCountsPerJob = source
  .filter((key, value) -> value.equals("Success"))
  .groupByKey()
  .count("successMessagesPerJob");
successCountsPerJob.to(Serdes.String(), Serdes.Long(), "successMessagesPerJob");

final KTable<String, Long> failCountsPerJob = source
  .filter((key, value) -> value.equals("Rejected"))
  .groupByKey()
  .count("failedMessagesPerJob");
failCountsPerJob.to(Serdes.String(), Serdes.Long(), "failedMessagesPerJob");

source.groupByKey()
  .count("totalMessagesPerJob")
  .to(Serdes.String(), Serdes.Long(), "totalMessagesPerJob");
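
To actually run this topology, something along these lines would be needed;
a minimal sketch, assuming the 0.10.x-style StreamsConfig setup (application
id and broker address are just examples):

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;

final Properties streamsProps = new Properties();
streamsProps.put(StreamsConfig.APPLICATION_ID_CONFIG, "ups-push-metrics"); // example app id
streamsProps.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // example broker
streamsProps.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
streamsProps.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

// "builder" is the KStreamBuilder from the snippet above
final KafkaStreams streams = new KafkaStreams(builder, streamsProps);
streams.start();

// close the streams instance on JVM shutdown
Runtime.getRuntime().addShutdownHook(new Thread(streams::close));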


The above performs some functional processing of the single source of
truth, applying different predicates to it.

If one had a simple consumer on each of these three "analytic" topics, a
simple logging output would look like:

2017-05-16 13:42:48,763 INFO  successMessagesPerJob: 2 - jobID: XXX
2017-05-16 13:42:48,764 INFO  totalMessagesPerJob: 3 - jobID: XXX
2017-05-16 13:42:48,764 INFO  failedMessagesPerJob: 1 - jobID: XXX
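
Such a consumer could be a plain KafkaConsumer reading the String keys and
Long values produced by the counts above; a minimal sketch for one of the
topics (group id and broker address are just examples):

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.LongDeserializer;
import org.apache.kafka.common.serialization.StringDeserializer;

final Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // example broker address
props.put("group.id", "metrics-logger");          // example group id
props.put("key.deserializer", StringDeserializer.class.getName());
props.put("value.deserializer", LongDeserializer.class.getName());

final KafkaConsumer<String, Long> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("successMessagesPerJob"));

while (true) {
  // each record carries the jobID as key and the current count as value
  for (final ConsumerRecord<String, Long> record : consumer.poll(1000)) {
    System.out.printf("successMessagesPerJob: %d - jobID: %s%n",
        record.value(), record.key());
  }
}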


Since for GSoC we have two students working on Kafka and HBase
improvements for UPS, I wanted to share this quick prototype as food for
thought.

Of course, each of these 'filtered' consumers could then eventually store
the result somewhere else.

With this approach, Kafka would become the hub (or data pipeline) for our
metrics, with stream processing and different consumers to deal with the
results of interest.

Any comments or other thoughts?

-Matthias

-- 
Matthias Wessendorf

blog: http://matthiaswessendorf.wordpress.com/
twitter: http://twitter.com/mwessendorf