Hi,
with the new APNs HTTP/2 APIs and our use of Pushy, we get much more fine-grained knowledge of whether Apple accepted (for further processing) or rejected a message, on a per-device-token level!
For instance, if a push targets 5000 devices, we can now say that, for example, 5 tokens failed, while APNs was happy to accept the push request for the other 4995 devices. (Note: this does NOT mean the messages actually arrive at the devices, just that Apple accepted them for further processing.)
For APNs, this gives us much more flexibility in handling our metrics!
In our code, we read the APNs response for each token here: https://github.com/aerogear/aerogear-unifiedpush-server/blob/20831d96196663349c96da6b5fe11aef65cacf59/push/sender/src/main/java/org/jboss/aerogear/unifiedpush/message/sender/apns/PushyApnsSender.java#L130-L147
So there, we could simply send the result, on a per-token basis, to a (Kafka) topic, like:
...
if (pushNotificationResponse.isAccepted()) {
    logger.trace("Push notification for '{}' (payload={})", deviceToken, pushNotificationResponse.getPushNotification().getPayload());
    producer.send(jobID, "Success"); // sends to the "push_messages" topic
} else {
    final String rejectReason = pushNotificationResponse.getRejectionReason();
    logger.trace("Push Message has been rejected with reason: {}", rejectReason);
    producer.send(jobID, "Rejected"); // sends to the "push_messages" topic
    ...
}
Now, this sends everything to one topic, and somewhere else we could use the Kafka Streams API to process that source and calculate some stats on it, like:
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;
...
KStreamBuilder builder = new KStreamBuilder();
// read from the topic that contains all messages, for all jobs
final KStream<String, String> source = builder.stream("push_messages");
// some simple processing: apply a predicate, group by key (the job ID), count,
// and send the results to three "analytic" topics:
final KTable<String, Long> successCountsPerJob = source.filter((key, value) -> value.equals("Success"))
        .groupByKey()
        .count("successMessagesPerJob");
successCountsPerJob.to(Serdes.String(), Serdes.Long(), "successMessagesPerJob");
final KTable<String, Long> failCountsPerJob = source.filter((key, value) -> value.equals("Rejected"))
        .groupByKey()
        .count("failedMessagesPerJob");
failCountsPerJob.to(Serdes.String(), Serdes.Long(), "failedMessagesPerJob");
// no filter here: every message counts towards the total
source.groupByKey()
        .count("totalMessagesPerJob")
        .to(Serdes.String(), Serdes.Long(), "totalMessagesPerJob");
The above performs some functional processing of the single source of truth, deriving a different view (success, failure, total) for each output topic.
With a simple consumer on each of these three "analytic" topics, the logging output would look like:
2017-05-16 13:42:48,763 INFO successMessagesPerJob: 2 - jobID: XXX
2017-05-16 13:42:48,764 INFO totalMessagesPerJob: 3 - jobID: XXX
2017-05-16 13:42:48,764 INFO failedMessagesPerJob: 1 - jobID: XXX
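To make the counting concrete without needing a broker, here is a minimal plain-Java sketch of what the three aggregations compute per job ID. It is only a stand-in for the stream topology, not the Kafka Streams API: the class name, the countPerJob helper and the sample records are all made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the stream topology: no Kafka involved.
class JobStatsSketch {

    // Counts, per key (job ID), the records whose value matches the wanted outcome.
    // Passing null as the outcome counts every record, like the unfiltered stream.
    static Map<String, Long> countPerJob(List<String[]> records, String outcome) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String[] record : records) {
            String jobId = record[0];
            String value = record[1];
            if (outcome == null || outcome.equals(value)) {
                counts.merge(jobId, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample: three per-token results for a single job.
        List<String[]> records = Arrays.asList(
                new String[] {"XXX", "Success"},
                new String[] {"XXX", "Rejected"},
                new String[] {"XXX", "Success"});

        System.out.println("successMessagesPerJob: " + countPerJob(records, "Success"));
        System.out.println("failedMessagesPerJob: " + countPerJob(records, "Rejected"));
        System.out.println("totalMessagesPerJob: " + countPerJob(records, null));
    }
}
```

With that sample input the per-job numbers line up with the log output above (2 success, 1 failed, 3 total for job XXX). In the real pipeline the state would of course live in the Kafka Streams state stores rather than in a local map.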
Since we have two GSoC students working on Kafka and HBase improvements for UPS, I wanted to share this quick prototype as food for thought.
Of course, each of these 'filtered' consumers could then eventually store the result somewhere else.
With this approach, Kafka would become the hub (or data pipeline) for our metrics, with stream processing and different consumers to deal with the results of interest.
Any comments or other thoughts?
-Matthias