but it would make managing your workflow a lot easier. Each
stream processor consumes a stream and does one thing (e.g.,
aggregate the last 60 seconds of raw data, or aggregate the last
60 minutes of either the raw data or the 60-second windows,
etc.). The different outputs can still be written to whatever
storage you want; stream-oriented processing just changes *how*
you process the incoming data.
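
To make that concrete, here is a minimal sketch of one such
processor using the Kafka Streams DSL. The topic names
("metrics-raw", "metrics-60s") and the serdes are made up for
the example; the real ones would come from your pipeline:

import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class SixtySecondAggregator {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "metrics-60s-aggregator");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();

        // One processor, one job: count raw points per metric over
        // tumbling 60-second windows, then emit to a downstream topic.
        builder.stream("metrics-raw", Consumed.with(Serdes.String(), Serdes.Double()))
               .groupByKey()
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(60)))
               .count()
               .toStream()
               // Re-key on "metric@windowStart" so the window boundary
               // survives into the output topic.
               .map((win, count) -> KeyValue.pair(
                       win.key() + "@" + win.window().start(), count))
               .to("metrics-60s", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

A 60-minute processor would look exactly the same, just reading
"metrics-60s" (or the raw topic) with a Duration.ofMinutes(60)
window.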
An alternative approach is to use Apache Kafka directly.
There are pros and cons, but the benefit is that the services
that do the computations would be microservices (no, really:
just small, single-threaded processes that each do one thing)
that could be easily deployed across the cluster. If anyone is
interested in this approach, ping me and I’ll point you to a
prototype that does this (not for analytics).
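
For the plain-Kafka flavor, here is a rough sketch of what one
of those single-threaded services could look like (again, the
topic and group names are invented for illustration):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class RawMetricService {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // Every instance joins the same group, so Kafka spreads the
        // topic's partitions across however many copies you deploy.
        props.put("group.id", "raw-metric-service");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("metrics-raw"));
            while (true) {
                ConsumerRecords<String, String> records =
                    consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // The single thing this service does, e.g. update
                    // a running aggregate or forward to storage.
                    process(record.key(), record.value());
                }
            }
        }
    }

    static void process(String key, String value) {
        // placeholder for the actual computation
    }
}

"Deployed across the cluster" then just means starting more
copies of the process; the consumer group takes care of the
rebalancing.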
BTW, a stream-processing approach does not limit you to live
data. In fact, quite the opposite: many people use stream
processing to ingest large volumes of live data, but plenty of
others use it on “big data” as an alternative to batch
processing (often MapReduce).