[Hawkular-dev] Low-impact clients and never dropping any events
Thomas Heute
theute at redhat.com
Thu Feb 19 03:51:46 EST 2015
On 02/18/2015 04:17 PM, Randall Hauch wrote:
>
>> On Feb 18, 2015, at 6:50 AM, Heiko Braun <ike.braun at googlemail.com
>> <mailto:ike.braun at googlemail.com>> wrote:
>>
>>
>>> On 18 Feb 2015, at 13:43, John Sanda <jsanda at redhat.com
>>> <mailto:jsanda at redhat.com>> wrote:
>>>
>>> I think that Spark's streaming API, particularly the window
>>> operations, could be an effective way to do computations in real
>>> time as data is ingested
>>
>> +1
>>
>> not only for processing the streams, but also for any kind of post
>> processing needed. Plus it would supply the abstractions to run
>> computations across a large number of nodes.
>>
>
> Exactly. Using Spark Streaming or even Storm would increase the
> installation and operational complexity
We have plenty of experience with users being turned away by
installation and operational complexity, so whatever follows the "but"
(even though it sounds interesting), we would need a proper way to
remove or reduce that complexity.
It needs to be easy to install in small environments and able to scale
when needed/wanted. Scaling by adding homogeneous nodes would help.
I have no experience with Spark/Storm; what is the burden in terms of
installation and operational complexity?
Thomas
> but it would give you a lot of really easy management of your
> workflow. Each of the stream processors consume a stream and do one
> thing (e.g., aggregate the last 60 seconds of raw data, or aggregate
> the last 60 minutes of either the raw data or the 60-second windows,
> etc.). The different outputs can still get written to whatever storage
> you want; stream-oriented processing just changes *how* you process
> the incoming data.
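The windowed aggregation described above can be sketched in plain Python, without any Spark dependency. This is only an illustration of the concept; the `aggregate_windows` function, the `(timestamp, value)` tuple shape, and the tumbling-window choice are assumptions for the example, not Hawkular or Spark API:

```python
from collections import defaultdict

def aggregate_windows(points, window_seconds):
    """Group (unix_timestamp, value) pairs into fixed tumbling windows
    and average each window.

    Returns {window_start_timestamp: average_value}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in points:
        # Align each point to the start of its window, e.g. a 60 s window.
        window_start = ts - (ts % window_seconds)
        sums[window_start] += value
        counts[window_start] += 1
    return {w: sums[w] / counts[w] for w in sums}

# Raw data aggregated into 60-second windows; those window averages could
# in turn feed a 60-minute aggregation, as the thread describes.
raw = [(0, 10.0), (30, 20.0), (61, 40.0)]
print(aggregate_windows(raw, 60))  # {0: 15.0, 60: 40.0}
```

The same function applied to the 60-second outputs with `window_seconds=3600` would produce the coarser rollup, which is the chaining idea in the paragraph above.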
>
> An alternative approach is to use Apache Kafka directly. There are
> pros and cons, but the benefit is that the services that do the
> computations would be microservices (no, really - just really small,
> single-threaded processes that do a single thing) that could be easily
> deployed across the cluster. If anyone is interested in this approach,
> ping me and I’ll point you to a prototype that does this (not for
> analytics).
>
> BTW, a stream-processing approach does not limit you to live data. In
> fact, quite the opposite. Many people use stream processing for
> ingesting large volumes of live data, but lots of other people use it
> in “big data” as an alternative to batch processing (often map-reduce).
>
>
>
> _______________________________________________
> hawkular-dev mailing list
> hawkular-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hawkular-dev