On 02/18/2015 04:17 PM, Randall Hauch wrote:
> On Feb 18, 2015, at 6:50 AM, Heiko Braun <ike.braun(a)googlemail.com>
> wrote:
>
>
>> On 18 Feb 2015, at 13:43, John Sanda <jsanda(a)redhat.com> wrote:
>>
>> I think that Spark's streaming API, particularly the window
>> operations, could be an effective way to do computations in real
>> time as data is ingested
>
> +1
>
> not only for processing the streams, but also for any kind of
> post-processing needed. Plus, it would supply the abstractions to run
> computations across a large number of nodes.
>
> Exactly. Using Spark Streaming or even Storm would increase the
> installation and operational complexity
We have seen users being turned off by installation and operational
complexity, so regardless of the "but" part (even though that sounds
interesting), we would need to find a proper way to remove or reduce
it.
It needs to be easy to install in small environments and able to scale
when needed/wanted. Scaling by adding homogeneous nodes would help.
I have no experience with Spark/Storm; what is the burden in terms of
installation and operational complexity?
Thomas
> but it would give you a lot of really easy management of your
> workflow. Each of the stream processors consumes a stream and does one
> thing (e.g., aggregate the last 60 seconds of raw data, or aggregate
> the last 60 minutes of either the raw data or the 60-second windows,
> etc.). The different outputs can still get written to whatever storage
> you want; stream-oriented processing just changes *how* you process
> the incoming data.
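The windowed-aggregation pipeline described in the quote above (60-second
windows over raw data, rolled up into 60-minute windows) could be sketched
in plain Python. This is only an illustration of the concept, not the Spark
Streaming API; the timestamps, window sizes, and aggregate fields are
illustrative:

```python
from collections import defaultdict

def window_aggregate(points, window_seconds):
    """Group (timestamp, value) points into tumbling windows and
    compute a simple aggregate (count/sum/min/max) per window."""
    windows = defaultdict(list)
    for ts, value in points:
        # Bucket each point by the start of its window.
        windows[ts - ts % window_seconds].append(value)
    return {
        start: {"count": len(vs), "sum": sum(vs),
                "min": min(vs), "max": max(vs)}
        for start, vs in windows.items()
    }

# Illustrative raw data points: (unix timestamp, metric value).
raw = [(0, 1.0), (30, 3.0), (65, 5.0), (3700, 7.0)]

# First stage: aggregate the raw stream into 60-second windows.
per_minute = window_aggregate(raw, 60)

# Second stage: roll the 60-second windows up into 60-minute windows
# by re-aggregating the per-window counts and sums.
per_hour = defaultdict(lambda: {"count": 0, "sum": 0.0})
for start, agg in per_minute.items():
    bucket = start - start % 3600
    per_hour[bucket]["count"] += agg["count"]
    per_hour[bucket]["sum"] += agg["sum"]
```

In a real Spark Streaming job each stage would be a windowed transformation
on a DStream, and each stage's output could be written to whatever storage
you want, as the quote says.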
> An alternative approach is to use Apache Kafka directly. There are
> pros and cons, but the benefit is that the services that do the
> computations would be microservices (no, really - just really small,
> single-threaded processes that do a single thing) that could be easily
> deployed across the cluster. If anyone is interested in this approach,
> ping me and I’ll point you to a prototype that does this (not for
> analytics).
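The shape of such a microservice - a small, single-threaded process that
consumes one topic, does one computation, and emits to another topic -
could be sketched as below. An in-memory queue stands in for a Kafka
topic here, since a real consumer would use a Kafka client library; the
names and the batch-average computation are illustrative:

```python
from queue import Queue, Empty

def run_processor(in_topic, out_topic, batch_size):
    """A single-purpose processor: read raw values from its input
    'topic' and emit the average of each batch to its output 'topic'.
    Drains whatever is currently available, then returns."""
    batch = []
    while True:
        try:
            value = in_topic.get_nowait()
        except Empty:
            break
        batch.append(value)
        if len(batch) == batch_size:
            out_topic.put(sum(batch) / batch_size)
            batch = []

# Stand-ins for Kafka topics (hypothetical names).
raw_topic, avg_topic = Queue(), Queue()
for v in [2.0, 4.0, 6.0, 8.0]:
    raw_topic.put(v)

run_processor(raw_topic, avg_topic, batch_size=2)
results = [avg_topic.get_nowait() for _ in range(avg_topic.qsize())]
# results == [3.0, 7.0]
```

Because each such process owns exactly one computation, scaling out is a
matter of deploying more instances across the cluster, which matches the
deployment story described in the quote.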
> BTW, a stream-processing approach does not limit you to live data. In
> fact, quite the opposite. Many people use stream processing for
> ingesting large volumes of live data, but lots of other people use it
> in “big data” as an alternative to batch processing (often map-reduce).
_______________________________________________
hawkular-dev mailing list
hawkular-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hawkular-dev