[infinispan-dev] event processing integration

yavuz gokirmak ygokirmak at gmail.com
Tue Mar 18 04:23:00 EDT 2014


Hi All,
It would be interesting to give CEP capabilities to Infinispan caches.
I have some comments:

On 17 March 2014 13:00, Jonathan Halliday <jonathan.halliday at redhat.com> wrote:

>
> Alongside recent talk of integrating infinispan with hadoop batch
> processing, there has been some discussion of using the data grid
> alongside an event stream processing system.
>
> There are several directions we could consider here. In approximate
> order of increasing complexity these are:
>
> - Allow bi-directional flow of events, such that listeners on the cache
> can be used to cause events in the processing engine, or events in the
> processing engine can update the cache.
>

To catch events from a cache, I propose developing a simple infinispanSource
for Flume ( http://flume.apache.org ). Using this infinispanSource, one can
listen to any cache for updates or inserts and redirect these events to
either a CEP engine or another destination.

Updating a cache will be similar: we may have an infinispanSink for Flume,
and any application that needs to update a cache by sending events can use
the infinispanSink.

Actually, by developing such Flume components we will have a change data
capture (CDC) tool ( http://en.wikipedia.org/wiki/Change_data_capture ) for
Infinispan. CDC tools are vital for complex event processing integrations,
and I think this will be a good starting point.
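
A rough sketch of the source/sink idea, in plain Python rather than the
real Flume or Infinispan APIs. Every name here (ObservableCache,
ChangeEvent, cache_sink) is hypothetical and only illustrates the flow:
a listener captures each insert or update as an event, and a sink applies
incoming events to another cache.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change: which cache, what kind, which entry."""
    cache: str
    kind: str   # "insert" or "update"
    key: Any
    value: Any


class ObservableCache:
    """Toy stand-in for a cache that notifies listeners on every put."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: Dict[Any, Any] = {}
        self._listeners: List[Callable[[ChangeEvent], None]] = []

    def add_listener(self, listener: Callable[[ChangeEvent], None]) -> None:
        self._listeners.append(listener)

    def put(self, key: Any, value: Any) -> None:
        kind = "update" if key in self._data else "insert"
        self._data[key] = value
        event = ChangeEvent(self.name, kind, key, value)
        for listener in self._listeners:
            listener(event)

    def get(self, key: Any) -> Any:
        return self._data.get(key)


def cache_sink(target: ObservableCache) -> Callable[[ChangeEvent], None]:
    """Sink side: apply each incoming event to a target cache."""
    def apply(event: ChangeEvent) -> None:
        target.put(event.key, event.value)
    return apply


# Source side: capture changes on "orders" and hand them to two consumers.
orders = ObservableCache("orders")
replica = ObservableCache("orders-replica")
captured: List[ChangeEvent] = []

orders.add_listener(captured.append)      # e.g. feed a CEP engine
orders.add_listener(cache_sink(replica))  # e.g. update another cache

orders.put("o1", 100)
orders.put("o1", 120)
```

After the two puts, captured holds one insert and one update event, and
the replica cache has the latest value for "o1".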


>
> - Allow the cache to be used to hold lookup data for reference from user
> code running the processing engine, to speed up joining streamed events
> to what would otherwise be data tables on disk.
>

Actually, it is important to cache some RDBMS tables in memory in such
systems and to sync this cache periodically from the RDBMS tables to keep
it up to date. I think this requirement can be achieved via Infinispan's
cache loaders.
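
To make the lookup case concrete, here is a minimal sketch that uses an
in-memory SQLite table in place of the RDBMS and a plain dict in place of
the cache. It is not the Infinispan cache loader API; LookupCache, the
customers table and the enrich function are assumptions for illustration.
The sketch reloads on read once the copy is older than max_age_seconds,
which is a simplification of a real periodic sync.

```python
import sqlite3
import time


class LookupCache:
    """In-memory copy of a reference table, reloaded when it gets too old."""

    def __init__(self, conn, max_age_seconds, clock=time.monotonic):
        self._conn = conn
        self._max_age = max_age_seconds
        self._clock = clock
        self._data = {}
        self._loaded_at = None

    def _reload(self):
        # Each row is an (id, name) pair, so dict() builds the lookup map.
        rows = self._conn.execute("SELECT id, name FROM customers")
        self._data = dict(rows)
        self._loaded_at = self._clock()

    def get(self, key):
        stale = (self._loaded_at is None
                 or self._clock() - self._loaded_at >= self._max_age)
        if stale:
            self._reload()
        return self._data.get(key)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'acme')")

now = [0.0]  # fake clock so the example is deterministic
cache = LookupCache(conn, max_age_seconds=60, clock=lambda: now[0])


def enrich(event):
    """Join a streamed event to the cached reference data."""
    return {**event, "customer_name": cache.get(event["customer_id"])}


first = enrich({"customer_id": 1, "amount": 10})
conn.execute("UPDATE customers SET name = 'acme corp' WHERE id = 1")
second = enrich({"customer_id": 1, "amount": 20})  # still served from memory
now[0] = 61.0
third = enrich({"customer_id": 1, "amount": 30})   # old copy expired, reloads
```

The second event still sees the old name because the join never touches
the table between refreshes; that staleness window is the price of the
speed-up.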



> - Integrate with the processing engine itself, such that infinispan can
> be used to store items that would otherwise occupy precious RAM.  This
> one is probably only viable with the cooperation of the stream
> processing system, so I'll base further discussion on Drools Fusion.
>
> The engine uses memory for a) rules, i.e. processing logic. Some of this
> is infrequently accessed. Think of a decision tree in which some
> branches are traversed more than others. So, opportunities to swap bits
> out to cache perhaps.  b) state, particularly sliding windows. Again
> some data is infrequently accessed. For many sliding window calculations
> in particular (e.g. running average), only the head and tail of the
> window are actually used. The events in-between can be swapped out.
>

Holding state is the most important case. For this requirement an off-heap
cache will be a must. (As you may know, Ben Cotton is implementing Peter
Lawrey's hugemaps in Infinispan for off-heap caching.)
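
The point above about running averages can be sketched as follows. This
is an illustration only, not how Drools Fusion works: SlidingAverage is a
hypothetical class, and the store argument is a plain dict standing in for
an off-heap or grid-backed map. Each step writes the newest event and
reads only the oldest one, so everything in between could live outside
the heap.

```python
class SlidingAverage:
    """Running average over the last `size` events.

    Only the newest and the oldest event are touched per step; the
    events in between are merely stored.
    """

    def __init__(self, size, store=None):
        self._size = size
        # Stand-in for off-heap or data grid storage (assumption).
        self._store = {} if store is None else store
        self._head = 0    # sequence number of the oldest event in the window
        self._tail = 0    # sequence number the next event will get
        self._sum = 0.0

    def add(self, value):
        self._store[self._tail] = value
        self._tail += 1
        self._sum += value
        if self._tail - self._head > self._size:
            # Window is over capacity: evict the oldest event.
            self._sum -= self._store.pop(self._head)
            self._head += 1
        return self._sum / (self._tail - self._head)


window = SlidingAverage(size=3)
averages = [window.add(v) for v in [1, 2, 3, 4, 5]]
```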


>
> Of course these integrations require the stream processing engine to be
> written to support such operations - careful handling of object
> references is needed. Currently the engine doesn't work that way -
> everything is focussed on speed at the expense of memory.



>
> - Borrow some ideas from the event processing DSLs, such that the data
> grid query engine can independently support continuous (standing)
> queries rather than just one-off queries. Arguably this is reinventing
> the wheel, but for simple use cases it may be preferable to run the
> stream processing logic directly in the grid rather than deploying a
> dedicated event stream processing system. I think it's probably going to
> require supporting lists as a first class construct alongside maps
> though.   There are various kludges possible here, including the brute
> force approach of faking continuous query by re-executing a one-off
> query on each mutation, but they tend to be inefficient. There is also
> the thorny problem of supporting a (potentially distributed) clock,
> since a lot of use cases need to reference the passage of time in the
> query e.g. 'send event to listener if avg in last N minutes > x'.
>
>
>
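
For the 'avg in last N minutes > x' example, a single-node sketch might
look like the following. It updates the aggregate incrementally on each
mutation instead of re-executing a one-off query, and it takes the clock
as a parameter, which sidesteps rather than solves the distributed clock
problem. AverageOverThreshold and on_mutation are hypothetical names, not
part of any Infinispan query API. Note that old events are only expired
when a new mutation arrives; a real implementation would also need a
timer.

```python
from collections import deque


class AverageOverThreshold:
    """Standing query: notify the listener when the windowed average exceeds a threshold."""

    def __init__(self, window_seconds, threshold, listener, clock):
        self._window = window_seconds
        self._threshold = threshold
        self._listener = listener
        self._clock = clock
        self._events = deque()  # (timestamp, value), oldest first
        self._sum = 0.0

    def on_mutation(self, value):
        now = self._clock()
        self._events.append((now, value))
        self._sum += value
        # Drop events that have fallen out of the time window.
        while now - self._events[0][0] >= self._window:
            _, old = self._events.popleft()
            self._sum -= old
        average = self._sum / len(self._events)
        if average > self._threshold:
            self._listener(average)


now = [0.0]   # fake clock, in seconds
alerts = []
query = AverageOverThreshold(window_seconds=300, threshold=10,
                             listener=alerts.append, clock=lambda: now[0])

query.on_mutation(8)    # average 8, below the threshold
now[0] = 60.0
query.on_mutation(14)   # average 11, listener is notified
now[0] = 400.0
query.on_mutation(9)    # both earlier events expired, average 9
```
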
> regards
>
> Jonathan Halliday
> Core developer, JBoss.
>
> --
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham (USA), Paul Hickey (Ireland), Matt Parson
> (USA), Charlie Peters (USA)
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>

Yavuz Gökırmak

   - tr.linkedin.com/pub/yavuz-gokirmak/20/a11/23b/