[infinispan-dev] Continuous Query Caching

Mircea Markus mircea.markus at jboss.com
Tue Sep 29 06:56:35 EDT 2009


On Sep 29, 2009, at 12:24 PM, Manik Surtani wrote:

>
> On 29 Sep 2009, at 10:19, Mircea Markus wrote:
>
>>
>> On Sep 29, 2009, at 12:08 PM, Manik Surtani wrote:
>>
>>>
>>> On 29 Sep 2009, at 09:57, Mircea Markus wrote:
>>>
>>>> Hi,
>>>>
>>>> Again, this is a feature from Coherence[1].
>>>>
>>>> Basic idea is to execute a query against the cache, and hold the
>>>> result object. This result object will always have up to date
>>>> query result; this means that whenever something is modified in
>>>> the cache the result itself is updated. Advantage: if one performs
>>>> the same query very often (e.g. several times every millisecond),
>>>> the response will be fast and the system will not be overloaded.
>>>
>>> Is it really faster?  Surely all you save is the construction of
>>> the various query objects, but the query itself would have to be re-
>>> run every time.  Or does it attach a listener to the cache and
>>> check whether any new additions/removals should be used to update
>>> the result set?
>> The latter: that is the way it works. It is a sort of near-cache,
>> except that instead of being invalidated it is updated whenever the
>> cache is updated. The documentation also suggests that they are
>> using listeners.
>>> I don't see how that could be much faster though.
>> I think it might be if you are running *the same query* tons of
>> times. Basically you don't do a map-reduce on all the nodes for
>> every query; instead, on every insertion you update (if necessary)
>> the cached query result. This pays off especially when the number
>> of insertions is relatively small compared to the number of times
>> the same query is run.
>
> Hmm.  It would be pretty use-case-specific.
I think there are many use cases for this (from the Coherence doc):
It is an ideal building block for Complex Event Processing (CEP)  
systems and event correlation engines.

It is ideal for situations in which an application repeats a  
particular query, and would benefit from always having instant access  
to the up-to-date result of that query.

A Continuous Query Cache is analogous to a materialized view, and is  
useful for accessing and manipulating the results of a query using the  
standard NamedCache API, and receiving an ongoing stream of events  
related to that query.
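
To make the "materialized view kept current by listeners" idea concrete,
here is a minimal sketch in plain Java. It is only an illustration under
stated assumptions: ContinuousQueryView, onPut, onRemove and result are
made-up names, not Infinispan or Coherence API, the query is reduced to a
simple predicate, and something else (a cache listener) is assumed to call
onPut/onRemove on every change.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiPredicate;

// Hypothetical illustration only: none of these names are Infinispan or Coherence API.
class ContinuousQueryView<K, V> {
    private final BiPredicate<K, V> query;   // the "query", reduced to a predicate
    private final Map<K, V> result = new ConcurrentHashMap<>();

    ContinuousQueryView(BiPredicate<K, V> query, Map<K, V> initialContents) {
        this.query = query;
        // Run the query once up front to seed the result.
        initialContents.forEach(this::onPut);
    }

    // Assumed to be called by a cache listener after every put.
    void onPut(K key, V value) {
        if (query.test(key, value)) {
            result.put(key, value);   // new or updated match
        } else {
            result.remove(key);       // entry no longer matches (or never did)
        }
    }

    // Assumed to be called by a cache listener after every removal.
    void onRemove(K key) {
        result.remove(key);
    }

    // Reading the result is a local map access; the query is not re-run.
    Map<K, V> result() {
        return Collections.unmodifiableMap(result);
    }
}
```

The cost of the query moves from read time to write time: each put pays
for one predicate evaluation, and each read of the result is local.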

> It's hard to see how this
> _generally_ performs better, since you need to make sure you are aware
> of all changes happening all over the cluster to keep this result set
> up to date (REPL-style scalability bottleneck!)
Yes, the performance questions would definitely arise in the case of  
DIST. Even there, one way of handling it is to migrate the queries to  
each node: on put, the node would determine whether it should  
replicate to the near cache of a certain node (the query owner). This  
would reduce the replication overhead to a minimum.
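
A rough sketch of that per-node decision, again in plain Java and purely
hypothetical (RegisteredQuery, QueryRouter and ownersToNotify are invented
names, and node addresses are simplified to strings). It assumes each node
holds a copy of every registered query filter and, on put, works out which
query owners need to hear about the change:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; these classes are not Infinispan API.
class RegisteredQuery<V> {
    final String ownerNode;       // node holding the cached query result
    final Predicate<V> filter;    // the query, reduced to a value filter

    RegisteredQuery(String ownerNode, Predicate<V> filter) {
        this.ownerNode = ownerNode;
        this.filter = filter;
    }
}

class QueryRouter<V> {
    private final List<RegisteredQuery<V>> queries = new ArrayList<>();

    void register(String ownerNode, Predicate<V> filter) {
        queries.add(new RegisteredQuery<>(ownerNode, filter));
    }

    // Owners that must be told about a put; oldValue is null for a new key.
    List<String> ownersToNotify(V oldValue, V newValue) {
        List<String> owners = new ArrayList<>();
        for (RegisteredQuery<V> q : queries) {
            boolean matchedBefore = oldValue != null && q.filter.test(oldValue);
            boolean matchesNow = q.filter.test(newValue);
            // Forward if the entry enters, stays in, or leaves the result.
            if (matchedBefore || matchesNow) {
                owners.add(q.ownerNode);
            }
        }
        return owners;
    }
}
```

Puts that match no registered query produce an empty list, so nothing
extra goes over the wire for them.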
> Cheers
> --
> Manik Surtani
> manik at jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
> http://www.infinispan.org
> http://www.jbosscache.org
>
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
