On Sep 29, 2009, at 12:24 PM, Manik Surtani wrote:


On 29 Sep 2009, at 10:19, Mircea Markus wrote:


On Sep 29, 2009, at 12:08 PM, Manik Surtani wrote:


On 29 Sep 2009, at 09:57, Mircea Markus wrote:

Hi,

Again, this is a feature from Coherence[1].

The basic idea is to execute a query against the cache and hold the  
result object. This result object will always hold an up-to-date  
query result; this means that whenever something is modified in  
the cache the result itself is updated. Advantage: if one performs  
the same query very often (e.g. several times every millisecond)  
the response will be fast and the system will not be overloaded.

Is it really faster?  Surely all you save is the construction of  
the various query objects, but the query itself would have to be re-
run every time.  Or does it attach a listener to the cache and  
check whether any new additions/removals should be used to update  
the result set?
That is the way it works. It is a sort of near-cache, except that  
instead of being invalidated it is updated whenever the cache is  
updated. The documentation also suggests that they are using  
listeners.
I don't see how that could be much faster though.
I think it might be if you are running *the same query* tons of  
times. Basically you don't do a map-reduce on all the nodes, but  
rather on every insertion (especially if the number of insertions is  
relatively small compared to the number of times the same query is  
run) you update (if necessary) the cached query result.
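A minimal sketch of that listener idea in plain Java (this is illustrative only, not the Infinispan or Coherence API; the class and method names are my own): each write is tested against the query predicate and the cached result set is updated incrementally, so reads never re-run the query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiPredicate;

/**
 * Illustrative continuous-query result holder: instead of re-running the
 * query on every read, each cache write is tested against the predicate
 * and the maintained result set is updated in place.
 */
public class ContinuousQuery<K, V> {
    private final BiPredicate<K, V> predicate;                   // the "query"
    private final Map<K, V> result = new ConcurrentHashMap<>();  // live result set

    public ContinuousQuery(BiPredicate<K, V> predicate) {
        this.predicate = predicate;
    }

    // In a real implementation this would be driven by a cache listener.
    public void onPut(K key, V value) {
        if (predicate.test(key, value)) {
            result.put(key, value);   // entry now matches: add/refresh it
        } else {
            result.remove(key);       // entry no longer matches: drop it
        }
    }

    public void onRemove(K key) {
        result.remove(key);
    }

    // Reads are lookups into the maintained result, not a full query run.
    public Map<K, V> result() {
        return result;
    }
}
```

The cost of the query is paid once per write (one predicate test) rather than once per read, which is exactly the trade Mircea describes: it wins when the same query is read far more often than the cache is written.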

Hmm.  It would be pretty use-case-specific.
I think there are many use cases for this (from the Coherence docs):
  • It is an ideal building block for Complex Event Processing (CEP) systems and event correlation engines.

  • It is ideal for situations in which an application repeats a particular query, and would benefit from always having instant access to the up-to-date result of that query.

  • A Continuous Query Cache is analogous to a materialized view, and is useful for accessing and manipulating the results of a query using the standard NamedCache API, and receiving an ongoing stream of events related to that query.

It's hard to see how this _generally_ performs better, since you need  
to make sure you are aware of all changes happening all over the  
cluster to keep this result set up to date (REPL-style scalability  
bottleneck!)

Yes, the performance question would definitely arise in the case of DIST. Even for that, one way of handling it is to migrate the queries to each node: on put, the node would determine whether it should replicate to a certain node's (the query owner's) near cache. This would reduce the replication overhead to a minimum.
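That routing idea could be sketched as follows (again illustrative, not Infinispan API; `RegisteredQuery` and the node names are assumptions): on put, the writing node tests the entry against each registered query and forwards it only to the owners of the queries it matches, rather than replicating every write cluster-wide.

```java
import java.util.List;
import java.util.function.BiPredicate;

// Illustrative sketch of query-aware replication: an entry is sent only
// to the near caches of nodes whose registered queries it matches.
public class QueryAwareRouter<K, V> {
    // A query registered by some node (the "query owner") in the cluster.
    public record RegisteredQuery<K, V>(String ownerNode, BiPredicate<K, V> predicate) {}

    private final List<RegisteredQuery<K, V>> queries;

    public QueryAwareRouter(List<RegisteredQuery<K, V>> queries) {
        this.queries = queries;
    }

    // Returns the nodes whose near caches should receive this write.
    public List<String> targetsFor(K key, V value) {
        return queries.stream()
                .filter(q -> q.predicate().test(key, value))
                .map(RegisteredQuery::ownerNode)
                .distinct()
                .toList();
    }
}
```

A write that matches no registered query is not forwarded anywhere, which is where the replication savings would come from; the open cost is keeping the set of registered queries consistent on every node.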
Cheers
--
Manik Surtani
manik@jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org




_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev