[infinispan-dev] on preload and state integration

Galder Zamarreño galder at redhat.com
Thu Aug 30 03:51:57 EDT 2012


On Aug 29, 2012, at 7:27 PM, Mircea Markus wrote:

> Hi,
> 
> This is not high prio, but it is something I've come across several times now.
> Both the preload and the state transfer code, in order to add a batch of entries into the cache, use the following pattern:
> 
> for (InternalCacheEntry e : state) {
>   cache.getAdvancedCache()
>         .withFlags(CACHE_MODE_LOCAL, SKIP_OWNERSHIP_CHECK, SKIP_CACHE_STORE, SKIP_REMOTE_LOOKUP, SKIP_INDEXING)
>         .put(...);
> }
> 
> This has some flaws:
> - it is not the fastest way of inserting stuff into the cache. E.g. in the case of preload it would be much simpler to drain the data straight into the container. Similarly for a cache store.
> - it pollutes the rest of the code unnecessarily. E.g. at the moment when we preload data, the topology information might not be available simply because the cache has not started yet. So all the interceptors that handle the put during preload need to somehow guard against using the topology information: e.g. StateTransferInterceptor, EntryWrappingInterceptor, LockingInterceptor etc.

^ Just to add some info: the problem with only draining is that the put calls would not go through the interceptor chain and so would skip indexing (when it's required), which would break this functionality:
https://issues.jboss.org/browse/ISPN-2087

> Now this approach is in use mainly because, during state insertion, we need to (re)use some logic which is present in the interceptors. 
> A different approach to handling state integration would be to move that reusable logic (where needed) into corresponding managers and invoke methods on the manager directly instead of passing everything through the interceptor stack (users interested in these insertions can register listeners). I'm not sure that this would work with the extending modules?
> 
> Wdyt?

To be honest, to maximise performance, we should have some way of streaming, or passing a group of entries to a consumer, as already suggested in:
https://issues.jboss.org/browse/ISPN-2087?focusedCommentId=12713751&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12713751

So, a good way to separate this, taking into account the indexing use case, would be:

- Query code that deals with indexing normal put(k,v)
- Query code that deals with a collection/stream of put(k,v)

The former is what people call through the normal interceptor chain, and the latter is the batching coming from state transfer/preload.
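
To make the split concrete, here is a minimal sketch in plain Java. It is illustrative only: Indexer, Store, CountingIndexer and their methods are hypothetical names invented for this example and are not Infinispan APIs. A ConcurrentHashMap stands in for the data container. The point is just that the normal put path notifies the indexer once per entry, while the batch path drains the state into the container and hands the whole group to the indexer in a single call.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these types exist in Infinispan.
class BatchIntegrationSketch {

    /** Hypothetical indexing hook with one entry point per write path. */
    interface Indexer<K, V> {
        void index(K key, V value);      // normal put(k, v)
        void indexAll(Map<K, V> batch);  // group of entries from state transfer/preload
    }

    /** Stand-in for the data container plus the two write paths. */
    static final class Store<K, V> {
        private final Map<K, V> container = new ConcurrentHashMap<>();
        private final Indexer<K, V> indexer;

        Store(Indexer<K, V> indexer) {
            this.indexer = indexer;
        }

        /** Normal path: one entry at a time, indexer called per entry. */
        void put(K key, V value) {
            container.put(key, value);
            indexer.index(key, value);
        }

        /** Batch path: drain into the container, then notify the indexer once. */
        void integrate(Map<K, V> state) {
            container.putAll(state);
            indexer.indexAll(state);
        }

        int size() {
            return container.size();
        }
    }

    /** Toy indexer that only counts how it was called. */
    static final class CountingIndexer implements Indexer<String, String> {
        int singleCalls;
        int batchCalls;
        int indexedEntries;

        @Override
        public void index(String key, String value) {
            singleCalls++;
            indexedEntries++;
        }

        @Override
        public void indexAll(Map<String, String> batch) {
            batchCalls++;
            indexedEntries += batch.size();
        }
    }

    public static void main(String[] args) {
        CountingIndexer indexer = new CountingIndexer();
        Store<String, String> store = new Store<>(indexer);

        store.put("k0", "v0");                 // normal put(k, v)

        Map<String, String> state = new LinkedHashMap<>();
        state.put("k1", "v1");
        state.put("k2", "v2");
        store.integrate(state);                // preload/state transfer batch

        System.out.println("single calls: " + indexer.singleCalls
                + ", batch calls: " + indexer.batchCalls
                + ", entries indexed: " + indexer.indexedEntries);
    }
}
```

With something along these lines, the batch path never has to pretend to be a regular put, so the interceptors would not need to guard against missing topology information, and indexing still sees every entry.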

> 
> Cheers,
> Mircea   

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



