[infinispan-dev] CacheLoaders, Distribution mode and Interceptors

Ray Tsang saturnism at gmail.com
Fri Mar 15 11:54:37 EDT 2013


Can you try adding a ClusterCacheLoader to see if that helps?

Thanks,

On Fri, Mar 15, 2013 at 8:49 AM, James Aley <james.aley at swiftkey.net> wrote:

> Apologies - forgot to copy list.
>
> On 15 March 2013 15:48, James Aley <james.aley at swiftkey.net> wrote:
> > Hey Adrian,
> >
> > Thanks for the response. I was chatting to Sanne on IRC yesterday, and
> > he suggested this to me. Actually the logging I attached was from a
> > cluster of 4 servers with numOwners=2. Sorry, I should have mentioned
> > that, but since it didn't appear to make any difference I thought I'd
> > just keep things simple in my previous email.
> >
> > While it seemed not to make a difference in this case... I can see why
> > that would make sense. In future tests I guess I should probably stick
> > with numOwners > 1.
> >
> >
> > James.
> >
> > On 15 March 2013 15:44, Adrian Nistor <anistor at redhat.com> wrote:
> >> Hi James,
> >>
> >> I'm not an expert on InfinispanDirectory but I've noticed in [1] that the
> >> lucene-index cache is distributed with numOwners = 1. That means each cache
> >> entry is owned by just one cluster node and there's nowhere else to go in
> >> the cluster if the key is not available in local memory, thus it needs
> >> fetching from the cache store. This can be solved with numOwners > 1.
> >> Please let me know if this solves your problem.
> >>
> >> Cheers!
> >>
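To make the numOwners point concrete, here is a minimal sketch of how the
setting bounds the number of nodes holding a key. It is a toy model only:
the class OwnershipSketch and the method ownersOf are hypothetical names, and
the "hash modulo node count" placement is an assumption for illustration, not
Infinispan's actual consistent hash.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT Infinispan's consistent hash. It just illustrates how
// numOwners limits the number of nodes that hold a copy of a key.
class OwnershipSketch {

    // Picks up to numOwners consecutive nodes, starting at a position
    // derived from the key's hash code (assumed placement rule).
    static List<String> ownersOf(String key, List<String> nodes, int numOwners) {
        int start = Math.floorMod(key.hashCode(), nodes.size());
        int count = Math.min(numOwners, nodes.size());
        List<String> owners = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            owners.add(nodes.get((start + i) % nodes.size()));
        }
        return owners;
    }

    public static void main(String[] args) {
        List<String> twoNodes = Arrays.asList("A", "B");
        List<String> fourNodes = Arrays.asList("A", "B", "C", "D");
        for (String key : Arrays.asList("segment-1", "segment-2", "segment-3")) {
            System.out.println(key
                + " | 2 nodes, numOwners=1: " + ownersOf(key, twoNodes, 1)
                + " | 2 nodes, numOwners=2: " + ownersOf(key, twoNodes, 2)
                + " | 4 nodes, numOwners=2: " + ownersOf(key, fourNodes, 2));
        }
    }
}
```

In this model, two nodes with numOwners=1 give every key exactly one owner,
two nodes with numOwners=2 put every key on both nodes (so nothing is really
distributed), and four nodes with numOwners=2 keep two copies of each key.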
> >>
> >> On 03/15/2013 05:03 PM, James Aley wrote:
> >>>
> >>> Hey all,
> >>>
> >>> <OT>
> >>> Seeing as this is my first post, I wanted to just quickly thank you
> >>> all for Infinispan. So far I'm really enjoying working with it - great
> >>> product!
> >>> </OT>
> >>>
> >>> I'm using the InfinispanDirectory for a Lucene project at the moment.
> >>> We use Lucene directly to build a search product, which has high read
> >>> requirements and likely very large indexes. I'm hoping to make use of
> >>> a distribution mode cache to keep the whole index in memory across a
> >>> cluster of machines (the index will be too big for one server).
> >>>
> >>> The problem I'm having is that after loading a filesystem-based Lucene
> >>> directory into InfinispanDirectory via LuceneCacheLoader, no nodes are
> >>> retrieving data from the cluster - they instead look up keys in their
> >>> local CacheLoaders, which involves lots of disk I/O and is very slow.
> >>> I was hoping to just use the CacheLoader to initialize the caches, but
> >>> from there on read only from RAM (and network, of course). Is this
> >>> supported? Maybe I've misunderstood the purpose of the CacheLoader?
> >>>
> >>> To explain my observations in a little more detail:
> >>> * I start a cluster of two servers, using [1] as the cache config.
> >>> Both have a local copy of the Lucene index that will be loaded into
> >>> the InfinispanDirectory via the loader. This is a test configuration,
> >>> where I've set numOwners=1 so that I only need two servers for
> >>> distribution to happen.
> >>> * Upon startup, things look good. I see the memory usage of the JVM
> >>> reflect a pretty near 50/50 split of the data across both servers.
> >>> Logging indicates both servers are in the cluster view, all seems
> >>> fine.
> >>> * When I send a search query to either one of the nodes, I notice the
> >>> following:
> >>>    - iotop shows huge (~100MB/s) disk I/O on that node alone from the
> >>> JVM process.
> >>>    - no change in network activity between nodes (~300b/s, same as when
> >>> idle)
> >>>    - memory usage on the node running the query increases dramatically,
> >>> and stays higher even after the query is finished.
> >>>
> >>> So it seemed to me like each node was favouring use of the CacheLoader
> >>> to retrieve keys that are not in memory, instead of using the cluster.
> >>> Does that seem reasonable? Is this the expected behaviour?
> >>>
> >>> I started to investigate this by turning on trace logging, and this
> >>> made me think perhaps the cause was that the CacheLoader's interceptor
> >>> is higher priority in the chain than the distribution interceptor?
> >>> I'm not at all familiar with the design in any level of detail - just
> >>> what I picked up in the last 24 hours from browsing the code, so I
> >>> could easily be way off. I've attached the log snippets I thought
> >>> relevant in [2].
> >>>
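The hypothesis above, that the position of the loader in the chain decides
whether a local miss goes to disk or to the cluster, can be sketched with a
toy lookup chain. Everything here is hypothetical (ReadPathSketch, Source,
resolvedBy, chain and the source names); it models the idea only and says
nothing about how Infinispan actually orders its interceptors.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, not Infinispan's interceptor classes.
class ReadPathSketch {

    // A place a key might be found: local memory, the local loader, or a remote owner.
    static final class Source {
        final String name;
        final Map<String, String> entries;

        Source(String name, Map<String, String> entries) {
            this.name = name;
            this.entries = entries;
        }
    }

    // Walks the chain in order and returns the name of the first source holding the key.
    static String resolvedBy(String key, List<Source> chain) {
        for (Source source : chain) {
            if (source.entries.containsKey(key)) {
                return source.name;
            }
        }
        return "miss";
    }

    // Builds a map containing the given keys with placeholder values.
    static Map<String, String> mapOf(String... keys) {
        Map<String, String> map = new HashMap<>();
        for (String key : keys) {
            map.put(key, "data for " + key);
        }
        return map;
    }

    // Local memory holds chunk-1 only; the local loader has the full index on
    // disk; a remote owner holds chunk-2 in memory.
    static List<Source> chain(boolean loaderFirst) {
        Source memory = new Source("local-memory", mapOf("chunk-1"));
        Source loader = new Source("local-loader", mapOf("chunk-1", "chunk-2"));
        Source cluster = new Source("remote-owner", mapOf("chunk-2"));
        return loaderFirst
            ? Arrays.asList(memory, loader, cluster)
            : Arrays.asList(memory, cluster, loader);
    }

    public static void main(String[] args) {
        System.out.println(resolvedBy("chunk-2", chain(true)));  // prints "local-loader"
        System.out.println(resolvedBy("chunk-2", chain(false))); // prints "remote-owner"
    }
}
```

With the loader ahead of the cluster lookup, a key missing from local memory
is always served from disk whenever the local loader has it, which would
match the observed pattern of heavy disk I/O and idle network.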
> >>> Any advice offered much appreciated.
> >>> Thanks!
> >>>
> >>> James.
> >>>
> >>>
> >>> [1] https://www.refheap.com/paste/12531
> >>> [2] https://www.refheap.com/paste/12543
> >>> _______________________________________________
> >>> infinispan-dev mailing list
> >>> infinispan-dev at lists.jboss.org
> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> >>
> >>