Can you try adding a ClusterCacheLoader to see if that helps?
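
For example, something along these lines in the lucene-index cache's loaders
section (untested sketch from memory -- the element and class names may differ
between Infinispan versions, so check against the schema you're using):

   <loaders passivation="false" shared="false" preload="false">
      <!-- ask the other cluster members for a key before falling back to disk -->
      <loader class="org.infinispan.loaders.cluster.ClusterCacheLoader">
         <properties>
            <property name="remoteCallTimeout" value="20000"/>
         </properties>
      </loader>
   </loaders>

If you keep the LuceneCacheLoader as well, I *think* the order in which the
loaders are declared matters, but I'd double-check that.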
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Apologies - forgot to copy list.<br>
<br>
On 15 March 2013 15:48, James Aley <<a href="mailto:james.aley@swiftkey.net">james.aley@swiftkey.net</a>> wrote:<br>
> Hey Adrian,
>
> Thanks for the response. I was chatting to Sanne on IRC yesterday, and
> he suggested this to me. The logging I attached was actually from a
> cluster of 4 servers with numOwners=2. Sorry, I should have mentioned
> that, but since it didn't appear to make any difference I thought I'd
> keep things simple in my previous email.
>
> While it didn't seem to make a difference in this case, I can see why
> that would make sense. In future tests I'll stick with numOwners > 1.
>
>
> James.
<div class="HOEnZb"><div class="h5">><br>
> On 15 March 2013 15:44, Adrian Nistor <<a href="mailto:anistor@redhat.com">anistor@redhat.com</a>> wrote:<br>
>> Hi James,
>>
>> I'm not an expert on InfinispanDirectory, but I've noticed in [1] that the
>> lucene-index cache is distributed with numOwners = 1. That means each cache
>> entry is owned by just one cluster node, and there's nowhere else to go in
>> the cluster if the key is not available in local memory, so it has to be
>> fetched from the cache store. This can be solved with numOwners > 1.
>> Please let me know if this solves your problem.
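>>
>> For the lucene-index cache that would mean something roughly like this
>> (sketch only -- adjust the syntax to the configuration schema version
>> you're using):
>>
>>    <clustering mode="distribution">
>>       <!-- two copies of every entry, so a miss can be served by a remote owner -->
>>       <hash numOwners="2"/>
>>       <sync/>
>>    </clustering>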
>>
>> Cheers!
>>
>>
>> On 03/15/2013 05:03 PM, James Aley wrote:
>>>
>>> Hey all,
>>>
>>> <OT>
>>> Seeing as this is my first post, I wanted to just quickly thank you
>>> all for Infinispan. So far I'm really enjoying working with it - great
>>> product!
>>> </OT>
>>>
>>> I'm using the InfinispanDirectory for a Lucene project at the moment.
>>> We use Lucene directly to build a search product, which has high read
>>> requirements and likely very large indexes. I'm hoping to make use of
>>> a distribution mode cache to keep the whole index in memory across a
>>> cluster of machines (the index will be too big for one server).
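>>>
>>> For context, the way I'm wiring it up is roughly the following
>>> (simplified sketch; the cache name and the InfinispanDirectory
>>> constructor are from memory, so treat it as illustrative only):
>>>
>>>    import org.apache.lucene.store.Directory;
>>>    import org.infinispan.Cache;
>>>    import org.infinispan.lucene.InfinispanDirectory;
>>>    import org.infinispan.manager.DefaultCacheManager;
>>>    import org.infinispan.manager.EmbeddedCacheManager;
>>>
>>>    public class IndexOnGrid {
>>>       public static Directory open() throws java.io.IOException {
>>>          // cache manager started from the XML config in [1]
>>>          EmbeddedCacheManager manager = new DefaultCacheManager("infinispan.xml");
>>>          Cache<?, ?> cache = manager.getCache("lucene-index");
>>>          // Lucene Directory backed by the distributed cache
>>>          return new InfinispanDirectory(cache, "my-index");
>>>       }
>>>    }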
>>>
>>> The problem I'm having is that after loading a filesystem-based Lucene
>>> directory into InfinispanDirectory via LuceneCacheLoader, no nodes are
>>> retrieving data from the cluster - they instead look up keys in their
>>> local CacheLoaders, which involves lots of disk I/O and is very slow.
>>> I was hoping to just use the CacheLoader to initialize the caches, but
>>> from there on read only from RAM (and network, of course). Is this
>>> supported? Maybe I've misunderstood the purpose of the CacheLoader?
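>>>
>>> To illustrate what I mean, the loader part of the config is conceptually
>>> along these lines (paraphrased, so the class name and properties here are
>>> from memory rather than copied from the real paste in [1]):
>>>
>>>    <loaders preload="true" passivation="false" shared="false">
>>>       <!-- reads an existing on-disk Lucene index into the grid at startup -->
>>>       <loader class="org.infinispan.lucene.cachestore.LuceneCacheLoader">
>>>          <properties>
>>>             <property name="location" value="/path/to/lucene/index"/>
>>>          </properties>
>>>       </loader>
>>>    </loaders>
>>>
>>> and the hope was that after preload, reads would be served from memory
>>> (locally or from another node) rather than from this loader.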
>>>
>>> To explain my observations in a little more detail:
>>> * I start a cluster of two servers, using [1] as the cache config.
>>> Both have a local copy of the Lucene index that will be loaded into
>>> the InfinispanDirectory via the loader. This is a test configuration,
>>> where I've set numOwners=1 so that I only need two servers for
>>> distribution to happen.
>>> * Upon startup, things look good. I see the memory usage of the JVM
>>> reflect a pretty near 50/50 split of the data across both servers.
>>> Logging indicates both servers are in the cluster view, all seems
>>> fine.
>>> * When I send a search query to either one of the nodes, I notice the
>>> following:
>>> - iotop shows huge (~100MB/s) disk I/O on that node alone from the
>>> JVM process.
>>> - no change in network activity between nodes (~300b/s, same as when
>>> idle)
>>> - memory usage on the node running the query increases dramatically,
>>> and stays higher even after the query is finished.
>>>
>>> So it seemed to me like each node was favouring use of the CacheLoader
>>> to retrieve keys that are not in memory, instead of using the cluster.
>>> Does that seem reasonable? Is this the expected behaviour?
>>>
>>> I started to investigate this by turning on trace logging, and this
>>> made me think perhaps the cause was that the CacheLoader's interceptor
>>> is higher priority in the chain than the distribution interceptor?
>>> I'm not at all familiar with the design at any level of detail - just
>>> what I picked up in the last 24 hours from browsing the code, so I
>>> could easily be way off. I've attached the log snippets I thought
>>> relevant in [2].
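>>>
>>> (For what it's worth, one quick way to check the actual ordering would
>>> be to dump the interceptor chain -- sketch from memory, so the method
>>> name may be slightly off:
>>>
>>>    // list the interceptors in invocation order
>>>    for (CommandInterceptor interceptor :
>>>          cache.getAdvancedCache().getInterceptorChain()) {
>>>       System.out.println(interceptor.getClass().getSimpleName());
>>>    }
>>>
>>> to see whether the CacheLoader interceptor really does sit before the
>>> distribution interceptor.)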
>>>
>>> Any advice offered much appreciated.
>>> Thanks!
>>>
>>> James.
>>>
>>>
>>> [1] https://www.refheap.com/paste/12531
>>> [2] https://www.refheap.com/paste/12543
_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev