Hi James,
By specifying the LuceneCacheLoader as a loader for the default cache, it will be added to
both the "lucene-index" cache (where it is needed) and the other two caches
(lucene-metadata and lucene-locks), where I don't think it is needed. I think it
should only be configured for the "lucene-index" cache and removed from the
default config.
On top of that you might want to add the ClusterCacheLoader *before* the
LuceneCacheLoader, otherwise it will always be the LuceneCacheLoader that gets queried
first. The config I have in mind is [1] - would you mind giving it a try?
[1]
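(Sanne's actual config at [1] is not included in this archive. Purely as an illustration of the loader ordering he describes, using the Infinispan 5.x-era XML schema - element and class names are from memory and should be checked against your version - the "lucene-index" cache might look something like this:)

```xml
<namedCache name="lucene-index">
   <loaders passivation="false" shared="false">
      <!-- First in the chain: fetch keys missing from local memory
           from the other cluster nodes over the network. -->
      <loader class="org.infinispan.loaders.cluster.ClusterCacheLoader"/>
      <!-- Consulted only if the cluster cannot supply the key.
           The "location" value below is a placeholder path. -->
      <loader class="org.infinispan.lucene.cachestore.LuceneCacheLoader">
         <properties>
            <property name="location" value="/path/to/lucene/index"/>
         </properties>
      </loader>
   </loaders>
</namedCache>
```

(The LuceneCacheLoader would be removed from the default cache section entirely, per the advice above.)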
Not sure if I've done exactly what you had in mind... here is my
updated XML:
https://www.refheap.com/paste/12601
I added the loader to the lucene-index namedCache, which is the one
I'm using for distribution.
This didn't appear to change anything, as far as I can see. Still
seeing a lot of disk IO with every request.
James.
On 15 March 2013 15:54, Ray Tsang <saturnism(a)gmail.com> wrote:
> Can you try adding a ClusterCacheLoader to see if that helps?
>
> Thanks,
>
>
> On Fri, Mar 15, 2013 at 8:49 AM, James Aley <james.aley(a)swiftkey.net> wrote:
>>
>> Apologies - forgot to copy list.
>>
>> On 15 March 2013 15:48, James Aley <james.aley(a)swiftkey.net> wrote:
>>> Hey Adrian,
>>>
>>> Thanks for the response. I was chatting to Sanne on IRC yesterday, and
>>> he suggested this to me. Actually the logging I attached was from a
>>> cluster of 4 servers with numOwners=2. Sorry, I should have mentioned
>>> this actually, but I thought seeing as it didn't appear to make any
>>> difference that I'd just keep things simple in my previous email.
>>>
>>> While it seemed not to make a difference in this case... I can see why
>>> that would make sense. In future tests I guess I should probably stick
>>> with numOwners > 1.
>>>
>>>
>>> James.
>>>
>>> On 15 March 2013 15:44, Adrian Nistor <anistor(a)redhat.com> wrote:
>>>> Hi James,
>>>>
>>>> I'm not an expert on InfinispanDirectory but I've noticed in [1] that
>>>> the lucene-index cache is distributed with numOwners = 1. That means
>>>> each cache entry is owned by just one cluster node and there's nowhere
>>>> else to go in the cluster if the key is not available in local memory,
>>>> thus it needs fetching from the cache store. This can be solved with
>>>> numOwners > 1. Please let me know if this solves your problem.
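(To illustrate Adrian's suggestion - schema details from memory, so verify against your Infinispan version - the change is confined to the distribution stanza of the lucene-index cache:)

```xml
<namedCache name="lucene-index">
   <clustering mode="distribution">
      <!-- With two owners per entry, a read that misses locally can be
           served by the other owner over the network instead of falling
           back to the local cache store on disk. -->
      <hash numOwners="2"/>
   </clustering>
</namedCache>
```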
>>>>
>>>> Cheers!
>>>>
>>>>
>>>> On 03/15/2013 05:03 PM, James Aley wrote:
>>>>>
>>>>> Hey all,
>>>>>
>>>>> <OT>
>>>>> Seeing as this is my first post, I wanted to just quickly thank you
>>>>> all for Infinispan. So far I'm really enjoying working with it -
>>>>> great product!
>>>>> </OT>
>>>>>
>>>>> I'm using the InfinispanDirectory for a Lucene project at the moment.
>>>>> We use Lucene directly to build a search product, which has high read
>>>>> requirements and likely very large indexes. I'm hoping to make use of
>>>>> a distribution mode cache to keep the whole index in memory across a
>>>>> cluster of machines (the index will be too big for one server).
>>>>>
>>>>> The problem I'm having is that after loading a filesystem-based Lucene
>>>>> directory into InfinispanDirectory via LuceneCacheLoader, no nodes are
>>>>> retrieving data from the cluster - they instead look up keys in their
>>>>> local CacheLoaders, which involves lots of disk I/O and is very slow.
>>>>> I was hoping to just use the CacheLoader to initialize the caches, but
>>>>> from there on read only from RAM (and network, of course). Is this
>>>>> supported? Maybe I've misunderstood the purpose of the CacheLoader?
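(The "initialize once, then read from RAM" pattern James is asking about is usually expressed with the loader's preload flag. A sketch under the 5.x-era schema - the attribute names and their placement are from memory, so treat this as a hypothesis to verify, not a confirmed answer to his question:)

```xml
<loaders passivation="false" shared="false">
   <!-- preload="true": populate the cache from the store at startup,
        so that subsequent reads should be served from memory (or the
        cluster) rather than hitting the store each time. -->
   <loader class="org.infinispan.lucene.cachestore.LuceneCacheLoader"
           preload="true" fetchPersistentState="false">
      <properties>
         <!-- placeholder path -->
         <property name="location" value="/path/to/lucene/index"/>
      </properties>
   </loader>
</loaders>
```

(Note that preload alone does not stop the loader from being consulted on a cache miss; whether that happens before or instead of a remote fetch is exactly the behaviour under discussion in this thread.)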
>>>>>
>>>>> To explain my observations in a little more detail:
>>>>> * I start a cluster of two servers, using [1] as the cache config.
>>>>> Both have a local copy of the Lucene index that will be loaded into
>>>>> the InfinispanDirectory via the loader. This is a test configuration,
>>>>> where I've set numOwners=1 so that I only need two servers for
>>>>> distribution to happen.
>>>>> * Upon startup, things look good. I see the memory usage of the JVM
>>>>> reflect a pretty near 50/50 split of the data across both servers.
>>>>> Logging indicates both servers are in the cluster view, all seems
>>>>> fine.
>>>>> * When I send a search query to either one of the nodes, I notice the
>>>>> following:
>>>>>   - iotop shows huge (~100MB/s) disk I/O on that node alone from the
>>>>> JVM process.
>>>>>   - no change in network activity between nodes (~300b/s, same as
>>>>> when idle)
>>>>>   - memory usage on the node running the query increases dramatically,
>>>>> and stays higher even after the query is finished.
>>>>>
>>>>> So it seemed to me like each node was favouring use of the CacheLoader
>>>>> to retrieve keys that are not in memory, instead of using the cluster.
>>>>> Does that seem reasonable? Is this the expected behaviour?
>>>>>
>>>>> I started to investigate this by turning on trace logging, and this
>>>>> made me think perhaps the cause was that the CacheLoader's interceptor
>>>>> is higher priority in the chain than the distribution interceptor?
>>>>> I'm not at all familiar with the design at any level of detail - just
>>>>> what I picked up in the last 24 hours from browsing the code, so I
>>>>> could easily be way off. I've attached the log snippets I thought
>>>>> relevant in [2].
>>>>>
>>>>> Any advice offered much appreciated.
>>>>> Thanks!
>>>>>
>>>>> James.
>>>>>
>>>>>
>>>>> [1] https://www.refheap.com/paste/12531
>>>>> [2] https://www.refheap.com/paste/12543
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> infinispan-dev(a)lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>
>>>>