Hi all,
Thanks for the help with this issue. Just to clarify: the situation is now
pretty much resolved (or worked around) for me by using the clusterLoader.
I'll watch the JIRA issue and be sure to try again without a clusterLoader
once that's taken care of.
Best,
James.
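For reference, a minimal sketch of the workaround above in programmatic form:
a cluster loader makes a non-owner ask its peers for a missing key instead of
only consulting a local store. The addClusterCacheLoader() builder method is
assumed from the 5.2-era fluent API - verify it against the version you run.

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

public class ClusterLoaderWorkaround {
   public static Configuration build() {
      ConfigurationBuilder cb = new ConfigurationBuilder();
      // Distributed cache, a single copy of each entry.
      cb.clustering().cacheMode(CacheMode.DIST_SYNC).hash().numOwners(1);
      // Assumed builder method: delegate local misses to cluster peers.
      cb.loaders().addClusterCacheLoader();
      return cb.build();
   }
}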
On 20 March 2013 15:05, Mircea Markus <mmarkus(a)redhat.com> wrote:
FYI I've created a JIRA to track this:
https://issues.jboss.org/browse/ISPN-2950
Whilst quite a performance issue, I don't think this is a critical/consistency
issue for async stores: by using an async store you might lose data (expect
inconsistencies) during a node crash anyway, so all this behaviour does is
widen the inconsistency window.
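To make that tradeoff concrete, here is a sketch of the kind of configuration
under discussion: a store running in write-behind mode. Writes are
acknowledged before they reach disk and flushed by a background thread, so a
crash can lose the queued updates - that is the inconsistency window above.
Builder method names (addFileCacheStore, async) are assumed from the 5.2-era
API, and the store location is hypothetical.

import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

public class AsyncStoreSketch {
   public static Configuration build() {
      ConfigurationBuilder cb = new ConfigurationBuilder();
      cb.loaders()
        .addFileCacheStore()        // assumed file-store builder method
        .location("/tmp/store")     // hypothetical store location
        .async().enable();          // write-behind: background flushing
      return cb.build();
   }
}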
On 19 Mar 2013, at 16:30, Mircea Markus wrote:
>
> On 19 Mar 2013, at 16:15, Dan Berindei wrote:
>
>> Hi Sanne
>>
>> On Tue, Mar 19, 2013 at 4:12 PM, Sanne Grinovero <sanne(a)infinispan.org> wrote:
>> Mircea,
>> what I was most looking forward was to you comment on the interceptor
>> order generated for DIST+cachestores
>> - we don't think the ClusteredCacheLoader should be needed at all
>>
>> Agree, ClusteredCacheLoader should not be necessary.
>>
>> James, if you're still seeing problems with numOwners=1, could you create
>> an issue in JIRA?
>>
>>
>> - each DIST node is loading from the CacheLoader (any) rather than
>> loading from its peer nodes for non-owned entries (!!)
>>
>>
>> Sometimes loading stuff from a local disk is faster than going remote,
>> e.g. if you have numOwners=2 and both owners have to load the same entry
>> from disk and send it to the originator twice.
> The staggering of remote gets should overcome that.
>>
>> Still, most of the time the entry is going to be in memory on the owner
>> nodes, so the local load is slower (especially with a shared cache store,
>> where loading is over the network as well).
> +1
>>
>>
>> This has come up on several threads now and I think it's critically
>> wrong; as I commented previously, this also introduces many
>> inconsistencies - as far as I understand it.
>>
>>
>> Is there a JIRA for this already?
>>
>> Yes, loading a stale entry from the local cache store is definitely not a
>> good thing, but we actually delete the non-owned entries after the initial
>> state transfer. There may be some consistency issues if one uses a
>> DIST_SYNC cache with a shared async cache store, but fully sync
>> configurations should be fine.
>>
>> OTOH, if the cache store is not shared, the chances of finding the entry
>> in the local store on a non-owner are slim to none, so it doesn't make
>> sense to do the lookup.
>>
>> Implementation-wise, just changing the interceptor order is probably not
>> enough. If the key doesn't exist in the cache, the CacheLoaderInterceptor
>> will still try to load it from the cache store after the remote lookup, so
>> we'll need a marker in the invocation context to avoid the extra cache
>> store load.
> If the key doesn't map to the local node it should trigger a remote get to
> the owners (or allow the dist interceptor to do just that).
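A rough illustration of the idea being discussed - check ownership before the
local store is ever consulted, and otherwise fall through so the distribution
interceptor can do the remote get. Class and method names (CommandInterceptor,
DistributionManager.getLocality, the @Inject wiring) follow the 5.2-era
internal SPI, and loadFromStore() is a hypothetical stand-in for the
CacheLoaderInterceptor's loading logic:

import org.infinispan.commands.read.GetKeyValueCommand;
import org.infinispan.context.InvocationContext;
import org.infinispan.distribution.DistributionManager;
import org.infinispan.factories.annotations.Inject;
import org.infinispan.interceptors.base.CommandInterceptor;

public class OwnershipAwareLoaderInterceptor extends CommandInterceptor {
   private DistributionManager dm;

   @Inject
   public void init(DistributionManager dm) {
      this.dm = dm;
   }

   @Override
   public Object visitGetKeyValueCommand(InvocationContext ctx,
                                         GetKeyValueCommand cmd) throws Throwable {
      if (dm != null && !dm.getLocality(cmd.getKey()).isLocal()) {
         // Non-owned key: skip the local store entirely and let the
         // distribution interceptor issue a remote get to the owners.
         return invokeNextInterceptor(ctx, cmd);
      }
      // Owned key: it is safe to consult the local store.
      Object loaded = loadFromStore(cmd.getKey()); // hypothetical helper
      return loaded != null ? loaded : invokeNextInterceptor(ctx, cmd);
   }

   private Object loadFromStore(Object key) {
      // Placeholder for the actual cache store lookup.
      return null;
   }
}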
>> Actually, since this is just a performance issue, it could wait until we
>> implement tombstones everywhere.
> Hmm, not sure I see the correlation between this and tombstones?
>
>>
>> BTW your gist wouldn't work: the metadata cache needs to load certain
>> elements too. But it's good you spotted the need to potentially filter
>> what "preload" means in the scope of each cache, as the metadata one
>> should only preload metadata, while in the original configuration this
>> data would indeed be duplicated.
>> Opened: https://issues.jboss.org/browse/ISPN-2938
>>
>> Sanne
>>
>> On 19 March 2013 11:51, Mircea Markus <mmarkus(a)redhat.com> wrote:
>>>
>>> On 16 Mar 2013, at 01:19, Sanne Grinovero wrote:
>>>
>>>> Hi Adrian,
>>>> let's forget about Lucene details and focus on DIST.
>>>> With numOwners=1 and two nodes, the entries should be stored roughly
>>>> 50% on each node. I see nothing wrong with that, considering you don't
>>>> need data failover in a read-only use case with all of the index
>>>> available in the shared CacheLoader.
>>>>
>>>> In such a scenario, with both nodes having preloaded all the data, for
>>>> a get() operation I would expect either:
>>>> A) to be the owner, hence retrieve the value from a local in-JVM reference;
>>>> B) to not be the owner, so forward the request to the other node;
>>>> with roughly a 50% chance per key of being in case A or B.
>>>>
>>>> But when hitting case B) it seems that instead of loading from the
>>>> other node, it hits the CacheLoader to fetch the value.
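To make the expected behaviour concrete, a minimal two-node sketch (no cache
store wired in; it assumes the default clustered JGroups stack can discover
peers in your environment):

import org.infinispan.Cache;
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class DistReadDemo {
   public static void main(String[] args) {
      ConfigurationBuilder cfg = new ConfigurationBuilder();
      // One owner per key: each entry lives on exactly one of the two nodes.
      cfg.clustering().cacheMode(CacheMode.DIST_SYNC).hash().numOwners(1);

      DefaultCacheManager cm1 = new DefaultCacheManager(
            GlobalConfigurationBuilder.defaultClusteredBuilder().build(), cfg.build());
      DefaultCacheManager cm2 = new DefaultCacheManager(
            GlobalConfigurationBuilder.defaultClusteredBuilder().build(), cfg.build());

      Cache<String, String> c1 = cm1.getCache();
      Cache<String, String> c2 = cm2.getCache();

      c1.put("k", "v");
      // Whichever node owns "k", the other one is in case B and should
      // fetch the value over the network, not from a cache store.
      System.out.println(c2.get("k"));

      cm1.stop();
      cm2.stop();
   }
}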
>>>>
>>>> I had already asked James to verify with 4 nodes and numOwners=2; the
>>>> result is the same, so I suggested he ask here. BTW I think numOwners=1
>>>> is perfectly valid and should work just as well as numOwners=2; the only
>>>> reason I asked him to repeat the test is that we don't have many tests
>>>> on the numOwners=1 case, and I was assuming there might be some (wrong)
>>>> assumptions affecting this.
>>>>
>>>> Note that this is not "just" a critical performance problem: I also
>>>> suspect it could produce inconsistent reads, in two classes of
>>>> problems:
>>>>
>>>> # non-shared CacheStore with stale entries
>>>> For non-owned keys it hits the local CacheStore first, where you might
>>>> expect to find nothing, and so forward the request to the right node.
>>>> But what if this node was an owner in the past? It might have an old
>>>> entry stored locally, which would be returned instead of the correct
>>>> value owned on a different node.
>>>>
>>>> # shared CacheStore using write-behind
>>>> When using an async CacheStore, by definition its content is not
>>>> trustworthy unless you check the owner first for entries in memory.
>>>>
>>>> Both seem critical to me, but the performance impact is really bad too.
>>>>
>>>> I hoped to run some more tests myself but couldn't look at this yet;
>>>> any help from the core team would be appreciated.
>>> I think you have a fair point: reads and writes should be coordinated
>>> through the data's owners, both for performance and (more importantly)
>>> correctness.
>>> Mind creating a JIRA for this?
>>>
>>>>
>>>> @Ray, thanks for mentioning the ClusterCacheLoader. Wasn't there
>>>> someone else with a CacheLoader issue recently who had worked around
>>>> the problem by using a ClusterCacheLoader?
>>>> Do you remember what the scenario was?
>>>>
>>>> Cheers,
>>>> Sanne
>>>>
>>>> On 15 March 2013 15:44, Adrian Nistor <anistor(a)redhat.com> wrote:
>>>>> Hi James,
>>>>>
>>>>> I'm not an expert on InfinispanDirectory, but I've noticed in [1] that
>>>>> the lucene-index cache is distributed with numOwners = 1. That means
>>>>> each cache entry is owned by just one cluster node, and there's nowhere
>>>>> else to go in the cluster if the key is not available in local memory,
>>>>> so it needs fetching from the cache store. This can be solved with
>>>>> numOwners > 1.
>>>>> Please let me know if this solves your problem.
>>>>>
>>>>> Cheers!
>>>>>
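Adrian's suggestion in builder form - a sketch against the 5.2-era API, with
the rest of the lucene-index cache configuration omitted:

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

public class TwoOwnersSketch {
   public static Configuration build() {
      ConfigurationBuilder cb = new ConfigurationBuilder();
      // Two copies of each entry, so a non-primary node can still read
      // the value from a peer's memory rather than the cache store.
      cb.clustering().cacheMode(CacheMode.DIST_SYNC).hash().numOwners(2);
      return cb.build();
   }
}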
>>>>> On 03/15/2013 05:03 PM, James Aley wrote:
>>>>>> Hey all,
>>>>>>
>>>>>> <OT>
>>>>>> Seeing as this is my first post, I wanted to just quickly thank you
>>>>>> all for Infinispan. So far I'm really enjoying working with it - great
>>>>>> product!
>>>>>> </OT>
>>>>>>
>>>>>> I'm using the InfinispanDirectory for a Lucene project at the moment.
>>>>>> We use Lucene directly to build a search product, which has high read
>>>>>> requirements and likely very large indexes. I'm hoping to make use of
>>>>>> a distribution mode cache to keep the whole index in memory across a
>>>>>> cluster of machines (the index will be too big for one server).
>>>>>>
>>>>>> The problem I'm having is that after loading a filesystem-based Lucene
>>>>>> directory into InfinispanDirectory via LuceneCacheLoader, no nodes are
>>>>>> retrieving data from the cluster - they instead look up keys in their
>>>>>> local CacheLoaders, which involves lots of disk I/O and is very slow.
>>>>>> I was hoping to just use the CacheLoader to initialize the caches, but
>>>>>> from there on read only from RAM (and network, of course). Is this
>>>>>> supported? Maybe I've misunderstood the purpose of the CacheLoader?
>>>>>>
>>>>>> To explain my observations in a little more detail:
>>>>>> * I start a cluster of two servers, using [1] as the cache config.
>>>>>> Both have a local copy of the Lucene index that will be loaded into
>>>>>> the InfinispanDirectory via the loader. This is a test configuration,
>>>>>> where I've set numOwners=1 so that I only need two servers for
>>>>>> distribution to happen.
>>>>>> * Upon startup, things look good. I see the memory usage of the JVM
>>>>>> reflect a pretty near 50/50 split of the data across both servers.
>>>>>> Logging indicates both servers are in the cluster view, all seems
>>>>>> fine.
>>>>>> * When I send a search query to either one of the nodes, I notice
>>>>>> the following:
>>>>>>   - iotop shows huge (~100MB/s) disk I/O on that node alone from the
>>>>>> JVM process.
>>>>>>   - no change in network activity between nodes (~300b/s, same as
>>>>>> when idle)
>>>>>>   - memory usage on the node running the query increases dramatically,
>>>>>> and stays higher even after the query is finished.
>>>>>>
>>>>>> So it seemed to me like each node was favouring use of the CacheLoader
>>>>>> to retrieve keys that are not in memory, instead of using the cluster.
>>>>>> Does that seem reasonable? Is this the expected behaviour?
>>>>>>
>>>>>> I started to investigate this by turning on trace logging, and this
>>>>>> made me think perhaps the cause was that the CacheLoader's interceptor
>>>>>> sits higher in the chain than the distribution interceptor?
>>>>>> I'm not at all familiar with the design at any level of detail - just
>>>>>> what I picked up in the last 24 hours from browsing the code, so I
>>>>>> could easily be way off. I've attached the log snippets I thought
>>>>>> relevant in [2].
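One quick way to check that suspicion is to print the interceptor chain and
see where the cache loader interceptor sits relative to the distribution
interceptor; getInterceptorChain() is part of the 5.x AdvancedCache API:

import java.util.List;
import org.infinispan.Cache;
import org.infinispan.interceptors.base.CommandInterceptor;

public final class ChainDump {
   // Prints the chain top to bottom, in invocation order.
   public static void dump(Cache<?, ?> cache) {
      List<CommandInterceptor> chain =
            cache.getAdvancedCache().getInterceptorChain();
      for (CommandInterceptor i : chain) {
         System.out.println(i.getClass().getName());
      }
   }
}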
>>>>>>
>>>>>> Any advice offered much appreciated.
>>>>>> Thanks!
>>>>>>
>>>>>> James.
>>>>>>
>>>>>>
>>>>>> [1] https://www.refheap.com/paste/12531
>>>>>> [2] https://www.refheap.com/paste/12543
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev