On 10/07/2013 11:43 AM, Sanne Grinovero wrote:
Proposal:
let's be more aggressive on validation.
- make preload mandatory (for the metadata only)
- eviction not allowed (for the metadata only)
Note that I don't think we're imposing much of a restriction on users;
we're more likely helping them with good guidance on how these caches
are supposed to work.
I think it's a good thing to narrow down the possible configuration options
and make it clear what we expect people to use for this purpose.
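For illustration, such a cache could be configured along these lines
(just a sketch against the programmatic configuration API; treat the
exact builder methods as my assumption, not a verified snippet):

import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.eviction.EvictionStrategy;

// Sketch: a metadata cache with mandatory preload and no eviction.
ConfigurationBuilder builder = new ConfigurationBuilder();
builder
   .eviction().strategy(EvictionStrategy.NONE)  // eviction not allowed
   .persistence()
      .addSingleFileStore()                     // any store would do here
         .preload(true);                        // preload mandatory
Configuration config = builder.build();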
it's good enough for me. :)
>> ## 3. The CacheLoader loading the same entry multiple times in parallel
>> Kudos for finding out that there are situations in which we deal with
>> multiple different instances of ConcurrentHashSet! Still, I think that
>> Infinispan core is terribly wrong in this case:
>> from the client code's POV a new CHS is created with a put-if-absent
>> atomic operation, and I assume that core will check/load
>> any enabled cachestore as well.
>> To handle multiple GET operations in parallel, or even in parallel
>> with preload or the client's put-if-absent operation, I would *not*
>> expect Infinispan core to storm the CacheStore implementation with
>> multiple load operations on the same key: a lock should be held on the
>> potential key during such a load operation.
>
>
> I didn't understand what you mean here ^^. Infinispan only tries to load
> from a cache store once per operation (get, put, remove, etc...).
>
> However, I think we have a big window in which multiple operations can load
> the same key from the store. This happens because we only write to the data
> container after the operation ends.
I think that's a very important consideration.
Imagine a REST cache built on Infinispan, using a filesystem-based
CacheStore, and we receive a million requests per second
for a key "X1", which doesn't exist.
Since no request is going to find it in the data container, every
single one triggers a CacheStore load.
Do you want to enqueue a million IO operations to the disk?
I'd much rather have some locking in memory, which will not just
prevent the disks from struggling but also allow the CPUs to context
switch to more useful tasks: being blocked on a lock is a good thing
in this case, compared to allowing "parallel processing of pointless
requests". That's just parallel energy burning, and potentially DOSing
yourself at the expense of more interesting requests which could be
handled instead.
To translate it into an implementation design, you need to lock the
"X1" key before attempting to load it from the CacheStore, and when
the lock is acquired the data container needs to be re-checked as it
might have been loaded by a parallel thread.
Alternatively, lock the key before attempting to read from the data
container: that's not too bad, as it's a very brief, local-only lock.
You have a good point here, but IMO this is a high-risk change because
we are getting close to the final release, and I don't want to make the
cache store unusable...
But let's see what others think about it.
My 1-cent pseudo-code suggestion:

lock(key) {
   value = dataContainer.get(key)
   if (value != null) return value
   loaded = cacheLoader.load(key)
   if (loaded == null) return null
   existing = dataContainer.putIfAbsent(key, loaded)
   return existing == null ? loaded : existing
}
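To spell that out in a compilable form, here's a rough sketch using
plain JDK types rather than the real Infinispan internals; the class
and member names (LoadOnceCache, dataContainer, Loader) are made up
for illustration:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LoadOnceCache<K, V> {

   /** Stand-in for the CacheStore/CacheLoader lookup. */
   public interface Loader<K, V> {
      V load(K key);
   }

   // Fixed pool of lock stripes so we don't allocate a lock per key.
   private final Object[] locks = new Object[64];
   private final ConcurrentMap<K, V> dataContainer = new ConcurrentHashMap<K, V>();
   private final Loader<K, V> cacheLoader;

   public LoadOnceCache(Loader<K, V> cacheLoader) {
      this.cacheLoader = cacheLoader;
      for (int i = 0; i < locks.length; i++) locks[i] = new Object();
   }

   public V get(K key) {
      V value = dataContainer.get(key);
      if (value != null) return value; // fast path: no lock, no store IO

      // Serialize store loads per key: one thread hits the store, the
      // others block briefly and then re-check the data container.
      synchronized (lockFor(key)) {
         value = dataContainer.get(key);
         if (value != null) return value; // loaded by a parallel thread
         V loaded = cacheLoader.load(key);
         if (loaded == null) return null; // null results not remembered here
         V existing = dataContainer.putIfAbsent(key, loaded);
         return existing == null ? loaded : existing;
      }
   }

   private Object lockFor(K key) {
      return locks[(key.hashCode() & 0x7fffffff) % locks.length];
   }
}

Note the null case: with this sketch a storm of GETs for a missing key
still serializes on the lock, but every miss re-hits the store;
remembering null results would need some kind of tombstone entry.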
Worth a different thread? Note that I don't know how Infinispan core
implements this today; I'm just understanding from you that it's not
as good as I'd expected.
This might also be why some people have recorded very bad performance
when writing stress tests on the CacheStore, and of course such a
benchmark would highlight too much IO going on... but the problem isn't
how fast we do the IO if we can simply avoid some of it.
>
>
>>
>> If your observation is right, this could also be one of the reasons
>> for so many complaints on the CacheStore performance: under these
>> circumstances - which I'd guesstimate are quite common - we're doing
>> lots of unnecessary IO, potentially stressing the slower storage
>> devices. This could even have dramatic effects if there are frequent
>> requests for entries for which the returned value is null.
>>
>> # Action 3: Investigate and open a JIRA about the missing locking on
>> CacheStore load operations.
>>
>> If this were resolved, we would still have a guarantee of a single
>> instance, am I correct?
>>
>
> Cache stores acquire the read lock to load a key. I'm not sure that
> exclusively locking a key would help with anything.
See above: allowing parallelism on loads is not a good idea as it will
storm the CacheStore with redundant requests.
Sanne