[infinispan-dev] DataContainer performance review

Mon Jul 4 03:19:09 EDT 2011

On Jun 30, 2011, at 12:08 PM, Mircea Markus wrote:

> 
> On 30 Jun 2011, at 01:18, Vladimir Blagojevic wrote:
> 
>> Hey, good news!
>> 
>> I have found that a main culprit of poor DataContainer performance for 
>> large caches (100K+ entries) is in fact the use of the default concurrency 
>> level of 32.
> Does that cause BCHM to create only 32 segments, resulting in lots of contention on concurrent updates?
>> If users are going to use caches with many entries then they should 
>> also increase concurrency level. I found that concurrency of 512 works 
>> fairly well for caches up to a million entries. Also note that if users 
>> are using such large caches (1M+ entries) I do not see the point of 
>> having eviction, they should just use unbounded DataContainer.
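
A rough sketch of why the segment count matters at these sizes. It assumes that BCHM, like the segmented java.util.concurrent.ConcurrentHashMap design, rounds the concurrency level up to a power of two and guards each segment with its own lock. The class and method names are made up for illustration and are not Infinispan API.

```java
public class SegmentMath {

    // Assumption: the segment count is the smallest power of two that is
    // greater than or equal to the requested concurrency level.
    static int segmentCount(int concurrencyLevel) {
        int segments = 1;
        while (segments < concurrencyLevel) {
            segments <<= 1;
        }
        return segments;
    }

    // Average number of entries sharing one segment (and one lock),
    // rounded up, assuming keys hash evenly across segments.
    static long entriesPerSegment(long entries, int concurrencyLevel) {
        int segments = segmentCount(concurrencyLevel);
        return (entries + segments - 1) / segments;
    }

    public static void main(String[] args) {
        long[] cacheSizes = {100_000L, 1_000_000L};
        int[] concurrencyLevels = {32, 512};
        for (long size : cacheSizes) {
            for (int level : concurrencyLevels) {
                System.out.printf("entries=%d concurrencyLevel=%d -> ~%d entries per segment%n",
                        size, level, entriesPerSegment(size, level));
            }
        }
    }
}
```

Under these assumptions, 100K entries at concurrency 32 put roughly 3,125 entries behind each lock, versus about 196 at concurrency 512.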

By the way, I forgot to mention: why isn't there a point in using eviction with 1M+ entries? I still want to keep my memory consumption in check, regardless of the amount of data that I put in the cache.

What is it that you're trying to imply here exactly? That the data container is not performant once the number of max entries goes beyond some limit?

> I'm not sure this is true for all use cases: e.g. 1M Integers occupy approx. 4 MB, and people might want to allocate up to gigabytes to cache data. 
> What I think we can do is suggest (via a log message), based on the DC size, that they increase the concurrencyLevel when needed.

I think we should go further: we should try to be more ergonomic, adapt to the circumstances, and do as much as we can so that the data container is tuned at runtime. And this applies not only to the data container, but also to buffer sizes, etc.

Logging suggestions is poor man's tuning.
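
To make the idea concrete, here is a minimal sketch of deriving a concurrency level from the expected container size instead of just logging a hint. Everything in it is hypothetical: the class, the method name, and especially the target of 2,000 entries per segment, which I picked only because it lands on 512 for a million entries, the value reported above as working fairly well. It is not existing Infinispan code.

```java
public class ConcurrencyAdvisor {

    // Default concurrency level mentioned in this thread.
    static final int MIN_LEVEL = 32;
    // Assumed upper bound, to keep the number of segments sane.
    static final int MAX_LEVEL = 1 << 16;
    // Assumed tuning target; would need to be validated by benchmarks.
    static final long TARGET_ENTRIES_PER_SEGMENT = 2_000L;

    // Returns a power-of-two concurrency level, starting at the default
    // and doubling until each segment holds at most the target number
    // of entries or the upper bound is reached.
    static int suggestConcurrencyLevel(long expectedEntries) {
        long needed = (expectedEntries + TARGET_ENTRIES_PER_SEGMENT - 1)
                / TARGET_ENTRIES_PER_SEGMENT;
        int level = MIN_LEVEL;
        while (level < needed && level < MAX_LEVEL) {
            level <<= 1;
        }
        return level;
    }

    public static void main(String[] args) {
        long[] expectedSizes = {10_000L, 100_000L, 1_000_000L, 10_000_000L};
        for (long size : expectedSizes) {
            System.out.printf("expected entries=%d -> concurrencyLevel=%d%n",
                    size, suggestConcurrencyLevel(size));
        }
    }
}
```

The same calculation could feed either a log message or the container's construction, depending on how far we want to take the runtime tuning.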

>> 
>> I am also looking to chart these for easy review, forum post, and 
>> DataContainer performance tuning wiki. Tomorrow I'll determine impact of 
>> passivation on DataContainer performance.
>> 
>> Cheers,
>> Vladimir
>> 
>> 
>> On 11-06-28 10:20 AM, Vladimir Blagojevic wrote:
>>> On 11-06-28 10:06 AM, Galder Zamarreño wrote:
>>>> Vladimir,
>>>> 
>>>> I think it's better if you run your tests on one of the cluster or perf machines cos that way everyone has access to the same base system and results can be compared, particularly when changes are made. Also, you avoid local apps or CPU usage affecting your test results.
>>>> 
>>>> I agree with Sanne, put ops for LIRS don't look good in comparison with LRU. Did you run some profiling?
>>>> 
>>> Hey,
>>> 
>>> Very likely you are right and it is a better approach but it does not
>>> take much to notice a trend of deteriorating BCHM performance for large
>>> map/cache size. Looking to do some profiling now.
>>> 
>>> Cheers
>>> 
>>> 
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
> 
> 

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache