[infinispan-dev] Lock amortization preliminary performance numbers

Manik Surtani manik at jboss.org
Thu Feb 4 04:53:19 EST 2010


On 3 Feb 2010, at 00:49, Bryan Thompson wrote:

> Vladimir,
>  
> I now have a workload where I have a hotspot on our cache that accounts for 8% of the total time.  I am going to use this to test BCHM under a realistic workload.
>  
> One thing I like about your design is that by placing the array buffering the operations for a thread within the Segment, you are using the lock guarding the Segment to control updates to that array.  While this has less potential throughput than a truly thread-local design (which is non-blocking until the thread-local array is full), it seems that the array buffering the updates cannot "escape" under your design.  Was this intentional?
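> 
> Roughly the shape I have in mind -- just a sketch of the idea, not your actual BCHM code; the class layout and the 64/256 constants are made up for illustration:
> 
>     import java.util.HashMap;
>     import java.util.Iterator;
>     import java.util.LinkedHashSet;
>     import java.util.concurrent.locks.ReentrantLock;
> 
>     final class Segment<K, V> extends ReentrantLock {
>         private static final int BATCH = 64;        // assumed batch size
>         private static final int MAX_ENTRIES = 256; // assumed per-segment capacity
> 
>         private final HashMap<K, V> entries = new HashMap<K, V>();
>         private final LinkedHashSet<K> lruOrder = new LinkedHashSet<K>(); // head = coldest
>         private final Object[] accessBuffer = new Object[BATCH];          // lives inside the Segment
>         private int buffered;
> 
>         V get(K key) {
>             lock();
>             try {
>                 V v = entries.get(key);
>                 if (v != null) recordAccess(key);
>                 return v;
>             } finally {
>                 unlock();
>             }
>         }
> 
>         V put(K key, V value) {
>             lock();
>             try {
>                 V old = entries.put(key, value);
>                 if (old == null) lruOrder.add(key); // new key starts at the hot end
>                 recordAccess(key);
>                 return old;
>             } finally {
>                 unlock();
>             }
>         }
> 
>         // Always called with the Segment lock held, so the buffer cannot escape;
>         // LRU reordering and eviction are only paid for once per batch.
>         @SuppressWarnings("unchecked")
>         private void recordAccess(K key) {
>             accessBuffer[buffered++] = key;
>             if (buffered < BATCH) return;
>             for (int i = 0; i < BATCH; i++) {
>                 K k = (K) accessBuffer[i];
>                 if (lruOrder.remove(k)) lruOrder.add(k); // move to the hot end
>                 accessBuffer[i] = null;
>             }
>             buffered = 0;
>             Iterator<K> coldest = lruOrder.iterator();
>             while (entries.size() > MAX_ENTRIES && coldest.hasNext()) {
>                 entries.remove(coldest.next());
>                 coldest.remove();
>             }
>         }
>     }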
>  
> The danger with a true thread-local design is that a thread can come in and do some work, get some updates buffered in its thread-local array, and then never visit again.  In this case those updates would remain buffered on the thread and would not in fact cause the access order to be updated in a timely manner.  Worse, if you are relying on WeakReference semantics, the buffered updates would remain strongly reachable and the corresponding objects would be wired into the cache.
>  
> I've worked around this issue in a different context where it made sense to scope the buffers to an owning object (a B+Tree instance).  In that case, when the B+Tree container was closed, all buffered updates were discarded.  This nicely eliminated the problems with "escaping" threads.

Right, but in Infinispan's case it is hard to apply such a workaround since oftentimes we have no control over the worker thread (think Tomcat HTTP worker using Infinispan to distribute HTTP session state).

So while I think it is *possible* to use the ThreadLocal approach, it would be complex and involve additional interception of invocations to ensure proper cleanup.  Let's see what works as we test further!

Cheers
Manik

>  
> However, I like your approach better.
>  
> Bryan
>  
> PS: I see a comment on LRU#onEntryHit() indicating that it is invoked without holding the Segment lock.  Is that true only when BCHM#get() is invoked?
>  
> PPS: Also, can you expand on the role of the LRU#accessQueue vs LRU#lruQueue?  Is there anything which corresponds to the total LRU order (after batching updates through the Segment)?
>  
> From: infinispan-dev-bounces at lists.jboss.org [mailto:infinispan-dev-bounces at lists.jboss.org] On Behalf Of Bryan Thompson
> Sent: Wednesday, January 27, 2010 1:00 PM
> To: infinispan -Dev List
> Subject: Re: [infinispan-dev] Lock amortization preliminary performance numbers
> 
> Vladimir,
>  
> I recently replaced the synchronized ring buffer behind a ConcurrentHashMap with weak values with a variant of the ring buffer that buffers updates in thread-local arrays and then batches them through the lock.  For the specific application (a cache of frequently resolved terms in a dictionary), contention went to zero.  We use a very similar design with a ring buffer to hold onto recently touched B+Tree nodes and leaves.  This was a hot spot for concurrent readers, as they needed to synchronize when adding the reference to the ring buffer.  I made a similar change, to use a variant of the ring buffer with thread-local buffers and batching through the lock, and that contention has also disappeared completely.  I have been using a 64 to 128 element per-thread array with barging into the lock at 50% of the buffer capacity.
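> 
> In outline the pattern looks like this -- a simplified sketch, not our production class; the names and the ArrayDeque standing in for the ring buffer are just for illustration:
> 
>     import java.util.ArrayDeque;
>     import java.util.ArrayList;
>     import java.util.concurrent.locks.Lock;
>     import java.util.concurrent.locks.ReentrantLock;
> 
>     final class ThreadLocalBatchingBuffer<T> {
>         private static final int CAPACITY = 128;          // 64-128 element per-thread array
>         private static final int BARGE_AT = CAPACITY / 2; // try the lock at 50% of capacity
> 
>         private final Lock lock = new ReentrantLock();
>         private final ArrayDeque<T> sharedBuffer = new ArrayDeque<T>(); // stand-in for the ring buffer
> 
>         private final ThreadLocal<ArrayList<T>> localBatch = new ThreadLocal<ArrayList<T>>() {
>             @Override protected ArrayList<T> initialValue() {
>                 return new ArrayList<T>(CAPACITY);
>             }
>         };
> 
>         // Record a touch; only contend for the lock once the local batch is half full.
>         void touch(T ref) {
>             ArrayList<T> batch = localBatch.get();
>             batch.add(ref);
>             if (batch.size() >= BARGE_AT && lock.tryLock()) { // barge without blocking
>                 try { drain(batch); } finally { lock.unlock(); }
>             } else if (batch.size() >= CAPACITY) {            // full: must block and flush
>                 lock.lock();
>                 try { drain(batch); } finally { lock.unlock(); }
>             }
>         }
> 
>         private void drain(ArrayList<T> batch) { // caller holds the lock
>             sharedBuffer.addAll(batch);
>             batch.clear();
>         }
>     }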
>  
> So, overall, this approach is working out very nicely for us in combination with WeakReferences.  One drawback of the approach is that hard references remain in the thread-local arrays until they are batched through the lock.  Depending on how threads are managed, it is possible for hard references to remain in the thread-local queues long past their last touch.  For this reason we do not use this technique when the cache contains expensive resources where we need a better guarantee of timely discard once the resource is no longer in use by the application.  It might be possible to address this concern with a protocol to attach and detach the thread-local buffers from threads on their way into / out of a thread pool.
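> 
> For example, building on the sketch above, the pool could invoke something like the following (hypothetical) detach() hook when a worker thread is parked or retired:
> 
>         // Hypothetical hook for a thread pool to call before parking a worker.
>         void detach() {
>             ArrayList<T> batch = localBatch.get();
>             if (!batch.isEmpty()) {
>                 lock.lock();
>                 try { drain(batch); } finally { lock.unlock(); }
>             }
>             localBatch.remove(); // drop the array so no hard references stay reachable
>         }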
>  
> I have not yet tried to use this approach in our main data record cache (caching data records read from disk in memory).  We still have a hot spot there, but I am quite certain that we will be able to address it using similar techniques.  I will try out the Infinispan BufferedConcurrentHashMap in that context soon.
>  
> Thanks,
>  
> Bryan
>  
> 
> From: infinispan-dev-bounces at lists.jboss.org [mailto:infinispan-dev-bounces at lists.jboss.org] On Behalf Of Vladimir Blagojevic
> Sent: Wednesday, January 27, 2010 10:30 AM
> To: infinispan -Dev List
> Subject: [infinispan-dev] Lock amortization preliminary performance numbers
> 
> Hi all,
> 
> As you probably recall, I am working on a LIRS eviction algorithm.  But before we get there, Manik and I agreed that I should push in the direction of implementing a ConcurrentHashMap (CHM) variant that does lock amortization and eviction per segment of the CHM.  So far I have implemented LRU eviction, so we can verify the feasibility of this approach rather than the eviction algorithm itself.  We are hoping that eviction algorithm precision will not suffer if we do eviction per segment, and you are all familiar with the lock striping benefits of segments in CHM.
> 
> So, I've cobbled together a first test to compare the eviction and lock amortization enabled CHM (BCHM) with a regular CHM and a synchronized HashMap.  The test is based on an already existing test [1] used to measure the performance of various DataContainers.  I've changed it slightly to measure the Map directly instead of the DataContainer.  The test launches a cohort of 48 reader, 4 writer and 1 remover threads.  All operations are randomized; readers execute map.get(R.nextInt(NUM_KEYS)), writers map.put(R.nextInt(NUM_KEYS), "value"), and removers map.remove(R.nextInt(NUM_KEYS)).  NUM_KEYS was set to 10K.  Each thread does 1K ops in a loop.
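> 
> Condensed, the harness does something like this (a sketch, not the actual test code from [1]; it prints per-thread throughput rather than the per-operation averages reported below):
> 
>     import java.util.Map;
>     import java.util.Random;
>     import java.util.concurrent.*;
> 
>     public class MapStressSketch {
>         static final int NUM_KEYS = 10000, OPS_PER_THREAD = 1000;
> 
>         public static void main(String[] args) throws Exception {
>             final Map<Integer, String> map = new ConcurrentHashMap<Integer, String>();
>             final CountDownLatch start = new CountDownLatch(1);
>             ExecutorService pool = Executors.newFixedThreadPool(53);
>             for (int i = 0; i < 53; i++) { // 48 readers, 4 writers, 1 remover
>                 final int id = i;
>                 pool.submit(new Runnable() {
>                     public void run() {
>                         Random r = new Random();
>                         try { start.await(); } catch (InterruptedException e) { return; }
>                         long begin = System.nanoTime();
>                         for (int op = 0; op < OPS_PER_THREAD; op++) {
>                             int key = r.nextInt(NUM_KEYS);
>                             if (id < 48)      map.get(key);
>                             else if (id < 52) map.put(key, "value");
>                             else              map.remove(key);
>                         }
>                         long ms = Math.max(1, (System.nanoTime() - begin) / 1000000);
>                         System.out.println("thread " + id + ": " + (OPS_PER_THREAD / ms) + " ops/ms");
>                     }
>                 });
>             }
>             start.countDown();
>             pool.shutdown();
>             pool.awaitTermination(1, TimeUnit.MINUTES);
>             System.out.println("Size = " + map.size());
>         }
>     }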
> 
> The initial capacity for CHM and HashMap was set to 1K; the max capacity for the eviction and lock amortization enabled CHM (BCHM) was set to 256.  BCHM therefore has to do a lot of evictions, which is evident in the final map sizes listed below.
> 
> Size = 9999
> Performance for container ConcurrentHashMap
> Average get ops/ms 338
> Average put ops/ms 87
> Average remove ops/ms 171
> 
> Size = 193
> Performance for container BufferedConcurrentHashMap
> Average get ops/ms 322
> Average put ops/ms 32
> Average remove ops/ms 74
> 
> Size = 8340
> Performance for container SynchronizedMap
> Average get ops/ms 67
> Average put ops/ms 45
> Average remove ops/ms 63
> 
> If I remove lock amortization from BufferedConcurrentHashMap, that is, if we attempt eviction on every get/put/remove, put/remove performance for BCHM basically goes to zero!  As far as I can interpret these results, BCHM get performance does not suffer at all in comparison with CHM, as it does for the single-lock HashMap.  Predictably, for the single-lock HashMap each type of operation takes on average almost the same amount of time.  We pay a hit in BCHM put/remove operations in comparison with CHM, but the numbers are promising, I think.  If you have comments, or suggestions on how to test these map variants in a different, possibly more insightful way, speak up!
> 
> Cheers,
> Vladimir
> 
> 
> [1] http://fisheye.jboss.org/browse/~raw,r=1264/Infinispan/trunk/core/src/test/java/org/infinispan/stress/DataContainerStressTest.java
> 

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org



