On 8 Jan 2010, at 12:35, Bryan Thompson wrote:
Vladimir,
Reading the paper, it appears that my suggestion of segments is what the authors call
"distributed locks", and they suggest:
> For example, the algorithms that need to detect sequence of accesses cannot retain
their performance advantages when pages in the same sequence have been distributed into
multiple partitions and cannot be identified as a sequence.
Overall, I like the approach outlined in the paper. It takes the concept of batching
further than what we did, but in much the same spirit. In our implementation, we only
batched evictions. They are also batching the updates to the replacement-policy ordering,
which is still a bottleneck for us.
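To make that concrete, the sort of batched ordering update I have in mind looks roughly like
this in Java (a minimal sketch only -- the class name, the batch size, and the access-ordered
LinkedHashMap standing in for the real replacement-policy structure are all made up):

    import java.util.ArrayDeque;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.concurrent.locks.ReentrantLock;

    final class BatchingAccessRecorder<K> {
        private static final int BATCH_SIZE = 64;        // flush threshold (arbitrary)
        // per-thread buffer of accessed keys, so threads do not touch the shared order on every hit
        private final ThreadLocal<ArrayDeque<K>> buffer = new ThreadLocal<ArrayDeque<K>>() {
            @Override protected ArrayDeque<K> initialValue() { return new ArrayDeque<K>(); }
        };
        private final ReentrantLock orderLock = new ReentrantLock();
        // access-ordered LinkedHashMap stands in for the real LRU/LIRS ordering structure
        private final Map<K, Boolean> order = new LinkedHashMap<K, Boolean>(16, 0.75f, true);

        void recordAccess(K key) {
            ArrayDeque<K> local = buffer.get();
            local.add(key);
            if (local.size() >= BATCH_SIZE) {
                flush(local);                             // one lock acquisition per batch of accesses
            }
        }

        private void flush(ArrayDeque<K> local) {
            orderLock.lock();
            try {
                for (K k; (k = local.poll()) != null; ) {
                    order.put(k, Boolean.TRUE);           // refresh the key's position in the ordering
                }
            } finally {
                orderLock.unlock();
            }
        }
    }

The point is that the shared lock is taken once per BATCH_SIZE accesses per thread rather
than on every hit.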
Concerning the FIFO queue, the paper (page 372) considers both per-thread queues and
global queues. They dismiss global queues for batching updates primarily because certain
cache replacement policies need to detect sequences of operations. Can such sequences of
operations be meaningfully detected within Infinispan? Is there any thread-affinity for a
transaction?
No there isn't - although there could be, as a side effect of certain use cases and
access patterns, but this should not be expected.
If not, then segmented queues might be an alternative for
Infinispan.
Also, for Java, if the hash map uses weak values, then any entries on the FIFO queues will
remain pinned in the hash map even if they would otherwise have been evicted by the cache
policy, which is a nice touch. That way you can continue to access an object that is still
reachable in the JVM, even if the cache replacement policy would have discarded the cache
entry.
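Roughly what I mean, sketched in Java (the class and the FIFO capacity are invented for
illustration; a real version would need better bounding and concurrency control):

    import java.lang.ref.WeakReference;
    import java.util.ArrayDeque;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    final class WeakValueCache<K, V> {
        private static final int FIFO_CAPACITY = 1024;   // arbitrary pin budget
        // the map itself only holds weak references, so the GC decides actual retention
        private final ConcurrentMap<K, WeakReference<V>> map =
                new ConcurrentHashMap<K, WeakReference<V>>();
        // the FIFO holds strong references, so anything sitting on it stays reachable
        // (and therefore stays resolvable through the map) until it falls off the queue
        private final Queue<V> recentFifo = new ArrayDeque<V>();

        V get(K key) {
            WeakReference<V> ref = map.get(key);
            V value = (ref == null) ? null : ref.get();
            if (value != null) {
                synchronized (recentFifo) {
                    recentFifo.add(value);                // pin while it is on the queue
                    if (recentFifo.size() > FIFO_CAPACITY) {
                        recentFifo.poll();                // unpin the oldest; GC may now clear its weak ref
                    }
                }
            }
            return value;
        }

        void put(K key, V value) {
            map.put(key, new WeakReference<V>(value));
        }
    }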
The experiment in the paper compared several cache replacement policies, but the data
were all hot in cache and the cache was large enough to avoid all cache misses, so I am
not clear why they bothered to compare different cache replacement policies -- none of
them did any evictions for their tests. While "prefetching" did reduce the lock
contention slightly, it had no meaningful effect on either throughput or response time.
The authors posit that it might have more effect on machines with many more cores,
depending on the CPU architecture. It seems easy enough to ignore the prefetch
dimension for now and just concentrate on batching.
+1
So, I think that the batching approach per this paper makes sense and could easily be
combined with LRU or LIRS, since the cache replacement policy is encapsulated (which might
not be true with prefetch). Overall it seems like a good idea and an improvement on what
we were already doing. For our application, I think that we would use a hash map with
weak values (per above) to gain the benefit of the actual retention time of the object by
the JVM. I would also like to explore the use of segmented queues as well as per-thread
queues, to control the number of queues in the system separately from the number of
threads. By batching updates to the access order, you are also effectively batching
evictions, since they are driven by updates. When combined with a weak-value cache, the
thread running the updates should also clear entries from the cache that appear on the
weak reference queue, thereby batching the clearing of stale cache entries (those with
cleared references) at the same time it is batching updates to the access order and
inserts to the cache.
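A rough sketch of how that combined maintenance could look, again with invented names (the
KeyedRef subclass is just one way to map a cleared weak reference back to its key):

    import java.lang.ref.Reference;
    import java.lang.ref.ReferenceQueue;
    import java.lang.ref.WeakReference;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    final class BatchedMaintenance<K, V> {
        // weak reference that remembers its key, so a cleared reference can be mapped
        // back to the map entry that should be removed
        static final class KeyedRef<K, V> extends WeakReference<V> {
            final K key;
            KeyedRef(K key, V value, ReferenceQueue<V> q) {
                super(value, q);
                this.key = key;
            }
        }

        private final ConcurrentMap<K, KeyedRef<K, V>> map =
                new ConcurrentHashMap<K, KeyedRef<K, V>>();
        private final ReferenceQueue<V> cleared = new ReferenceQueue<V>();

        void put(K key, V value) {
            map.put(key, new KeyedRef<K, V>(key, value, cleared));
        }

        // called by the thread that is already applying a batch of access-order updates,
        // so stale entries are purged in the same pass under the same lock
        void drainClearedEntries() {
            Reference<? extends V> ref;
            while ((ref = cleared.poll()) != null) {
                @SuppressWarnings("unchecked")
                KeyedRef<K, V> keyed = (KeyedRef<K, V>) ref;
                map.remove(keyed.key, keyed);             // remove only if this exact reference is still mapped
            }
        }
    }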
Bryan
From: infinispan-dev-bounces@lists.jboss.org
[mailto:infinispan-dev-bounces@lists.jboss.org] On Behalf Of Vladimir Blagojevic
Sent: Friday, January 08, 2010 4:42 AM
To: infinispan-dev List
Subject: Re: [infinispan-dev] Implementing LIRS eviction policy [ISPN-299]
Hey Bryan,
On 2010-01-08, at 2:09 AM, Bryan Thompson wrote:
> Vladimir,
>
> I have used a similar approach to "batching" evictions with good success.
It reduced contention for the lock used to handle eviction by about twenty percent.
How different, if at all, was your batching implementation from the one I referred to in
my previous mail?
>
> For this cache, which was not realized using infinispan, I did NO eviction until the
cache reached a target percentage of the JVM heap (this kept the cache hot). For our
case, this was easy as the objects entered in the cache were database records coded as
byte[]s, so we had a good measure of their memory footprint. Once adding another record to
the cache would push the cache size over the target memory limit, that thread would gain a
lock and then evict a batch of records (in an LRU fashion) until the cache size had been
driven down by a threshold (e.g., 5% of the target memory was recovered).
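Just to check that I follow, in rough Java that loop would be something along these lines?
(My own sketch, not your actual code -- the access-ordered map and the constants are only
illustrative.)

    import java.util.Iterator;
    import java.util.LinkedHashMap;
    import java.util.Map;

    final class MemoryBoundedCache<K> {
        private final long targetBytes;                   // stop adding once this budget is reached
        private final long recoverBytes;                  // evict until this much has been freed (e.g. 5%)
        private long currentBytes;
        // access-ordered map gives an approximate LRU iteration order, oldest first
        private final LinkedHashMap<K, byte[]> records =
                new LinkedHashMap<K, byte[]>(16, 0.75f, true);

        MemoryBoundedCache(long targetBytes) {
            this.targetBytes = targetBytes;
            this.recoverBytes = targetBytes / 20;         // 5% of the target
        }

        synchronized void put(K key, byte[] record) {
            if (currentBytes + record.length > targetBytes) {
                evictBatch();                             // this thread takes the hit for the whole batch
            }
            byte[] old = records.put(key, record);
            currentBytes += record.length - (old == null ? 0 : old.length);
        }

        // evict the least recently used records until the recovery threshold is met
        private void evictBatch() {
            long freed = 0;
            Iterator<Map.Entry<K, byte[]>> it = records.entrySet().iterator();
            while (freed < recoverBytes && it.hasNext()) {
                Map.Entry<K, byte[]> oldest = it.next();
                freed += oldest.getValue().length;
                it.remove();
            }
            currentBytes -= freed;
        }
    }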
>
> Let me suggest an alternative approach. Neither LIRS nor LRU need to be strong
orderings. It is enough that a cache eviction policy be "approximately" LRU (or
LIRS). As long as the least desirable entries wind up getting evicted, it does not matter
what the precise eviction order is.
Can you clarify "need to be strong orderings"? Aha, you are referring to
eviction order?
>
> If you accept "approximate" eviction orders, then striped locks could be
used to reduce thread contention. Evictions could be driven on each segment of the
cache independently. If we hash the cache entry key to choose the segment, then each
cache entry will only appear in one segment and all segments should be driven more or less
equally by the application workload (assuming a decent hash function). So using 16
segments / locks, you have 1/16th the contention.
>
> Thoughts?
So what you are basically saying is that we implement LIRS and LRU per Segment of
ConcurrentHashMap, is that right? Interesting idea!
Does your proposal still contain batching? If so, how would you fit it in this new
design? You would have to keep a batched record of cache entry accesses in a queue per
segment?
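Something like the following is what I am picturing -- only a sketch to make the question
concrete; none of these classes exist in Infinispan, and the segment count, batch size, and
per-segment budget are arbitrary:

    import java.util.Iterator;
    import java.util.LinkedHashMap;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import java.util.concurrent.ConcurrentMap;

    final class SegmentedBatchingCache<K, V> {
        private static final int SEGMENTS = 16;           // 1/16th of the contention
        private static final int BATCH_SIZE = 64;
        private static final int MAX_PER_SEGMENT = 1024;

        static final class Segment<K, V> {
            final ConcurrentMap<K, V> entries = new ConcurrentHashMap<K, V>();
            final Queue<K> accessBatch = new ConcurrentLinkedQueue<K>();
            // access-ordered map stands in for the per-segment LRU/LIRS structure
            final LinkedHashMap<K, Boolean> order = new LinkedHashMap<K, Boolean>(16, 0.75f, true);
        }

        @SuppressWarnings("unchecked")
        private final Segment<K, V>[] segments = new Segment[SEGMENTS];

        SegmentedBatchingCache() {
            for (int i = 0; i < SEGMENTS; i++) segments[i] = new Segment<K, V>();
        }

        private Segment<K, V> segmentFor(K key) {
            return segments[(key.hashCode() & 0x7fffffff) % SEGMENTS];
        }

        V get(K key) {
            Segment<K, V> seg = segmentFor(key);
            V value = seg.entries.get(key);               // lock-free read
            if (value != null) {
                seg.accessBatch.add(key);                 // record the access, do not touch the order yet
                // size() on ConcurrentLinkedQueue is O(n); a counter would be better in real code
                if (seg.accessBatch.size() >= BATCH_SIZE) {
                    flush(seg);                           // one lock acquisition per batch
                }
            }
            return value;
        }

        void put(K key, V value) {
            Segment<K, V> seg = segmentFor(key);
            seg.entries.put(key, value);
            seg.accessBatch.add(key);
        }

        private void flush(Segment<K, V> seg) {
            synchronized (seg) {
                for (K k; (k = seg.accessBatch.poll()) != null; ) {
                    seg.order.put(k, Boolean.TRUE);       // refresh position in the replacement order
                }
                // evict the oldest keys if the segment has grown beyond its budget
                Iterator<K> oldest = seg.order.keySet().iterator();
                while (seg.entries.size() > MAX_PER_SEGMENT && oldest.hasNext()) {
                    K victim = oldest.next();
                    oldest.remove();
                    seg.entries.remove(victim);
                }
            }
        }
    }

The idea being that a reader only takes its segment's lock once per batch, and never blocks
readers of other segments.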
Regards,
Vladimir
_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev
--
Manik Surtani
manik@jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org