[infinispan-dev] L1 Data Container

Mon Jul 8 09:07:08 EDT 2013

On 19 Jun 2013, at 17:28, William Burns <mudokonman at gmail.com> wrote:

> 
> On Wed, Jun 19, 2013 at 10:19 AM, Sanne Grinovero <sanne at infinispan.org> wrote:
> On 19 June 2013 13:44, William Burns <mudokonman at gmail.com> wrote:
> > All the L1 data for a DIST cache is stored in the same data container as the
> > actual distributed data itself.  I wanted to propose breaking this out so
> > there is a separate data container for the L1 cache as compared to the
> > distributed data.
> >
> > I thought of a few quick benefits/drawbacks:
> >
> > Benefits:
> > 1. L1 cache can be separately tuned - L1 maxEntries for example
> 
> -1!
> I don't think that's a benefit actually, from the point of view of a user:
> as a user I only know I have a certain amount of memory available on
> each node, and the application is going to use certain data way more
> often than others.
> The eviction strategy should be in a position to make an
> optimal choice about which entries - among all - are better kept in
> memory vs. passivated.
> I don't see a specific reason to "favour" keeping in memory owned
> entries over an L1 entry: the L1 entry might be very hot, and the
> owned entry might be almost-never read.
> Considering that even serving a Get operation to another node (as
> owners of the entry) makes the entry less likely to be passivated (it
> counts as a "hit"), the current design naturally provides an optimal
> balance for memory usage.
> 
> On the other side, I don't see how - as a user - I could optimally
> tune a separate container.
>
> I agree that it is more difficult to configure; this was one of my points as both a drawback and a benefit. It sounds like in general you don't believe the benefits outweigh the drawbacks then.
> 
> > 2. L1 values will not cause eviction of real data
> 
> >-1
> >That's not a benefit, as I explained above. "Real Data" is not less
> >important, especially if it's never used.
> >Granted, I'm making some assumptions about the application having some
> >hot data and some less hot data, and not being able to take advantage
> >of node pinning or affinity strategies, but that is another way to
> >say that I'm assuming the user needs L1: if it was possible to apply
> >these more advanced strategies I'd disable L1 altogether.
> With regard to "real data" versus L1: I don't see how having the containers separated would be an issue for very hot data either way. In either case the hottest data would be promoted in the LIRS of whichever container holds it. The only case where a shared container would behave differently is if you had more hot L1 entries than your configured L1 maxEntries, so that the shared data container could have removed some of the "real data" to make room for them, but that seems unlikely unless you sized the L1 cache very small.
That requires the user to tune and size the L1 cache vs DataContainer. As Sanne mentioned, the current solution provides a nice balance between L1 and DC.
> But either way L1 always has a lifespan, so even if it is the hottest data ever, it will be removed at the end of that lifespan no matter what.
I think increasing the lifespan should mitigate that.
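
To make the shared vs. separate container trade-off discussed above concrete, here is a minimal sketch. It is not Infinispan code: LruContainer is a hypothetical stand-in built on java.util.LinkedHashMap, it uses plain LRU rather than LIRS, it ignores L1 lifespan, and the sizes (4 shared, 2 + 2 split) and key names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class L1ContainerSketch {

    /** Hypothetical bounded container: plain LRU, a stand-in for maxEntries eviction. */
    static final class LruContainer extends LinkedHashMap<String, String> {
        private final int maxEntries;

        LruContainer(int maxEntries) {
            super(16, 0.75f, true); // access order, so reads count as "hits"
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
            return size() > maxEntries; // evict the least recently used entry
        }
    }

    /** Counts how many of the four hot L1 keys are still held in memory. */
    static int countL1(Map<String, String> container) {
        int kept = 0;
        for (int i = 1; i <= 4; i++) {
            if (container.containsKey("l1-" + i)) {
                kept++;
            }
        }
        return kept;
    }

    /** One container of 4 entries shared by owned and L1 data. */
    static int hotL1KeptShared() {
        LruContainer shared = new LruContainer(4);
        shared.put("owned-1", "v"); // cold owned entries, never read again
        shared.put("owned-2", "v");
        for (int i = 1; i <= 4; i++) {
            shared.put("l1-" + i, "v"); // hot L1 entries arrive later
        }
        return countL1(shared);
    }

    /** Same total budget, split into 2 owned entries plus 2 L1 entries. */
    static int hotL1KeptSplit() {
        LruContainer owned = new LruContainer(2);
        LruContainer l1 = new LruContainer(2);
        owned.put("owned-1", "v");
        owned.put("owned-2", "v");
        for (int i = 1; i <= 4; i++) {
            l1.put("l1-" + i, "v"); // L1 entries can only evict other L1 entries
        }
        return countL1(l1);
    }

    public static void main(String[] args) {
        System.out.println("shared container keeps hot L1 entries: " + hotL1KeptShared());
        System.out.println("split containers keep hot L1 entries:  " + hotL1KeptSplit());
    }
}
```

With these made-up sizes, the shared container keeps all four hot L1 entries by evicting the two cold owned ones, while the split layout keeps the cold owned entries and only two of the hot L1 entries. This is the case William describes, where there are more hot L1 entries than the L1 maxEntries allows.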

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)