On 19 Jun 2013, at 17:28, William Burns <mudokonman(a)gmail.com> wrote:
On Wed, Jun 19, 2013 at 10:19 AM, Sanne Grinovero <sanne(a)infinispan.org> wrote:
On 19 June 2013 13:44, William Burns <mudokonman(a)gmail.com> wrote:
> All the L1 data for a DIST cache is stored in the same data container as the
> actual distributed data itself. I wanted to propose breaking this out so
> there is a separate data container for the L1 cache as compared to the
> distributed data.
>
> I thought of a few quick benefits/drawbacks:
>
> Benefits:
> 1. L1 cache can be separately tuned - L1 maxEntries for example
-1!
I don't think that's a benefit actually, from the point of view of a user:
as a user I only know I have a certain amount of memory available on
each node, and the application is going to use certain data way more
often than others.
The eviction strategy should be put in condition to be able to make an
optimal choice about which entries - among all - are better kept in
memory vs. passivated.
I don't see a specific reason to "favour" keeping in memory owned
entries over an L1 entry: the L1 entry might be very hot, and the
owned entry might be almost-never read.
Considering that even serving a Get operation to another node (as
owners of the entry) makes the entry less likely to be passivated (it
counts as a "hit"), the current design naturally provides an optimal
balance for memory usage.
On the other hand, I don't see how - as a user - I could optimally
tune a separate container.
I agree that it is more difficult to configure; this was one of my points, as both
a drawback and a benefit. It sounds like, in general, you don't believe the benefits
outweigh the drawbacks then.
> 2. L1 values will not cause eviction of real data
>-1
>That's not a benefit, as I explained above. "Real Data" is not less
>important, especially if it's never used.
>Granted I'm making some assumptions about the application having some
>hot-data and some less hot data, and not being able to take advantage
>of node pinning or affinity strategies... but that is another way to
>say that I'm assuming the user needs L1: if it was possible to apply
>these more advanced strategies I'd disable L1 altogether.
In regards to "real data" versus L1, I don't see how having the containers
separated would be an issue for very hot data either way. In either case the hottest
data would be promoted in the LIRS in either cache. The only way having the containers
together would be different is if you had more hot L1 entries than your currently sized
L1 maxEntries, so that the data container could have removed some of the "real data"
to make room, but that seems unlikely unless you sized the L1 cache very small.
That requires the user to tune and size the L1 cache vs. the DataContainer. As Sanne
mentioned, the current solution provides a nice balance between L1 and DC.
But either way L1 always has a lifespan, so even if it is the
hottest data ever it will be removed at the end of that lifespan no matter what.
I think increasing the lifespan should mitigate that.
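
To make the trade-off discussed in this thread concrete, here is a minimal sketch
comparing one shared container against separate owned/L1 containers with the same
total capacity. It is only an illustration under stated assumptions: it uses a plain
LRU map as a stand-in for LIRS, it is not Infinispan's DataContainer, and every class,
method and key name in it is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class L1ContainerSketch {

    /** Bounded container with plain LRU eviction (a stand-in for LIRS; hypothetical). */
    static final class LruContainer extends LinkedHashMap<String, String> {
        private final int maxEntries;

        LruContainer(int maxEntries) {
            super(16, 0.75f, true); // accessOrder = true: a successful get() refreshes recency
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
            return size() > maxEntries; // evict the least recently used entry when over capacity
        }
    }

    /** Reads a key; on a miss, simulates the remote fetch and stores the value. Returns true on a hit. */
    static boolean read(LruContainer container, String key) {
        if (container.get(key) != null) {
            return true;
        }
        container.put(key, "value-of-" + key);
        return false;
    }

    /** Counts L1 misses when two hot remote keys are read alternately for the given number of rounds. */
    static int l1Misses(LruContainer ownedContainer, LruContainer l1Container, int rounds) {
        // Three owned entries are written once and never read again (cold "real data").
        for (String key : new String[] {"owned-1", "owned-2", "owned-3"}) {
            ownedContainer.put(key, "value-of-" + key);
        }
        int misses = 0;
        for (int i = 0; i < rounds; i++) {
            for (String key : new String[] {"remote-A", "remote-B"}) {
                if (!read(l1Container, key)) {
                    misses++;
                }
            }
        }
        return misses;
    }

    /** One container of 4 entries holding both owned and L1 data. */
    static int sharedMisses(int rounds) {
        LruContainer shared = new LruContainer(4);
        return l1Misses(shared, shared, rounds);
    }

    /** Same total capacity, split into 3 owned entries and 1 L1 entry. */
    static int splitMisses(int rounds) {
        return l1Misses(new LruContainer(3), new LruContainer(1), rounds);
    }

    public static void main(String[] args) {
        System.out.println("shared container L1 misses: " + sharedMisses(10));
        System.out.println("split containers L1 misses: " + splitMisses(10));
    }
}
```

In this toy workload the shared container drops one cold owned entry and then serves
both hot remote keys from memory, while the split setup with a one-entry L1 refetches
on every read. Sizing the L1 container at two entries would remove the difference,
which is the "unless you sized the L1 cache very small" case; the sketch ignores
lifespan expiry entirely.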
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)