<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Jun 19, 2013 at 10:19 AM, Sanne Grinovero <span dir="ltr">&lt;<a href="mailto:sanne@infinispan.org" target="_blank">sanne@infinispan.org</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div>On 19 June 2013 13:44, William Burns &lt;<a href="mailto:mudokonman@gmail.com" target="_blank">mudokonman@gmail.com</a>&gt; wrote:<br>


&gt; All the L1 data for a DIST cache is stored in the same data container as the<br>

&gt; actual distributed data itself.  I wanted to propose breaking this out so<br>

&gt; there is a separate data container for the L1 cache as compared to the<br>

&gt; distributed data.<br>

&gt;<br>

&gt; I thought of a few quick benefits/drawbacks:<br>

&gt;<br>

&gt; Benefits:<br>

&gt; 1. L1 cache can be separately tuned - L1 maxEntries for example<br>

<br>

</div>-1!<br>

  I don&#39;t think that&#39;s a benefit actually, from the point of view of a user:<br>

as a user I only know I have a certain amount of memory available on<br>

each node, and the application is going to use certain data way more<br>

often than others.<br>

The eviction strategy should be put in condition to be able to make an<br>

optimal choice about which entries - among all - are better kept in<br>

memory vs. passivated.<br>

I don&#39;t see a specific reason to &quot;favour&quot; keeping in memory owned<br>

entries over an L1 entry: the L1 entry might be very hot, and the<br>

owned entry might be almost-never read.<br>

Considering that even serving a Get operation to another node (as<br>

owners of the entry) makes the entry less likely to be passivated (it<br>

counts as a &quot;hit&quot;), the current design naturally provides an optimal<br>

balance for memory usage.<br>

<br>

On the other hand, I don&#39;t see how - as a user - I could optimally<br>

tune a separate container.<br></blockquote><div style>I agree that it is more difficult to configure; I listed this as both a drawback and a benefit.  It sounds like, in general, you don&#39;t believe the benefits outweigh the drawbacks.</div>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div><br>

&gt; 2. L1 values will not cause eviction of real data<br>

<br>

</div>-1<br>

That&#39;s not a benefit, as I explained above. &quot;Real Data&quot; is not less<br>

important, especially if it&#39;s never used.<br>

Granted I&#39;m making some assumptions about the application having some<br>

hot-data and some less hot data, and not being able to take advantage<br>

of node pinning or affinity strategies.. but that is another way to<br>

say that I&#39;m assuming the user needs L1: if it was possible to apply<br>

these more advanced strategies I&#39;d disable L1 altogether.<br></blockquote><div style>Regarding &quot;real data&quot; versus L1: I don&#39;t see how having the containers separated would be an issue for very hot data.  In either case the hottest data would be promoted by LIRS in whichever container holds it.  The only case where keeping the containers together would behave differently is if you had more hot L1 entries than the configured L1 maxEntries, where a shared data container could have removed some of the &quot;real data&quot; to make room; that seems unlikely unless you sized the L1 cache very small.  Either way, L1 entries always have a lifespan, so even the hottest L1 entry will be removed at the end of that lifespan no matter what.</div>

<div style><br></div><div style>To me an L1 cache value is of lower priority than real data, especially given that we can&#39;t even guarantee how long the L1 value will be around.  I guess that assumption may be incorrect, though.</div>
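<div style><br></div><div style>As a rough illustration of the separate-container idea (not Infinispan code), the sketch below keeps owned data and L1 data in two independently bounded maps.  Every name in it is hypothetical, plain LRU stands in for LIRS, and time is passed in explicitly so the lifespan behaviour is easy to follow.</div>

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: one node with separate containers for owned and L1 data. */
class SeparateContainersSketch {

    /** Size-bounded map with LRU eviction (a simple stand-in for LIRS). */
    static final class BoundedContainer<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        BoundedContainer(int maxEntries) {
            super(16, 0.75f, true); // accessOrder = true gives LRU ordering
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries; // evict once the bound is exceeded
        }
    }

    /** An L1 value is always mortal: it carries an absolute expiry time. */
    static final class MortalValue {
        final String value;
        final long expiresAtMillis;

        MortalValue(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final BoundedContainer<String, String> owned;
    private final BoundedContainer<String, MortalValue> l1;
    private final long l1LifespanMillis;

    SeparateContainersSketch(int ownedMaxEntries, int l1MaxEntries, long l1LifespanMillis) {
        this.owned = new BoundedContainer<>(ownedMaxEntries);
        this.l1 = new BoundedContainer<>(l1MaxEntries);
        this.l1LifespanMillis = l1LifespanMillis;
    }

    void putOwned(String key, String value) {
        owned.put(key, value);
    }

    void putL1(String key, String value, long nowMillis) {
        l1.put(key, new MortalValue(value, nowMillis + l1LifespanMillis));
    }

    /** Looks in the owned container first, then in L1, dropping expired L1 entries. */
    String get(String key, long nowMillis) {
        String ownedValue = owned.get(key);
        if (ownedValue != null) {
            return ownedValue;
        }
        MortalValue cached = l1.get(key);
        if (cached == null) {
            return null;
        }
        if (nowMillis >= cached.expiresAtMillis) {
            l1.remove(key); // lifespan reached: gone no matter how hot it was
            return null;
        }
        return cached.value;
    }

    int ownedSize() { return owned.size(); }

    int l1Size() { return l1.size(); }

    public static void main(String[] args) {
        SeparateContainersSketch node = new SeparateContainersSketch(2, 1, 1000L);
        node.putOwned("a", "1");
        node.putOwned("b", "2");
        node.putL1("x", "remote-x", 0L);
        node.putL1("y", "remote-y", 0L); // evicts "x" from L1 only
        System.out.println("owned=" + node.ownedSize() + " l1=" + node.l1Size());
    }
}
```

<div style>With this layout an overflowing L1 can only evict other L1 entries, which is benefit 2 from the original list.  It also shows the cost Sanne points out: the two bounds have to be chosen separately, and a hot L1 entry cannot borrow room from a cold owned one.</div>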

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div><br>

&gt; 3. Would make <a href="https://issues.jboss.org/browse/ISPN-3229" target="_blank">https://issues.jboss.org/browse/ISPN-3229</a> an easy fix<br>

&gt; 4. Could add a new DataContainer implementation specific to L1 with<br>

&gt; additional optimizations<br>

<br>

</div>You have some example of what you have in mind?<br>

Considering you would need to find the optimal balance in the use of<br>

the available heap space, I suspect that would be quite hard.<br></blockquote><div style>Nothing that substantial right now.  I was thinking you could remove a lot of additional checks, since you know the entries would always be mortal and you wouldn&#39;t want to passivate them: it will usually be faster to just ask the owning node for the value.</div>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div><br>

&gt; 5. Help with some concurrency issues with L1 without requiring wider locking<br>

&gt; (such as locking a key for an entire ClusteredGet rpc call) -<br>

&gt; <a href="https://issues.jboss.org/browse/ISPN-3197" target="_blank">https://issues.jboss.org/browse/ISPN-3197</a>.<br>

<br>

</div>I don&#39;t understand this. L1 entries require the same level of<br>

consistency as any other entry, so I suspect you would need the same<br>

locking patterns replicated. The downside would be that you get<br>

duplication of the same logic.<br>

Remember also that L1 has some similarities with entries still<br>

&quot;hanging around&quot; when they were previously stored in this node after<br>

a state transfer.. today these are considered L1-active entries, if<br>

you change the storage you would need to design a migration of state<br>

from one container to the other; migration of state might not be too<br>

hard, doing it while guaranteeing consistent locking is going to be I<br>

guess as hard as considering the L1 problem today.<br></blockquote><div>There are two sides to this: transactional and non-transactional.</div><div><br></div><div>In the non-transactional case we only ever write to the L1 cache on a Get and on an Invalidation, which allows for quite a bit of optimization.  In this case we can get away with locking only briefly before and after the get RPC call and still guarantee consistency with the L1.  And with the additional optimization I mentioned in response to Pedro&#39;s comments, we can eliminate the locking altogether, as we know only a single get is done at a time.</div>
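<div><br></div><div>One possible reading of &quot;locking only briefly before and after the get RPC call&quot; is sketched below.  This is an assumption-laden illustration, not Infinispan&#39;s implementation: the class, the per-key invalidation counter, and the single shared lock are all hypothetical, and the remote call is modelled as a plain function.</div>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a non-transactional L1 read with two short critical sections. */
class L1GetLockingSketch {

    // One lock for brevity; a real design would presumably lock per key.
    private final Object lock = new Object();
    private final Map<String, String> l1 = new HashMap<>();
    // Assumed bookkeeping: how many invalidations each key has seen so far.
    private final Map<String, Long> invalidationCount = new HashMap<>();

    /** Reads through L1; the remote call itself runs without holding the lock. */
    String get(String key, Function<String, String> remoteGet) {
        long countBeforeRpc;
        synchronized (lock) { // brief lock before the RPC
            String cached = l1.get(key);
            if (cached != null) {
                return cached;
            }
            countBeforeRpc = invalidationCount.getOrDefault(key, 0L);
        }

        String value = remoteGet.apply(key); // stands in for the ClusteredGet RPC

        synchronized (lock) { // brief lock after the RPC
            long countAfterRpc = invalidationCount.getOrDefault(key, 0L);
            // Only cache the value if no invalidation arrived while the RPC was in flight.
            if (value != null && countAfterRpc == countBeforeRpc) {
                l1.put(key, value);
            }
        }
        return value;
    }

    /** Invalidation removes the L1 entry and records that it happened. */
    void invalidate(String key) {
        synchronized (lock) {
            l1.remove(key);
            invalidationCount.merge(key, 1L, Long::sum);
        }
    }

    boolean isCachedInL1(String key) {
        synchronized (lock) {
            return l1.containsKey(key);
        }
    }

    public static void main(String[] args) {
        L1GetLockingSketch node = new L1GetLockingSketch();
        node.get("k", key -> "v1");
        System.out.println("cached after plain get: " + node.isCachedInL1("k"));
        node.invalidate("k");
        // The invalidation races with the in-flight get, so the value is not cached.
        node.get("k", key -> { node.invalidate(key); return "stale"; });
        System.out.println("cached after racing get: " + node.isCachedInL1("k"));
    }
}
```

<div>The key is never locked for the duration of the remote call, yet a value that was invalidated while the call was in flight does not end up in L1.  The caller still receives the value the owner returned; only the caching step is skipped.</div>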

<div><br></div><div style>The transactional case needs to be changed to support the locking I describe in the non-transactional paragraph in order to fix <a href="https://issues.jboss.org/browse/ISPN-2965">https://issues.jboss.org/browse/ISPN-2965</a>.  Writes in the tx model acquire the lock for the duration of the commit RPC, so we are already covered from that perspective.</div>

<div style><br></div><div style>The one thing I wasn&#39;t quite clear on was rehash events, which I was talking to Dan about a bit, but as you mention, that should be doable.</div>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div><br>

&gt;<br>

&gt; Drawbacks:<br>

&gt; 1. Would require, depending on configuration, an additional thread for<br>

&gt; eviction<br>

&gt; 2. Users upgrading could have double memory used up due to 2 data containers<br>

<br>

</div>This drawback specifically is to be considered very seriously. I don&#39;t<br>

think people would be happy to buy and maintain a datacenter twice as<br>

large as what they actually need.<br></blockquote><div style>I only meant the case where they left the configuration as is and the L1 default was the same as the other data container&#39;s size.  As Manik and Pedro mentioned, even with those settings memory use would be unlikely to double.</div>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<span><font color="#888888"><br>

Sanne<br>

</font></span><div><br>

&gt;<br>

&gt; Both?:<br>

&gt; 1. Additional configuration available<br>

&gt;    a. Add maxEntries just like the normal data container (use data container<br>

&gt; size if not configured?)<br>

&gt;    b. Eviction wakeup timer?  We could just reuse the task cleanup<br>

&gt; frequency?<br>

&gt;    c. Eviction strategy?  I would think the default data container&#39;s would<br>

&gt; be sufficient.<br>

&gt;<br>

&gt; I was wondering what you guys thought.<br>

&gt;<br>

&gt; Thanks,<br>

&gt;<br>

&gt;  - Will<br>

&gt;<br>

</div><div><div>&gt; _______________________________________________<br>

&gt; infinispan-dev mailing list<br>

&gt; <a href="mailto:infinispan-dev@lists.jboss.org" target="_blank">infinispan-dev@lists.jboss.org</a><br>

&gt; <a href="https://lists.jboss.org/mailman/listinfo/infinispan-dev" target="_blank">https://lists.jboss.org/mailman/listinfo/infinispan-dev</a><br>


</div></div></blockquote></div><br></div></div>