<div dir="ltr"><div>Right. If we have a map anywhere that&#39;s initialized from a single thread and then only read from many threads, it probably makes sense to use a HashMap and wrap it with Collections.unmodifiableMap(). But if it can also be written from multiple threads, I think we should use a CHMV8.<br>
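To make the two cases concrete, here is a minimal sketch, not Infinispan code. The class and method names are made up for illustration, java.util.concurrent.ConcurrentHashMap stands in for the backported CHMV8 class, and Java 8+ is assumed for merge().

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical holder class; names are illustrative only.
final class MapChoices {
    // Case 1: filled by a single thread, then only read by many threads.
    // Storing it in a final field gives safe publication once the
    // constructor has completed.
    private final Map<String, String> readOnly;

    // Case 2: written from several threads, so a concurrent map is used.
    // The JDK ConcurrentHashMap stands in for CHMV8 here (assumption).
    private final ConcurrentMap<String, Integer> counters =
            new ConcurrentHashMap<>();

    MapChoices(Map<String, String> initial) {
        // Defensive copy, so later changes to 'initial' cannot leak in.
        this.readOnly = Collections.unmodifiableMap(new HashMap<>(initial));
    }

    String lookup(String key) {
        return readOnly.get(key);
    }

    int increment(String key) {
        // merge() is performed atomically by ConcurrentHashMap.
        return counters.merge(key, 1, Integer::sum);
    }
}
```

The read-only map is only safe because nothing mutates the underlying HashMap after construction; the unmodifiable wrapper itself adds no synchronization.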


<br></div>BTW, the HashMap implementation in OpenJDK 1.7 seems to have some anti-collision features (a VM-dependent hash code generator for Strings), but our version of CHMV8 doesn&#39;t. Perhaps we need to upgrade to the latest CHMV8 version?<br>
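For context on why a VM-dependent String hash matters: plain String.hashCode() is deterministic, so colliding keys are easy to build. The sketch below (the class name CollidingKeys is hypothetical) relies on the fact that "Aa" and "BB" have the same hash code, so every concatenation of the same number of such blocks collides as well.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class CollidingKeys {
    // "Aa" and "BB" share the hash code 2112. Concatenating n such blocks
    // gives 2^n distinct strings that all have the same String.hashCode().
    static List<String> generate(int blocks) {
        List<String> keys = new ArrayList<>();
        keys.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>(keys.size() * 2);
            for (String k : keys) {
                next.add(k + "Aa");
                next.add(k + "BB");
            }
            keys = next;
        }
        return keys;
    }
}
```

A map that hashes Strings only through hashCode() puts all of these keys in one bucket, which is the case the anti-collision features are meant to handle.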


<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Apr 19, 2013 at 4:32 PM, David M. Lloyd <span dir="ltr">&lt;<a href="mailto:david.lloyd@redhat.com" target="_blank">david.lloyd@redhat.com</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 04/19/2013 08:22 AM, Sanne Grinovero wrote:<br>

&gt; On 19 April 2013 13:52, David M. Lloyd &lt;<a href="mailto:david.lloyd@redhat.com">david.lloyd@redhat.com</a>&gt; wrote:<br>

&gt;&gt; On 04/19/2013 05:17 AM, Sanne Grinovero wrote:<br>

&gt;&gt;&gt; On 19 April 2013 11:10, Dan Berindei &lt;<a href="mailto:dan.berindei@gmail.com">dan.berindei@gmail.com</a>&gt; wrote:<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; On Fri, Apr 19, 2013 at 12:58 PM, Sanne Grinovero &lt;<a href="mailto:sanne@infinispan.org">sanne@infinispan.org</a>&gt;<br>

&gt;&gt;&gt;&gt; wrote:<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; On 19 April 2013 10:37, Dan Berindei &lt;<a href="mailto:dan.berindei@gmail.com">dan.berindei@gmail.com</a>&gt; wrote:<br>

&gt;&gt;&gt;&gt;&gt;&gt; Testing mixed read/write performance with capacity 100000, keys 300000,<br>

&gt;&gt;&gt;&gt;&gt;&gt; concurrency level 32, threads 12, read:write ratio 99:1<br>

&gt;&gt;&gt;&gt;&gt;&gt; Container CHM    Ops/s 5178894.77  Gets/s 5127105.82  Puts/s 51788.95  HitRatio 86.23  Size 177848  stdDev 60896.42<br>

&gt;&gt;&gt;&gt;&gt;&gt; Container CHMV8  Ops/s 5768824.37  Gets/s 5711136.13  Puts/s 57688.24  HitRatio 84.72  Size 171964  stdDev 60249.99<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; Nice, thanks.<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; The test is probably limited by the 1% writes, but I think it does show<br>

&gt;&gt;&gt;&gt;&gt;&gt; that<br>

&gt;&gt;&gt;&gt;&gt;&gt; reads in CHMV8 are not slower than reads in OpenJDK7&#39;s CHM.<br>

&gt;&gt;&gt;&gt;&gt;&gt; I haven&#39;t measured it, but the memory footprint should also be better,<br>

&gt;&gt;&gt;&gt;&gt;&gt; because it doesn&#39;t use segments any more.<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; AFAIK the CHMV8 also uses copy-on-write at the bucket level, but<br>

&gt;&gt;&gt;&gt;&gt;&gt; we<br>

&gt;&gt;&gt;&gt;&gt;&gt; could definitely do a pure read test with a HashMap to see how big the<br>

&gt;&gt;&gt;&gt;&gt;&gt; performance difference is.<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; By copy-on-write I didn&#39;t mean on the single elements, but on the<br>

&gt;&gt;&gt;&gt;&gt; whole map instance:<br>

&gt;&gt;&gt;&gt;&gt;<br>

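&gt;&gt;&gt;&gt;&gt; // readers see the current map through the volatile field<br>
&gt;&gt;&gt;&gt;&gt; private volatile HashMap&lt;String, String&gt; configuration = new HashMap&lt;&gt;();<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; // writers are serialized and never modify a published map<br>
&gt;&gt;&gt;&gt;&gt; synchronized void addConfigurationProperty(String key, String value) {<br>

&gt;&gt;&gt;&gt;&gt;        HashMap&lt;String, String&gt; newcopy = new HashMap&lt;&gt;(configuration);<br>

&gt;&gt;&gt;&gt;&gt;        newcopy.put(key, value);<br>

&gt;&gt;&gt;&gt;&gt;        configuration = newcopy;<br>

&gt;&gt;&gt;&gt;&gt; }<br>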

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; Of course that is never going to scale for writes, but if writes stop<br>

&gt;&gt;&gt;&gt;&gt; at runtime after all services are started I would expect that the<br>

&gt;&gt;&gt;&gt;&gt; simplicity of the non-threadsafe HashMap should have some benefit over<br>

&gt;&gt;&gt;&gt;&gt; CHM{whatever}, or it would have been removed already?<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; Right, we should be able to tell whether that&#39;s worth doing with a pure read<br>

&gt;&gt;&gt;&gt; test with a CHMV8 and a HashMap :)<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; IFF you find out CHMV8 is as good as HashMap for read only, you have<br>

&gt;&gt;&gt; two options:<br>

&gt;&gt;&gt;    - ask the JDK team to drop the HashMap code as it&#39;s no longer needed<br>

&gt;&gt;&gt;    - fix your benchmark :-P<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; In other words, I&#39;d consider it highly surprising and suspicious<br>

&gt;&gt;&gt; (still interesting though!)<br>

&gt;&gt;<br>

&gt;&gt; It&#39;s not as surprising as you think.  On x86, volatile reads are the<br>

&gt;&gt; same as regular reads (not counting some possible reordering magic).  So<br>

&gt;&gt; if a CHM read is a hash, an array access, and a list traversal, and so<br>

&gt;&gt; is HM (and I believe this is true though I&#39;d have to review the code<br>

&gt;&gt; again to be sure), I&#39;d expect very similar execution performance on<br>

&gt;&gt; read.  I think some of the anti-collision features in V8 might come into<br>

&gt;&gt; play under some circumstances though which might affect performance in a<br>

&gt;&gt; negative way (wrt the constant big-O component) but overall in a<br>

&gt;&gt; positive way (by turning the linear big-O component into a logarithmic one).<br>

&gt;<br>

&gt; Thanks David. I know about the cost of a volatile read, what I&#39;m referring to<br>

&gt; is that I would expect the non-concurrent Maps to generally contain some<br>

&gt; simpler code than a concurrent one. If this was not the case,<br>

&gt; why would any JDK team maintain two different implementations?<br>

&gt; That&#39;s why I would consider it surprising if it turned out that the CHMV8 was<br>

&gt; superior to a regular one on all fronts: there certainly is some<br>

&gt; scenario in which the regular one would be a more appropriate choice,<br>

&gt; which directly proves that blindly replacing all usages in a large project<br>

&gt; is not optimal. Of course, it might be close to optimal..<br>

<br>

</div></div>You are right, it is not superior on all fronts.  It is definitely<br>

similar in terms of read, but writes will have a substantially higher<br>

cost, involving (at the very least) multiple volatile writes which are<br>

orders of magnitude more expensive than normal writes (on Intel they<br>

have the costly impact of memory fence instructions).  So I don&#39;t think<br>

anyone will want to drop HashMap any time soon. :-)<br>

<div class="HOEnZb"><div class="h5"><br>

--<br>

- DML<br>

_______________________________________________<br>

infinispan-dev mailing list<br>

<a href="mailto:infinispan-dev@lists.jboss.org">infinispan-dev@lists.jboss.org</a><br>

<a href="https://lists.jboss.org/mailman/listinfo/infinispan-dev" target="_blank">https://lists.jboss.org/mailman/listinfo/infinispan-dev</a><br>

</div></div></blockquote></div><br></div>