[infinispan-dev] CHM or CHMv8?

Galder Zamarreño galder at redhat.com
Thu Apr 25 06:47:09 EDT 2013


On Apr 22, 2013, at 2:33 PM, David M. Lloyd <david.lloyd at redhat.com> wrote:

> AFAIK the JDK team is looking at dropping alternative hashing for 
> strings and instead going with the comparator-based collision resolution 
> once a bucket reaches a certain size.  I'm not sure how this is expected 
> to work for a map with multiple key types though.

^ Hmmm, I'm pretty sure I've seen something similar in CHMv8. Once it reaches certain size, tree-based bins are used, which check whether the entries are Comparable. In fact, I've accomodated this into Equivalence interface, so that we can provide a way to define whether a given key or value is comparable, i.e. for arrays.

Cheers,

> 
> On 04/22/2013 06:19 AM, Dan Berindei wrote:
>> Right. If we have anywhere a map that's initialized from a single thread
>> and then accessed only for reading from many threads, it probably makes
>> sense to use a HashMap and wrap it in an UnmodifiableMap. But if it can
>> be written from multiple threads as well, I think we should use a CHMV8.
>> 
>> BTW, the HashMap implementation in OpenJDK 1.7 seems to have some
>> anti-collision features (a VM-dependent hash code generator for
>> Strings), but our version of CHMV8 doesn't. Perhaps we need to upgrade
>> to the latest CHMV8 version?
>> 
>> 
>> 
>> On Fri, Apr 19, 2013 at 4:32 PM, David M. Lloyd <david.lloyd at redhat.com
>> <mailto:david.lloyd at redhat.com>> wrote:
>> 
>>    On 04/19/2013 08:22 AM, Sanne Grinovero wrote:
>>> On 19 April 2013 13:52, David M. Lloyd <david.lloyd at redhat.com
>>    <mailto:david.lloyd at redhat.com>> wrote:
>>>> On 04/19/2013 05:17 AM, Sanne Grinovero wrote:
>>>>> On 19 April 2013 11:10, Dan Berindei <dan.berindei at gmail.com
>>    <mailto:dan.berindei at gmail.com>> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Fri, Apr 19, 2013 at 12:58 PM, Sanne Grinovero
>>    <sanne at infinispan.org <mailto:sanne at infinispan.org>>
>>>>>> wrote:
>>>>>>> 
>>>>>>> On 19 April 2013 10:37, Dan Berindei <dan.berindei at gmail.com
>>    <mailto:dan.berindei at gmail.com>> wrote:
>>>>>>>> Testing mixed read/write performance with capacity 100000,
>>    keys 300000,
>>>>>>>> concurrency level 32, threads 12, read:write ratio 99:1
>>>>>>>> Container CHM           Ops/s 5178894.77  Gets/s 5127105.82
>>      Puts/s
>>>>>>>> 51788.95  HitRatio      86.23  Size     177848  stdDev
>>    60896.42
>>>>>>>> Container CHMV8         Ops/s 5768824.37  Gets/s 5711136.13
>>      Puts/s
>>>>>>>> 57688.24  HitRatio      84.72  Size     171964  stdDev
>>    60249.99
>>>>>>> 
>>>>>>> Nice, thanks.
>>>>>>>> 
>>>>>>>> The test is probably limited by the 1% writes, but I think
>>    it does show
>>>>>>>> that
>>>>>>>> reads in CHMV8 are not slower than reads in OpenJDK7's CHM.
>>>>>>>> I haven't measured it, but the memory footprint should also
>>    be better,
>>>>>>>> because it doesn't use segments any more.
>>>>>>>> 
>>>>>>>> AFAIK the memoryCHMV8 also uses copy-on-write at the bucket
>>    level, but
>>>>>>>> we
>>>>>>>> could definitely do a pure read test with a HashMap to see
>>    how big the
>>>>>>>> performance difference is.
>>>>>>> 
>>>>>>> By copy-on-write I didn't mean on the single elements, but on the
>>>>>>> whole map instance:
>>>>>>> 
>>>>>>> private volatile HashMap configuration;
>>>>>>> 
>>>>>>> synchronized addConfigurationProperty(String, String) {
>>>>>>>       HashMap newcopy = new HashMap( configuration ):
>>>>>>>       newcopy.put(..);
>>>>>>>       configuration = newcopy;
>>>>>>> }
>>>>>>> 
>>>>>>> Of course that is never going to scale for writes, but if
>>    writes stop
>>>>>>> at runtime after all services are started I would expect that the
>>>>>>> simplicity of the non-threadsafe HashMap should have some
>>    benefit over
>>>>>>> CHM{whatever}, or it would have been removed already?
>>>>>>> 
>>>>>> 
>>>>>> Right, we should be able to tell whether that's worth doing
>>    with a pure read
>>>>>> test with a CHMV8 and a HashMap :)
>>>>> 
>>>>> IFF you find out CHMV8 is as good as HashMap for read only, you
>>    have
>>>>> two options:
>>>>>   - ask the JDK team to drop the HashMap code as it's no
>>    longer needed
>>>>>   - fix your benchmark :-P
>>>>> 
>>>>> In other words, I'd consider it highly surprising and suspicious
>>>>> (still interesting though!)
>>>> 
>>>> It's not as surprising as you think.  On x86, volatile reads are the
>>>> same as regular reads (not counting some possible reordering
>>    magic).  So
>>>> if a CHM read is a hash, an array access, and a list traversal,
>>    and so
>>>> is HM (and I believe this is true though I'd have to review the code
>>>> again to be sure), I'd expect very similar execution performance on
>>>> read.  I think some of the anti-collision features in V8 might
>>    come into
>>>> play under some circumstances though which might affect
>>    performance in a
>>>> negative way (wrt the constant big-O component) but overall in a
>>>> positive way (by turning the linear big-O component into a
>>    logarithmic one).
>>> 
>>> Thanks David. I know about the cost of a volatile read, what I'm
>>    referring to
>>> is that I would expect the non-concurrent Maps to generally
>>    contain some
>>> simpler code than a conccurrent one. If this was not the case,
>>> why would any JDK team maintain two different implementations?
>>> That's why I would consider it surprising if it turned out that
>>    the CHMV8 was
>>> superior over a regular one on all fronts: there certainly is some
>>> scenario in which the regular one would be a more appropriate choice,
>>> which directly proofs that blindly replacing all usages in a
>>    large project
>>> is not optimal. Of course, it might be close to optimal..
>> 
>>    You are right, it is not superior on all fronts.  It is definitely
>>    similar in terms of read, but writes will have a substantially higher
>>    cost, involving (at the very least) multiple volatile writes which are
>>    orders of magnitude more expensive than normal writes (on Intel they
>>    have the costly impact of memory fence instructions).  So I don't think
>>    anyone will want to drop HashMap any time soon. :-)
>> 
>>    --
>>    - DML
>>    _______________________________________________
>>    infinispan-dev mailing list
>>    infinispan-dev at lists.jboss.org <mailto:infinispan-dev at lists.jboss.org>
>>    https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
>> 
>> 
>> 
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
> 
> 
> -- 
> - DML
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


--
Galder Zamarreño
galder at redhat.com
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org




More information about the infinispan-dev mailing list