Re: [rules-dev] BUG: [5.3.0.Final] CollectSetAccumulateFunction should probably use IdentityHashMap internally

Thursday, 1 March 2012

IdentityHashMap is kinda interesting, I thought the same when I first 
looked at it.

It doesn't create Entry buckets to store hash clashes. Instead it just 
moves to the next free pair of slots in the array and adds the key, 
followed by the value in the next array element. So in theory it's GC 
free, but if the map is too densely populated it'll have a longer 
search for free space.
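
To make the layout described above concrete, here is a minimal sketch of an open-addressing identity map. It is an illustration only, not the JDK source: the class name ProbingIdentityMap and its methods are hypothetical, and the sketch assumes a fixed capacity with no resizing, no removal, and no null keys.

```java
// Simplified illustration of linear probing with keys and values
// interleaved in one array. Hypothetical class, not the JDK implementation.
final class ProbingIdentityMap {
    // Keys sit at even indices; each value sits in the slot right after its key.
    private final Object[] table;
    private final int slots; // number of key/value pairs the table can hold
    private int size;

    ProbingIdentityMap(int capacity) {
        this.slots = capacity;
        this.table = new Object[2 * capacity];
    }

    // Even index at which probing for this key starts.
    private int startIndex(Object key) {
        return 2 * Math.floorMod(System.identityHashCode(key), slots);
    }

    Object put(Object key, Object value) {
        if (key == null) {
            throw new NullPointerException("null keys are not supported in this sketch");
        }
        int i = startIndex(key);
        for (int probes = 0; probes < slots; probes++) {
            Object k = table[i];
            if (k == null) {             // free pair of slots: claim it
                table[i] = key;
                table[i + 1] = value;
                size++;
                return null;
            }
            if (k == key) {              // same reference: overwrite the value
                Object old = table[i + 1];
                table[i + 1] = value;
                return old;
            }
            i = (i + 2) % table.length;  // clash: step to the next pair
        }
        throw new IllegalStateException("table full");
    }

    Object get(Object key) {
        if (key == null) {
            return null;
        }
        int i = startIndex(key);
        for (int probes = 0; probes < slots; probes++) {
            Object k = table[i];
            if (k == null) {
                return null;             // reached a free slot: key is absent
            }
            if (k == key) {
                return table[i + 1];
            }
            i = (i + 2) % table.length;
        }
        return null;
    }

    int size() {
        return size;
    }
}
```

No per-entry object is allocated on put(), which is the "GC free" part. The cost is the probe loop: the fuller the table, the more pairs have to be stepped over before a free one turns up.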

Mark
On 29/02/2012 22:59, SirMungus wrote:
...
 laune wrote
> Here are some first results from comparing collectSet (as in 5.3.0) to
> a similar accumulate function that uses an IdentityHashMap<Object,Void>
> in its context and returns a true Set built from the key set as its
> result. The reported times are the elapsed times of firing a single
> rule doing an accumulate into a Set from all String objects in WM.
>
 Out of curiosity, I did a quick timing test of repeated accumulate() calls
 (without getResults() calls) using IdentityHashMap vs. HashMap using your
 "s##" strings. While there should have been no particular reason for the
 performance to differ between the two (indeed, one would think identity
 would be faster because the identity hash function is natively implemented
 and presumably much faster), I consistently found the IdentityHashMap to
 take twice the time for its put() calls.  Looking at the source for the two
 classes, I was surprised to see that they have completely separate
 implementations.
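
A timing test of the kind described above could look roughly like the following sketch. It is not the code that was actually run: the class name PutTiming, the key count, and the round counts are assumptions, and a naive nanoTime loop like this is sensitive to JIT warm-up, so the numbers it prints are only indicative.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical micro-timing harness; names and sizes are illustrative only.
final class PutTiming {
    // Builds n distinct String keys of the form "s0", "s1", ...
    static Object[] makeKeys(int n) {
        Object[] keys = new Object[n];
        for (int i = 0; i < n; i++) {
            keys[i] = "s" + i;
        }
        return keys;
    }

    // Puts every key into the map, repeated 'rounds' times,
    // and returns the elapsed time in nanoseconds.
    static long timePuts(Map<Object, Object> map, Object[] keys, int rounds) {
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (Object k : keys) {
                map.put(k, Boolean.TRUE);
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Object[] keys = makeKeys(10_000);
        // Warm-up passes so both implementations get compiled before measuring.
        timePuts(new HashMap<Object, Object>(), keys, 50);
        timePuts(new IdentityHashMap<Object, Object>(), keys, 50);

        long hashNanos = timePuts(new HashMap<Object, Object>(), keys, 200);
        long identityNanos = timePuts(new IdentityHashMap<Object, Object>(), keys, 200);
        System.out.println("HashMap put():         " + hashNanos / 1_000_000 + " ms");
        System.out.println("IdentityHashMap put(): " + identityNanos / 1_000_000 + " ms");
    }
}
```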

 That's pretty astonishing. I would have figured that HashMap would have
 protected methods like computeHash(key) and testEquals(key1,key2) and that
 those would be overridden in an IdentityHashMap to use
 System.identityHashCode() and ==, rather than hashCode() and equals().
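
For reference, the behavioural difference between the two lookups (hashCode()/equals() versus System.identityHashCode()/==) can be shown with a small sketch. The class and method names here are hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical helper contrasting equals()-based and ==-based key lookup.
final class IdentityVsEquals {
    // Adds every key to the map and reports how many distinct keys it kept.
    static int distinctKeys(Map<Object, Boolean> map, Object... keys) {
        for (Object k : keys) {
            map.put(k, Boolean.TRUE);
        }
        return map.size();
    }

    public static void main(String[] args) {
        // Two separate String objects with the same contents.
        String first = new String("s01");
        String second = new String("s01");

        // HashMap treats them as one key because first.equals(second).
        System.out.println(distinctKeys(new HashMap<Object, Boolean>(), first, second));
        // IdentityHashMap keeps both because first != second.
        System.out.println(distinctKeys(new IdentityHashMap<Object, Boolean>(), first, second));
    }
}
```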

 My best guess is that someone has spent more time optimizing HashMap than
 IdentityHashMap.

 So, there would undoubtedly be a performance impact, even without taking into
 account the "new HashSet()" in the getResults() call. Whether it would be
 significant enough in real world scenarios to warrant a much more complex
 implementation requires more knowledge of Drools's user base than I possess,
 for sure. My guess is that Wolfgang's second scenario above is generally
 unrealistic. But, if it is routine for rules with collectSet() to end up
 with sets involving thousands of objects like the second scenario, that
 could certainly be an issue.

 I will try to get the requested JIRA in soon, but I won't be able to give
 much attention to this until maybe a week from now.  I've fallen a bit
 behind because of all this. :)

