[rules-dev] BUG: [5.3.0.Final] CollectSetAccumulateFunction should probably use IdentityHashMap internally

Mark Proctor mproctor at codehaus.org
Thu Mar 1 01:26:09 EST 2012


IdentityHashMap is kinda interesting, I thought the same when I first 
looked at it.

It doesn't create Entry buckets to store hash clashes. Instead it just 
moves to the next free pair of slots in the array and adds the key, 
followed by the value in the next array element. So in theory it's 
GC free, but if the map is too densely populated it'll have a longer 
search for free space.
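
To make the probing scheme described above concrete, here is a minimal sketch. It is not the JDK source: the class name LinearProbeIdentityMap and all of its methods are hypothetical, and it assumes a fixed-size table with no resizing, no removal and no null keys. Keys sit at even indexes, each value sits in the element right after its key, and a clash simply steps to the next pair.

```java
import java.util.Objects;

/**
 * Sketch of an open-addressing identity map. Keys and values are interleaved
 * in a single Object[] (key at index i, value at index i + 1), so no Entry
 * objects are allocated. Hypothetical class for illustration only.
 */
final class LinearProbeIdentityMap {
    private final Object[] table;   // layout: [k0, v0, k1, v1, ...]
    private int size;

    /** capacity is the number of key/value pairs the table can hold. */
    LinearProbeIdentityMap(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        table = new Object[2 * capacity];
    }

    /** Index of the key slot where probing starts; always even. */
    private int startIndex(Object key) {
        int pairs = table.length / 2;
        return 2 * Math.floorMod(System.identityHashCode(key), pairs);
    }

    /** Returns the previous value, or null if the key was not present. */
    Object put(Object key, Object value) {
        Objects.requireNonNull(key, "null keys are not supported in this sketch");
        int i = startIndex(key);
        for (int probes = 0; probes < table.length / 2; probes++) {
            if (table[i] == null) {          // free pair: claim it
                table[i] = key;
                table[i + 1] = value;
                size++;
                return null;
            }
            if (table[i] == key) {           // same object (==): overwrite the value
                Object old = table[i + 1];
                table[i + 1] = value;
                return old;
            }
            i = (i + 2) % table.length;      // clash: move on to the next pair
        }
        throw new IllegalStateException("table is full (this sketch does not resize)");
    }

    /** Returns the value stored for this exact object, or null if absent. */
    Object get(Object key) {
        if (key == null) return null;
        int i = startIndex(key);
        for (int probes = 0; probes < table.length / 2; probes++) {
            if (table[i] == null) return null;        // reached a gap: key is absent
            if (table[i] == key) return table[i + 1];
            i = (i + 2) % table.length;
        }
        return null;
    }

    int size() { return size; }
}
```

The loop in put() and get() is where the density cost shows up: the fuller the table, the more pairs have to be stepped over before a free slot or the matching key is found. The real class resizes as it fills, which this sketch leaves out.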

Mark
On 29/02/2012 22:59, SirMungus wrote:
> laune wrote
>> Here are some first results from comparing collectSet (as in 5.3.0) to
>> a similar accumulate function that uses an IdentityHashMap<Object,Void>
>> in its context and returns a true Set built from the key set as its
>> result. The reported times are the elapsed times of firing a single
>> rule doing an accumulate into a Set from all String objects in WM.
>>
> Out of curiosity, I did a quick timing test of repeated accumulate() calls
> (without getResults() calls) using IdentityHashMap vs. HashMap using your
> "s##" strings. While there should have been no particular reason for the
> performance to differ between the two (indeed, one would think identity
> would be faster because the identity hash function is natively implemented
> and presumably much faster), I consistently found the IdentityHashMap to
> take twice the time for its put() calls.  Looking at the source for the two
> classes, I was surprised to see that they have completely separate
> implementations.
>
> That's pretty astonishing. I would have figured that HashMap would have
> protected methods like computeHash(key) and testEquals(key1,key2) and that
> those would be overridden in an IdentityHashMap to use
> System.identityHashCode() and ==, rather than hashCode() and equals().
>
> My best guess is that someone has spent more time optimizing HashMap than
> IdentityHashMap.
>
> So, there would undoubtedly be a performance impact, even not taking into
> account the "new HashSet()" in the getResults() call. Whether it would be
> significant enough in real world scenarios to warrant a much more complex
> implementation requires more knowledge of Drools's user base than I possess,
> for sure. My guess is that Wolfgang's second scenario above is generally
> unrealistic. But, if it is routine for rules with collectSet() to end up
> with sets involving thousands of objects like the second scenario, that
> could certainly be an issue.
>
> I will try to get the requested JIRA in soon, but I won't be able to give
> much attention to this until maybe a week from now.  I've fallen a bit
> behind because of all this. :)
>
> --
> View this message in context: http://drools.46999.n3.nabble.com/BUG-5-3-0-Final-CollectSetAccumulateFunction-should-probably-use-IdentityHashMap-internally-tp3774079p3788946.html
> Sent from the Drools: Developer (committer) mailing list archive at Nabble.com.
> _______________________________________________
> rules-dev mailing list
> rules-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/rules-dev
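
For readers following the quoted discussion, the approach laune describes (an IdentityHashMap<Object,Void> held in the function's context, with a true Set handed back as the result) might look roughly like the sketch below. This is not Drools code: the class IdentityCollectSetSketch and its accumulate/reverse/getResult methods are hypothetical stand-ins, and copying the key set into a new HashSet is an assumption based on the "new HashSet()" in getResults() mentioned in the quote.

```java
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Set;

/**
 * Hypothetical stand-in for an accumulate-function context that tracks
 * facts by identity (==) rather than by equals(). Not the Drools API.
 */
final class IdentityCollectSetSketch {
    // Only the keys matter; every value is null.
    private final IdentityHashMap<Object, Void> seen = new IdentityHashMap<>();

    /** Record a fact; two equal but distinct objects are tracked separately. */
    void accumulate(Object fact) {
        seen.put(fact, null);
    }

    /** Undo one accumulate; removes only this exact object. */
    void reverse(Object fact) {
        seen.remove(fact);
    }

    /** Copy the tracked facts into an ordinary, equals-based Set. */
    Set<Object> getResult() {
        return new HashSet<>(seen.keySet());
    }
}
```

The difference from a plain HashSet shows up when two distinct but equal objects are accumulated and one of them is reversed: the identity-based context still holds the other object, so the result still contains the value, whereas a HashSet would have stored only one element and would now be empty.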


