[rules-dev] BUG: [5.3.0.Final] CollectSetAccumulateFunction should probably use IdentityHashMap internally

SirMungus Patrick_Rusk at ssga.com
Wed Feb 29 17:59:06 EST 2012


laune wrote
> 
> Here are some first results from comparing collectSet (as in 5.3.0) to
> a similar accumulate function that uses an IdentityHashMap<Object,Void>
> in its context and returns a true Set built from the key set as its
> result. The reported times are the elapsed times of firing a single
> rule doing an accumulate into a Set from all String objects in WM.
> 
Out of curiosity, I ran a quick timing test of repeated accumulate() calls
(without getResults() calls) using IdentityHashMap vs. HashMap with your
"s##" strings. I saw no particular reason for the two to perform
differently (if anything, one would expect identity to be faster, because
the identity hash function is natively implemented and presumably much
faster). Nevertheless, I consistently found that IdentityHashMap took twice
as long for its put() calls. Looking at the source for the two classes, I
was surprised to see that they have completely separate implementations.
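
A timing comparison of this kind can be sketched roughly as follows. This
is a hypothetical reconstruction, not the code that was actually run: the
class and method names (PutTiming, makeKeys, timePuts) and the key count
are assumptions, and it is a crude wall-clock measurement rather than a
proper benchmark, so the numbers it prints will vary by JVM and machine.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Rough timing sketch, not a rigorous benchmark (no JMH, minimal warm-up).
class PutTiming {

    // Builds n distinct String objects "s0", "s1", ... as stand-ins for
    // the "s##" strings mentioned in the thread (assumed format).
    static String[] makeKeys(int n) {
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            keys[i] = "s" + i; // runtime concatenation creates a new object
        }
        return keys;
    }

    // Puts every key into the given map and returns the elapsed nanoseconds.
    static long timePuts(Map<Object, Object> map, String[] keys) {
        long start = System.nanoTime();
        for (String key : keys) {
            map.put(key, Boolean.TRUE);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        String[] keys = makeKeys(100_000);
        // Repeat a few rounds so later rounds run on JIT-compiled code.
        for (int round = 0; round < 5; round++) {
            long hash = timePuts(new HashMap<Object, Object>(), keys);
            long identity = timePuts(new IdentityHashMap<Object, Object>(), keys);
            System.out.printf("round %d: HashMap %d us, IdentityHashMap %d us%n",
                    round, hash / 1000, identity / 1000);
        }
    }
}
```

Because every key here is a distinct object with a distinct value, both
maps end up with the same number of entries, so only the put() cost differs.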

That's pretty astonishing. I would have figured that HashMap would have
protected methods like computeHash(key) and testEquals(key1,key2) and that
those would be overridden in an IdentityHashMap to use
System.identityHashCode() and ==, rather than hashCode() and equals().
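
To make that idea concrete, here is a toy sketch of the hook-based design.
It is purely illustrative and is NOT how java.util.HashMap or
IdentityHashMap are implemented: HookedSet, IdentityHookedSet and HookDemo
are hypothetical names, the bucket count is fixed, and null keys are not
handled. Only the hook names computeHash and testEquals come from the
paragraph above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy fixed-capacity hash set with overridable hashing/equality hooks.
class HookedSet {
    private final List<List<Object>> buckets = new ArrayList<List<Object>>();
    private int size;

    HookedSet() {
        for (int i = 0; i < 16; i++) {
            buckets.add(new ArrayList<Object>());
        }
    }

    // Hook: how a key is hashed (value-based by default).
    protected int computeHash(Object key) {
        return key.hashCode();
    }

    // Hook: how two keys are compared (value-based by default).
    protected boolean testEquals(Object a, Object b) {
        return a.equals(b);
    }

    // Adds a non-null key; returns false if an equivalent key is present.
    public boolean add(Object key) {
        int index = (computeHash(key) & 0x7fffffff) % buckets.size();
        List<Object> bucket = buckets.get(index);
        for (Object existing : bucket) {
            if (testEquals(existing, key)) {
                return false;
            }
        }
        bucket.add(key);
        size++;
        return true;
    }

    public int size() {
        return size;
    }
}

// Identity variant: overrides only the two hooks.
class IdentityHookedSet extends HookedSet {
    @Override
    protected int computeHash(Object key) {
        return System.identityHashCode(key);
    }

    @Override
    protected boolean testEquals(Object a, Object b) {
        return a == b;
    }
}

class HookDemo {
    // Adds two equal-but-distinct strings and reports how many were kept.
    static int addTwoEqualStrings(HookedSet set) {
        set.add(new String("s1"));
        set.add(new String("s1"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(addTwoEqualStrings(new HookedSet()));         // 1
        System.out.println(addTwoEqualStrings(new IdentityHookedSet())); // 2
    }
}
```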

My best guess is that someone has spent more time optimizing HashMap than
IdentityHashMap.

So, there would undoubtedly be a performance impact, even without taking
into account the "new HashSet()" in the getResults() call. Whether it would be
significant enough in real world scenarios to warrant a much more complex
implementation requires more knowledge of Drools's user base than I possess,
for sure. My guess is that Wolfgang's second scenario above is generally
unrealistic. But, if it is routine for rules with collectSet() to end up
with sets involving thousands of objects like the second scenario, that
could certainly be an issue.
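
For readers following along, the kind of identity-based context under
discussion might look something like the sketch below. This is an
assumption-laden illustration, not Drools code: IdentitySetContext and its
method names are hypothetical and do not implement the real accumulate
function interface. It only shows an IdentityHashMap<Object,Void> tracking
instances, with the result copied into a new HashSet on each request,
which is the extra cost mentioned above.

```java
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical accumulate context that tracks values by object identity.
class IdentitySetContext {
    // Keys are the accumulated instances; values are unused (always null).
    private final Map<Object, Void> seen = new IdentityHashMap<Object, Void>();

    // Records one more instance.
    void accumulate(Object value) {
        seen.put(value, null);
    }

    // Forgets exactly this instance, leaving other equal instances in place.
    void reverse(Object value) {
        seen.remove(value);
    }

    // Copies the tracked instances into an ordinary equals-based Set.
    // This copy is O(n) per call, which is where large sets would hurt.
    Set<Object> getResult() {
        return new HashSet<Object>(seen.keySet());
    }

    public static void main(String[] args) {
        IdentitySetContext ctx = new IdentitySetContext();
        String a = new String("s1");
        String b = new String("s1"); // equal to a, but a different object
        ctx.accumulate(a);
        ctx.accumulate(b);
        ctx.reverse(a);
        System.out.println(ctx.getResult()); // [s1], since b is still tracked
    }
}
```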

I will try to get the requested JIRA in soon, but I won't be able to give
much attention to this until maybe a week from now.  I've fallen a bit
behind because of all this. :)

--
View this message in context: http://drools.46999.n3.nabble.com/BUG-5-3-0-Final-CollectSetAccumulateFunction-should-probably-use-IdentityHashMap-internally-tp3774079p3788946.html
Sent from the Drools: Developer (committer) mailing list mailing list archive at Nabble.com.

