[rules-dev] BUG: [5.3.0.Final] CollectSetAccumulateFunction should probably use IdentityHashMap internally
Wolfgang Laun
wolfgang.laun at gmail.com
Wed Feb 29 03:41:40 EST 2012
Here are some first results from comparing collectSet (as in 5.3.0) to
a similar accumulate function using an IdentitiyHashMap<Object,Void>
in its context and returns a true Set as a of the key set as its
result. The reported times are the elapsed times of firing a single
rule doing an accumulate into a Set from all String objects in WM.
(1) An extreme case: insert many different String facts and collect
them into a single set. Strings were composed by concatenating "s" and
i = 0(1)N-1
// different Strings
// collectSet
// N= 10000 2.542s
// N= 20000 11.118s
// N= 100000 198.839s
// ----------------
// collectIdentity
// N= 10000 3.944s
// N= 20000 18.514s
// N= 100000 832.093s
// ===============
(2) Another extreme case: insert many equal String facts as distinct
objects and collect them into a single set. String value "s12345".
// equal Strings
// collectSet
// N= 10000 0.419s
// N= 20000 0.558s
// N= 100000 0.816s
// ---------------
// collectIdentity
// N= 10000 2.940s
// N= 20000 13.048s
// N= 100000 522.286s
(It's faster due to not having to construct the string.)
It's obvious that the continuing growth of a HashMap with its calls to
resize for increasing capacities of 16, 32, 64,... and the rehashing
of all entries is costly.
If your final Set is just a small portion of a (large number) of
collected elements, maintaining the IdentityHashMap is costly whereas
the true set of objects remains at minimum size throughout.
-W
On 28 February 2012 20:39, Wolfgang Laun <wolfgang.laun at gmail.com> wrote:
>
> I have cloned it and running benchmarks on it. Might not be able to complete it tonight.
> -W
>
>
> On 28 February 2012 19:55, Edson Tirelli <ed.tirelli at gmail.com> wrote:
>>
>>
>> I am sure he would not deny that... :) you are helping to make the world a better place... :)
>>
>> In any case, please open a JIRA so that this does not get lost in the e-mails. It is a low hanging fruit, very isolated and simple to fix. So anyone looking for a starting work to contribute to the project can do it. Otherwise I will do it when I get the time.
>>
>> Thanks,
>> Edson
>>
>> On Tue, Feb 28, 2012 at 1:45 PM, SirMungus <Patrick_Rusk at ssga.com> wrote:
>>>
>>>
>>> Edson Tirelli-4 wrote
>>> >
>>> > I see what you are saying and I think it works. Although, it will be
>>> > heavier, it would cover scenarios not covered by the current
>>> > implementation.
>>> >
>>> > My suggestion is if you can, please create a test case for the problem,
>>> > change the collectSet accumulate function to do what you propose and send
>>> > a
>>> > pull request. I will review it and apply. I think in this case, making
>>> > user's life easier by not requiring immutability might trump performance,
>>> > but we need to evaluate some tests just in case.
>>> >
>>> Edson, thanks for the offer. I'll see what I can do. Since I work at a
>>> financial services company, I practically have to get Obama's permission to
>>> do anything official on an open source project. :)
>>>
>>> --
>>> View this message in context: http://drools.46999.n3.nabble.com/BUG-5-3-0-Final-CollectSetAccumulateFunction-should-probably-use-IdentityHashMap-internally-tp3774079p3784963.html
>>> Sent from the Drools: Developer (committer) mailing list mailing list archive at Nabble.com.
>>> _______________________________________________
>>> rules-dev mailing list
>>> rules-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/rules-dev
>>
>>
>>
>>
>> --
>> Edson Tirelli
>> JBoss Drools Core Development
>> JBoss by Red Hat @ www.jboss.com
>>
>> _______________________________________________
>> rules-dev mailing list
>> rules-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/rules-dev
>>
>
More information about the rules-dev
mailing list