[rules-dev] BUG: [5.3.0.Final] CollectSetAccumulateFunction should probably use IdentityHashMap internally

SirMungus Patrick_Rusk at ssga.com
Tue Feb 28 11:42:10 EST 2012


Mark Proctor wrote
> 
> The getResults()
> would probably have to change from...
> 
>      public Object getResult(Serializable context) throws Exception {
>          CollectListData data = (CollectListData) context;
>          return Collections.unmodifiableSet( data.map.keySet() );
>      }
> 
> ...to...
> 
>      public Object getResult(Serializable context) throws Exception {
>          CollectListData data = (CollectListData) context;
>          return Collections.unmodifiableSet( new HashSet(data.map.keySet()) );
>      }
> 
> The proposed getResult() results in a full HashSet copy for each change of
> Rete, that performance hit would be unbearable for most people.
> 
Well, I did note the concern in my original post, but I think it's a concern
that really should be tested. If it were "unbearable", then presumably no one
would ever use the current CollectListAccumulateFunction implementation,
because it does an O(N) operation for each reverse() on the list, not just
in the getResult() call at the end, which is presumably called no more
often than reverse().

I suspect that any performance degradation in your tests would be within the
error bar of the tests. I put together a quick JUnit test:

    @Test
    public void testHashSet() {
        Map<Integer, Integer> map = new IdentityHashMap<Integer,Integer>();
        for (int i = 0; i < 100; i++) {
            map.put(i, i);
        }
        long startMillis = System.currentTimeMillis();
        for (int i = 0; i < 10000; i++) {
            Set<Integer> set = new HashSet<Integer>(map.keySet());
        }
        long endMillis = System.currentTimeMillis();
        System.out.println(endMillis - startMillis);
    }

On my machine, running 10,000 new HashSet() operations on the key set of an
IdentityHashMap with 100 keys takes about 100 milliseconds, or roughly 10
microseconds per copy. I don't have much familiarity with the Drools test
cases, the average customer's rule base, or the prevalence in those rule
bases of collectSet(). But that sounds like a small performance concern to me.
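
For anyone who wants to repeat the measurement outside JUnit, here is a slightly more careful sketch of the same idea. It is not Drools code: the class name CopyCostSketch and its helper methods are made up for illustration, and snapshot() only imitates the shape of the proposed getResult(). It adds a warm-up pass, uses System.nanoTime(), and consumes each copy's size so the JIT is less likely to discard the work. Treat the numbers it prints as a rough indication only.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-alone sketch; none of these names come from Drools.
public class CopyCostSketch {

    // Builds an identity-keyed map holding the requested number of distinct keys.
    static Map<Object, Integer> buildMap(int keys) {
        Map<Object, Integer> map = new IdentityHashMap<Object, Integer>();
        for (int i = 0; i < keys; i++) {
            map.put(new Object(), 1);
        }
        return map;
    }

    // Imitates the proposed getResult(): copy the key set, then wrap it read-only.
    static Set<Object> snapshot(Map<Object, Integer> map) {
        return Collections.unmodifiableSet(new HashSet<Object>(map.keySet()));
    }

    // Returns the average time per snapshot, in nanoseconds.
    static double averageNanosPerCopy(Map<Object, Integer> map, int iterations) {
        long sink = 0; // accumulate sizes so the copies are actually used
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += snapshot(map).size();
        }
        long elapsed = System.nanoTime() - start;
        if (sink != (long) iterations * map.size()) {
            throw new IllegalStateException("unexpected snapshot size");
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        Map<Object, Integer> map = buildMap(100);
        averageNanosPerCopy(map, 10000); // warm-up pass, result ignored
        System.out.printf("%.1f ns per copy%n", averageNanosPerCopy(map, 10000));
    }
}
```

One side effect of copying is that the returned set is a snapshot: it does not change if the underlying map is modified afterwards.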

--
View this message in context: http://drools.46999.n3.nabble.com/BUG-5-3-0-Final-CollectSetAccumulateFunction-should-probably-use-IdentityHashMap-internally-tp3774079p3784578.html

