Mark Proctor wrote
The getResult() method would probably have to change from...
    public Object getResult(Serializable context) throws Exception {
        CollectListData data = (CollectListData) context;
        return Collections.unmodifiableSet( data.map.keySet() );
    }
...to...
    public Object getResult(Serializable context) throws Exception {
        CollectListData data = (CollectListData) context;
        return Collections.unmodifiableSet( new HashSet(data.map.keySet()) );
    }
The proposed getResult() results in a full HashSet copy for each change of
Rete; that performance hit would be unbearable for most people.
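For readers following the thread, the trade-off between the two versions is a live view versus a snapshot copy. Here is a minimal sketch of the difference, using a plain IdentityHashMap as a stand-in for data.map (whose real type is not shown in this message); the class and method names are hypothetical and not part of Drools:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not Drools code.
class ViewVsSnapshot {

    // Returns { size of the live view, size of the snapshot } after the
    // backing map has been modified.
    static int[] sizesAfterChange() {
        // Assumption: stands in for data.map in the accumulate context.
        Map<Object, Object> map = new IdentityHashMap<Object, Object>();
        Object a = new Object();
        Object b = new Object();
        map.put(a, a);
        map.put(b, b);

        // First version: read-only wrapper around the live key set.
        Set<Object> view = Collections.unmodifiableSet(map.keySet());

        // Second version: read-only wrapper around a copy of the key set.
        Set<Object> snapshot =
                Collections.unmodifiableSet(new HashSet<Object>(map.keySet()));

        // Stands in for a later change to the underlying map.
        map.remove(a);

        return new int[] { view.size(), snapshot.size() };
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterChange();
        System.out.println("view=" + sizes[0] + " snapshot=" + sizes[1]);
    }
}
```

After the removal, the view reports one element and the snapshot still reports two. The copy buys a stable result at the price of an O(N) copy per call, which is the cost being debated here.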
Well, I did not address that concern in my original post, but I think it's
one that really should be tested. If it were "unbearable", then presumably no
one would ever use the current CollectListAccumulateFunction implementation,
because it does an O(N) operation for each reverse() on the list, not just
in the getResult() call at the end, which is presumably called no more
often than reverse().
I suspect that any performance degradation in your tests would be within the
error bar of the tests. I put together a quick JUnit:
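    import java.util.HashSet;
    import java.util.IdentityHashMap;
    import java.util.Map;
    import java.util.Set;

    import org.junit.Test;

    public class HashSetCopyTest {
        @Test
        public void testHashSet() {
            // Build a 100-key identity map to copy from.
            Map<Integer, Integer> map = new IdentityHashMap<Integer, Integer>();
            for (int i = 0; i < 100; i++) {
                map.put(i, i);
            }
            // Time 10,000 full copies of the key set into a new HashSet.
            long startMillis = System.currentTimeMillis();
            for (int i = 0; i < 10000; i++) {
                Set<Integer> set = new HashSet<Integer>(map.keySet());
            }
            long endMillis = System.currentTimeMillis();
            System.out.println(endMillis - startMillis);
        }
    }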
On my machine, running 10,000 new HashSet() operations on the key set of an
IdentityHashMap with 100 keys takes about 100 milliseconds in total, or
roughly 10 microseconds per copy. I don't have much familiarity with the
Drools test cases, the average customer's rule base, or the prevalence of
collectSet() in those rule bases, but that sounds like a small performance
concern to me.
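A single cold timing loop like the one above can be skewed by JIT warm-up and by the JVM discarding unused results. Here is a slightly more careful sketch that warms up first, uses System.nanoTime(), and consumes each result so the work is not optimized away. It is still only a rough micro-benchmark, not a substitute for running the Drools tests; the class and method names are hypothetical:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical micro-benchmark sketch; not part of Drools.
class CopyCostSketch {

    // Builds an identity map with the given number of distinct keys.
    static Map<Integer, Integer> buildMap(int keys) {
        Map<Integer, Integer> map = new IdentityHashMap<Integer, Integer>();
        for (int i = 0; i < keys; i++) {
            map.put(i, i);
        }
        return map;
    }

    // Times wrapping a full HashSet copy of the key set; returns nanoseconds.
    static long timeCopies(Map<Integer, Integer> map, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Set<Integer> set =
                    Collections.unmodifiableSet(new HashSet<Integer>(map.keySet()));
            sink += set.size(); // use the result so the copy is not skipped
        }
        long elapsed = System.nanoTime() - start;
        if (sink != (long) iterations * map.size()) {
            throw new IllegalStateException("unexpected element count");
        }
        return elapsed;
    }

    // Times wrapping the live key set without copying; returns nanoseconds.
    static long timeViews(Map<Integer, Integer> map, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Set<Integer> set = Collections.unmodifiableSet(map.keySet());
            sink += set.size();
        }
        long elapsed = System.nanoTime() - start;
        if (sink != (long) iterations * map.size()) {
            throw new IllegalStateException("unexpected element count");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = buildMap(100);
        // Warm-up passes so the JIT has compiled both loops before timing.
        timeCopies(map, 10000);
        timeViews(map, 10000);
        System.out.println("copy ns/op: " + timeCopies(map, 10000) / 10000.0);
        System.out.println("view ns/op: " + timeViews(map, 10000) / 10000.0);
    }
}
```

Comparing the two per-operation figures gives the incremental cost of the copy for a 100-key set. Varying the argument to buildMap() shows how that cost scales with the size of the collected set.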