<div dir="ltr"><div style><br></div><div style>The previous rule looked something like this (simplified version):<br></div><div><br></div>rule "Poor Performing Rule"<div>agenda-group "Summarize"<br><div style>
when</div><div style> $a : A( …, $category : category )<br></div><div style> $all : List() from collect( A( category == $category ) )</div><div style>then</div><div style> modify( $a ) { updateSummary( …, $all ) }</div>
<div style><br></div><div style>In this case "collect" is used to generate a list of all A facts with the same category, which "updateSummary" needs. This performed poorly because A facts are inserted and modified frequently in earlier agenda-groups. I now appreciate that agenda-groups only affect when a rule fires, not when its conditions are evaluated! Since facts that match $a change frequently, the collect( … ) was re-evaluated a lot and caused poor performance. <br>
</div><div style><br></div><div style>The ruleset I am working on has a concept of grouping facts into "buckets" of other facts with similar qualities. So I added a new rule to keep track of A facts:<br></div><div style>
<br></div><div style>rule "Add A Facts into Bucket by Category"</div><div style>agenda-group "GroupThatComesBeforeSummarize"<br></div><div style>no-loop</div><div style><div>when</div><div> $a : A( …, $category : category )</div>
<div style> $bucket : Bucket( … )</div><div>then</div><div> modify( $bucket ) { add( $category, $a ) }</div><div><br></div><div style>Then I modified the original rule to use this Bucket( … ) instead of collect( … ):</div>
<div><br></div><div><div>rule "Better Performing Rule"</div><div>agenda-group "Summarize"<br></div><div>when<br></div><div><div> $a : A( …, $category : category )</div><div> $bucket : Bucket( … )</div>
<div>then</div><div> modify( $a ) { updateSummary( …, $bucket.getAll( $category ) ) }</div></div></div><div><br></div><div style>The two rules need to be in different agenda-groups to ensure all of the A(…) objects are added to the bucket before the bucket is used to generate summaries. </div>
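<div style><br></div><div style>For anyone who wants to try the same approach, here is a minimal Java sketch of what a Bucket fact could look like. Only the names Bucket, add and getAll come from the rules above; the generic parameters, the map-of-sets storage and the duplicate handling are assumptions made for illustration, not the actual class from the ruleset.</div>

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical Bucket fact: groups facts by category so a rule can look
 * them up directly instead of rebuilding a list with collect().
 * K is the category type and V the fact type (both are assumptions).
 */
final class Bucket<K, V> {
    // One insertion-ordered set per category; a set keeps repeated add() calls harmless.
    private final Map<K, Set<V>> byCategory = new HashMap<>();

    /** Registers a fact under its category; adding the same fact twice has no effect. */
    void add(K category, V fact) {
        byCategory.computeIfAbsent(category, k -> new LinkedHashSet<>()).add(fact);
    }

    /** Returns a read-only view of all facts in a category (empty if the category is unknown). */
    Set<V> getAll(K category) {
        Set<V> facts = byCategory.get(category);
        return facts == null
                ? Collections.<V>emptySet()
                : Collections.unmodifiableSet(facts);
    }
}
```

<div style>Note that this sketch does not override hashCode(), so a Bucket keeps the constant-time identity hash inherited from Object, unlike a java.util.List, whose hashCode() visits every element.</div>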
<div style><br></div><div style><br></div><div style>Hope this helps someone.</div><div style><br></div><div style>Ryan</div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Jan 15, 2013 at 11:03 AM, Geoffrey De Smet <span dir="ltr"><<a href="mailto:ge0ffrey.spam@gmail.com" target="_blank">ge0ffrey.spam@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Interesting!<br>
<br>
Do you have a code example showing what your code looked like before and
after these changes?<br>
<br>
<div>On 15-01-13 16:27, Ryan Crumley
wrote:<br>
</div><div><div class="h5">
<blockquote type="cite">
<div dir="ltr">Update:
<div><br>
</div>
<div>Turns out I was looking in the wrong place. All
along I had been looking for a rule with accumulate based on
AccumulateNode showing up in the stack trace. By setting a
conditional breakpoint in AccumulateNode that only breaks when
the result is AbstractList (I knew AbstractList.hashCode was
causing the performance problem) I was able to find the rule
that caused the problem...</div>
<div><br>
</div>
<div>The problem rule actually used "collect", not
"accumulate"! It appears that "collect" is implemented using
"accumulate" under the covers. This explains why I had trouble
narrowing down my search.</div>
<div><br>
</div>
<div>In the problem rule "collect" was the last
condition in WHEN. Additionally the first condition matches a
fact that is not inserted until late in rule processing. It
was clear from my investigation the work associated with
"collect" was happening much more frequently than I had
expected (maybe even as often as every fact inserted). <br>
</div>
<div><br>
</div>
<div>Once I found the problem rule it was
straightforward to accomplish the same effect without a
"collect". In the heavy workload use case (hundreds of
thousands of facts) this resulted in a 99% performance
improvement. <br>
</div>
<div><br>
</div>
<div>I have not had the chance to create a simple rule
set to reproduce the problem so I can understand WHY this rule
was so detrimental to performance. For now I am happy
to report to my team that the performance issue is fixed and that
we should be very careful when using collect and accumulate in the future. </div>
<div><br>
</div>
<div>Thanks for the pointers.</div>
<div> </div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Wed, Jan 9, 2013 at 10:38 AM, Ryan
Crumley <span dir="ltr"><<a href="mailto:crumley@gmail.com" target="_blank">crumley@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Thanks Geoffrey.
<div><br>
</div>
<div>I have a few rules that have two accumulates.
Upgrading to 5.5 shouldn't be a problem so I will give
that a try and see if it helps. Thanks for the tip!</div>
</div>
<div>
<div>
<div class="gmail_extra">
<br>
<br>
<div class="gmail_quote">On Wed, Jan 9, 2013 at 8:37
AM, Geoffrey De Smet <span dir="ltr"><<a href="mailto:ge0ffrey.spam@gmail.com" target="_blank">ge0ffrey.spam@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> I've also
seen that accumulate and stateful sessions don't always
mix as well as they could.<br>
The new algorithm for 6.0 sounds promising to
improve this (with "set based propagation").<br>
<br>
Do you have any rule that has 2 accumulates in 1
rule?<br>
That used to kill my stateful performance (3
times slower etc), but a recent experiment with
5.5 showed that this is no longer the case IIRC.<br>
<br>
<div>On 09-01-13 15:09, Ryan Crumley wrote:<br>
</div>
<blockquote type="cite">
<div>
<div>
<div dir="ltr">
<div>Hi,</div>
<div><br>
</div>
<div>I am investigating performance of a
Drools 5.4 stateful knowledge session.
This session has about 200 rules, 200k
facts and takes about 1 hour to run to
completion. Looking at the profile
there is a hotspot that consumes
almost 65% of the cpu time:
java.util.AbstractList.hashCode().</div>
<div><br>
</div>
<div>Here is the full stack: </div>
<div><br>
</div>
<div>
com.company.rules.engine.Rule_Set_weights_08b44ce519a74b58ab3f85735b2987cbDefaultConsequenceInvoker.evaluate(KnowledgeHelper,
WorkingMemory)</div>
<div>
com.company.rules.engine.Rule_Set_weights_08b44ce519a74b58ab3f85735b2987cbDefaultConsequenceInvokerGenerated.evaluate(KnowledgeHelper,
WorkingMemory)</div>
<div>
com.company.rules.engine.Rule_Set_weights_08b44ce519a74b58ab3f85735b2987cb.defaultConsequence(KnowledgeHelper,
List, FactHandle, GradingFact,
FactHandle, ReportNode, FactHandle,
WeightsHolder, FactHandle, Logger)</div>
<div>
org.drools.base.DefaultKnowledgeHelper.update(FactHandle,
long)</div>
<div>
org.drools.common.NamedEntryPoint.update(FactHandle,
Object, long, Activation)</div>
<div>
org.drools.common.NamedEntryPoint.update(FactHandle,
Object, long, Activation)</div>
<div>
org.drools.common.PropagationContextImpl.evaluateActionQueue(InternalWorkingMemory)</div>
<div>
org.drools.reteoo.ReteooWorkingMemory$EvaluateResultConstraints.execute(InternalWorkingMemory)</div>
<div>
org.drools.reteoo.AccumulateNode.evaluateResultConstraints(AccumulateNode$ActivitySource,
LeftTuple, PropagationContext,
InternalWorkingMemory,
AccumulateNode$AccumulateMemory,
AccumulateNode$AccumulateContext,
boolean)</div>
<div>
org.drools.common.DefaultFactHandle.setObject(Object)</div>
<div>
java.util.AbstractList.hashCode()</div>
<div><br>
</div>
<div>I believe the following clues can
be extracted:</div>
<div><br>
</div>
<div>- "Rule_Set_weights" was fired and
a fact was modified (confirmed by
examining the rule definition)</div>
<div>- The fact modification caused the
pre-conditions for other rules to be
computed.</div>
<div>- One of these rules has an
accumulate condition that accumulates
into an AbstractList.</div>
<div>- This list is very large. So
large that looping through the
elements in the list and aggregating
the hashCode of individual elements
dominates execution time (the
individual element hashCode doesn't
even show up in the profile… either
it's very fast or maybe it's the identity
hashCode, which the profiler might
filter?). </div>
<div>- Accumulate is either working on a
large set of data or the same
accumulate is evaluated many many
times. </div>
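<div><br></div>
<div>The cost described in the list above can be checked with a small Java sketch. The class and method names (ListHashCost, Counted, elementHashCalls) are made up for illustration; the only thing it relies on is that a java.util.List computes its hashCode() from the hashCode() of every element, as the List contract specifies.</div>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that counts the element hashCode() calls made by List.hashCode(). */
final class ListHashCost {

    /** Element that records how often its hashCode() is invoked. */
    static final class Counted {
        static int calls = 0;

        @Override
        public int hashCode() {
            calls++;
            return 1;
        }
    }

    /** Builds a list of the given size and returns the element hashCode() calls caused by one list.hashCode(). */
    static int elementHashCalls(int size) {
        List<Counted> list = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            list.add(new Counted());
        }
        Counted.calls = 0;   // ignore anything that happened while filling the list
        list.hashCode();     // one hash of the whole list
        return Counted.calls;
    }
}
```

<div>Each call to hashCode() on the list touches every element once, so hashing a list of 200k facts on every fact update is linear work each time, whatever the element hashCode itself costs.</div>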
<div><br>
</div>
<div>Is my analysis correct? Are there
clues that I am missing? </div>
<div><br>
</div>
<div> I have 15 rules that use
accumulate… However none accumulate
with a result of List. Most accumulate
using sum() and count() (result of
Number). A few use collectSet(). A few
more aggregate into a result with a
custom type.</div>
<div><br>
</div>
<div>A few other notes: </div>
<div>- All accumulate conditions are the
last condition in the WHEN clause. </div>
<div>- I use agenda groups to separate
fact processing into phases. Rules
that accumulate are in a separate
agenda group from rules that
modify/insert facts that are used in
accumulation. I hope this prevents the
accumulate condition from being
evaluated until all the rules that
modify the facts accumulate needs are
done firing. I suspect this may not be
working as I expect. I haven't put
together an example to investigate. </div>
<div>- When accumulating into a set, the
rule condition looks like this:</div>
<div> $factName : Set() from
accumulate( FactMatch( $field : field
), collectSet( $field ) )</div>
<div><br>
</div>
<div>How can I narrow this down
further? </div>
<div><br>
</div>
<div>Are there any general rules to
follow to optimize use of accumulate
in conditions? </div>
<div><br>
</div>
<div>Thanks,</div>
<div><br>
</div>
<div>Ryan</div>
</div>
<br>
<br>
</div>
</div>
<pre>_______________________________________________
rules-users mailing list
<a href="mailto:rules-users@lists.jboss.org" target="_blank">rules-users@lists.jboss.org</a>
<a href="https://lists.jboss.org/mailman/listinfo/rules-users" target="_blank">https://lists.jboss.org/mailman/listinfo/rules-users</a>
</pre>
</blockquote>
<br>
</div>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</div></div></div>
<br></blockquote></div><br></div></div>