[rules-users] Performance scaling with fact insertion

Thu Feb 9 01:57:04 EST 2012

You realize that the modify, which negates part of the condition
(anomaly==false), must result in an immediate retraction of the logical
insertion?

During the runs: were there any other rules besides the one you have shown,
especially rules with patterns using AnomalyFact or DataPoint?

Also, the declared field type is AnomalyType, but the rule passes
AnomalyEnum.LOW: AnomalyType =/= AnomalyEnum

-W

On 08/02/2012, St. Lawrence, Zachary <zstlawre at akamai.com> wrote:
> I was running some performance analysis on my rules so I could tune in hopes
> of scaling up from hundreds of thousands of datapoints to millions of
> entries and noticed some alarming performance scaling issues when rules are
> firing and inserting dependent objects.  At first I thought it was one of my
> more complex rules but when I broke down my examples to the simplest
> possible test cases I found that the central problem was that performance
> was not linear with number of facts inserted -- it was polynomial (roughly
> n^2.5).
>
> My simplest test example was basically each fact inserted initially would
> trigger a new logical relationship fact to be inserted.  No fancy logic.
> Also it could not trigger itself recursively.  I tried three approaches:
> logical inserts of new facts, manual inserts/retraction, and update of the
> current fact with new data.
>
> Updates on current fact were roughly linear assuming I prevented re-firing.
> Manual Inserts and retracts scaled with O(n^2.5)
> LogicalInserts were O(n^3)
>
> The documentation suggests insertion of dependent facts to simplify logic.
> But when dealing with tens of thousands of events triggering it did not
> scale sufficiently.  Can anyone tell me what I may have done wrong?  I have
> included rules and data below.
>
> This was on 5.3 using fireAllRules.  The time measured was rule execution
> only and excluded times due to my testing framework.
>
> // SIMPLIFIED RULES FILE
> declare DataPoint
> anomaly : boolean
> entityId : String
> average : double
> predict : double
> predictDev : double
> timestamp : long
> end
>
> declare AnomalyFact
> entityId : String
> anomalyType : AnomalyType
> ruleName : String
> timestamp : long
> end
>
> rule "Traffic Below 4 Deviations"
> no-loop
>     when
> $datapoint : DataPoint (
> anomaly==false,
> average < (predict - (predictDev * 4))
> )
>     then
> // update self only. This line was always active.
>     modify($datapoint) {setAnomaly(true)};
>
> // on logicalInsert test I commented in the following line
> //     insertLogical( new AnomalyFact ( $datapoint.getEntityId(),
> //         AnomalyEnum.LOW, drools.getRule().getName(), $datapoint.getTimestamp()));
>
>
> // on insert test I commented in the following line and added another rule
> // to retract any anomaly with no datapoint
> //     insert( new AnomalyFact ( $datapoint.getEntityId(), AnomalyEnum.LOW,
> //         drools.getRule().getName(), $datapoint.getTimestamp()));
> end
>
>
> /* TEST RESULT CONTAINED BELOW
> All data inserted triggered the rule.  I was load testing for worst case
> behavior.
>
> logicalInsert
> n run1 run2 run3
> 5000 1293 1232 1767
> 10000 9458 8210 10079
> 15000 24389 31050 24311
> 30000 134936
>
> manual insert/retract
> n run1 run2 run3
> 5000 1137 1094 1169
> 10000 8090 5862 5616
> 15000 16741 16728 15874
> 30000 76034
>
> update on current object only
> n run1 run2 run3
> 5000 417 444 413
> 10000 600
> 15000 712 697 688
> 30000 999 970 1041 1016 1163
> 60000 1420
> 100000 1930
>
> Thanks for any help!
> */
>
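
A rough way to check the exponents quoted above against the posted timings
is a least-squares fit of log(time) against log(n), since time ~ c * n^k
gives a straight line with slope k on log-log axes. The sketch below does
that for the run1 columns only. The class and method names are made up for
illustration, and a fit over four to six points is a crude estimate, not a
proof of any particular complexity.

```java
public class ScalingExponent {

    /**
     * Least-squares slope of ln(t) against ln(n).
     * For timings that follow t ~ c * n^k this approximates k.
     */
    public static double estimate(double[] n, double[] t) {
        if (n.length != t.length || n.length < 2) {
            throw new IllegalArgumentException("need at least two (n, t) pairs");
        }
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n.length; i++) {
            meanX += Math.log(n[i]);
            meanY += Math.log(t[i]);
        }
        meanX /= n.length;
        meanY /= n.length;

        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < n.length; i++) {
            double dx = Math.log(n[i]) - meanX;
            double dy = Math.log(t[i]) - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all n values are identical");
        }
        return sxy / sxx;
    }

    public static void main(String[] args) {
        // run1 columns copied from the test results in the quoted message
        double[] nInsert = {5000, 10000, 15000, 30000};
        double[] logical = {1293, 9458, 24389, 134936};
        double[] manual  = {1137, 8090, 16741, 76034};

        double[] nUpdate = {5000, 10000, 15000, 30000, 60000, 100000};
        double[] update  = {417, 600, 712, 999, 1420, 1930};

        System.out.printf("logicalInsert exponent: %.2f%n", estimate(nInsert, logical));
        System.out.printf("insert/retract exponent: %.2f%n", estimate(nInsert, manual));
        System.out.printf("update-only exponent:    %.2f%n", estimate(nUpdate, update));
    }
}
```

Fitting all rows rather than only the first and last keeps a single noisy
run from dominating the estimate; with more runs per n, the per-row mean or
minimum would be a better input than run1 alone.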