Re: [rules-users] Performance scaling with fact insertion

Thursday, 9 February 2012

You realize that the modify, by negating part of the condition, must result
in an immediate retraction of the logical insertion?

During the runs: were there any other rules besides the one you have shown,
especially rules with patterns using AnomalyFact or DataPoint?

AnomalyType =/= AnomalyEnum

-W

On 08/02/2012, St. Lawrence, Zachary <zstlawre(a)akamai.com> wrote:
...
 I was running some performance analysis on my rules so I could tune in hopes
 of scaling up from hundreds of thousands of datapoints to millions of
 entries and noticed some alarming performance scaling issues when rules are
 firing and inserting dependent objects.  At first I thought it was one of my
 more complex rules but when I broke down my examples to the simplest
 possible test cases I found that the central problem was that performance
 was not linear with number of facts inserted -- it was polynomial (roughly
 n^2.5).

 My simplest test example was basically each fact inserted initially would
 trigger a new logical relationship fact to be inserted.  No fancy logic.
 Also it could not trigger itself recursively.  I tried three approaches:
 logical inserts of new facts, manual inserts/retraction, and update of the
 current fact with new data.

 Updates on current fact were roughly linear assuming I prevented re-firing.
 Manual Inserts and retracts scaled with O(n^2.5)
 LogicalInserts were O(n^3)

 The documentation suggests insertion of dependent facts to simplify logic.
 But when dealing with tens of thousands of events triggering it did not
 scale sufficiently.  Can anyone tell me what I may have done wrong?  I have
 included rules and data below.

 This was on 5.3 using fireAllRules.  The time measured was rule execution
 only and excluded times due to my testing framework.

 // SIMPLIFIED RULES FILE
 declare DataPoint
     anomaly : boolean
     entityId : String
     average : double
     predict : double
     predictDev : double
     timestamp : long
 end

 declare AnomalyFact
     entityId : String
     anomalyType : AnomalyType
     ruleName : String
     timestamp : long
 end

 rule "Traffic Below 4 Deviations"
     no-loop
     when
         $datapoint : DataPoint (
             anomaly == false,
             average < (predict - (predictDev * 4))
         )
     then
         // update self only. This line was always active.
         modify($datapoint) {setAnomaly(true)};

         // on logicalInsert test I commented in the following line
         //     insertLogical( new AnomalyFact ( $datapoint.getEntityId(),
         //         AnomalyEnum.LOW, drools.getRule().getName(), $datapoint.getTimestamp()));

         // on insert test I commented in the following line and added another rule
         // to retract any anomaly with no datapoint
         //     insert( new AnomalyFact ( $datapoint.getEntityId(), AnomalyEnum.LOW,
         //         drools.getRule().getName(), $datapoint.getTimestamp()));
 end

 /* TEST RESULT CONTAINED BELOW
 All data inserted triggered the rule.  I was load testing for worst case
 behavior.

 logicalInsert
 n        run1     run2     run3
 5000     1293     1232     1767
 10000    9458     8210     10079
 15000    24389    31050    24311
 30000    134936

 manual insert/retract
 n        run1     run2     run3
 5000     1137     1094     1169
 10000    8090     5862     5616
 15000    16741    16728    15874
 30000    76034

 update on current object only
 n        run1     run2     run3     run4     run5
 5000     417      444      413
 10000    600
 15000    712      697      688
 30000    999      970      1041     1016     1163
 60000    1420
 100000   1930

 Thanks for any help!
 */
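
The scaling exponents quoted above can be checked against the posted timings with a least-squares fit on a log-log scale. The following is only a rough sketch, not part of the original test setup: the names `fit_exponent`, `LOGICAL_INSERT`, `MANUAL_INSERT` and `UPDATE_ONLY` are made up for illustration, the numbers are copied from the tables above in whatever unit they were reported, and repeated runs are simply averaged. With so few sizes per series, the fitted value is at best an indication.

```python
import math

# Timings copied from the test results above (unit as reported in the post).
# Keys are the number of facts n; values are the recorded runs for that n.
LOGICAL_INSERT = {
    5000: [1293, 1232, 1767],
    10000: [9458, 8210, 10079],
    15000: [24389, 31050, 24311],
    30000: [134936],
}
MANUAL_INSERT = {
    5000: [1137, 1094, 1169],
    10000: [8090, 5862, 5616],
    15000: [16741, 16728, 15874],
    30000: [76034],
}
UPDATE_ONLY = {
    5000: [417, 444, 413],
    10000: [600],
    15000: [712, 697, 688],
    30000: [999, 970, 1041, 1016, 1163],
    60000: [1420],
    100000: [1930],
}


def fit_exponent(samples):
    """Estimate k in time ~ c * n^k.

    Fits a straight line to (log n, log mean_time) by ordinary least
    squares and returns its slope.
    """
    xs = [math.log(n) for n in samples]
    ys = [math.log(sum(runs) / len(runs)) for runs in samples.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    for name, data in [
        ("logicalInsert", LOGICAL_INSERT),
        ("manual insert/retract", MANUAL_INSERT),
        ("update only", UPDATE_ONLY),
    ]:
        print(f"{name}: exponent ~ {fit_exponent(data):.2f}")
```

An exponent near 1 would mean linear scaling. Values well above 2 for the two insert variants would be consistent with the polynomial behaviour described in the post, and can be compared directly with the n^2.5 and n^3 estimates given there.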
