RE: [rules-users] Drools 4 poor performance scaling?

Monday, 30 June 2008

I'm in IRC now.

The non-business-sensitive test case hasn't been maintained.  At this
stage it might be pretty difficult to create one that doesn't have
proprietary information and still behaves anywhere near the same.  We've
got nearly 200 rules and 20 different kinds of facts.  I wonder if a
simple obfuscation would be sufficient?

I did give 5.0M1 a try last week.  Several of our rules wouldn't
compile.  I tried for a day or so to fix things, but then gave up.  We
know it is non-optimal, but we have a few rules with "if" statements in
the RHS and those simply wouldn't compile in 5.0.  I'd like to refactor
those out to at least an "eval" in the LHS, but ideally I'd like to
precompute the statement and store the result in a new fact so that it
could be indexed.
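
A minimal sketch of that precompute idea, in plain Java. The names here
(`Shipment`, `OverweightFlag`, the 70 kg limit) are hypothetical and only
illustrate the shape: the check that used to sit in an RHS "if" is
evaluated once, and its result is inserted as a separate fact that a rule
could match with a simple literal constraint in the LHS.

```java
import java.util.ArrayList;
import java.util.List;

class PrecomputedFactSketch {

    // Hypothetical domain fact; the name and fields are illustrative only.
    static final class Shipment {
        private final String id;
        private final double weightKg;

        Shipment(String id, double weightKg) {
            this.id = id;
            this.weightKg = weightKg;
        }

        public String getId() { return id; }
        public double getWeightKg() { return weightKg; }
    }

    // Derived fact carrying the result of the check that used to be an
    // "if" in the RHS. A rule can then match on it in the LHS.
    static final class OverweightFlag {
        private final String shipmentId;
        private final boolean overweight;

        OverweightFlag(String shipmentId, boolean overweight) {
            this.shipmentId = shipmentId;
            this.overweight = overweight;
        }

        public String getShipmentId() { return shipmentId; }
        public boolean isOverweight() { return overweight; }
    }

    // Assumed threshold, purely for illustration.
    static final double LIMIT_KG = 70.0;

    // Evaluate the condition once, outside the rules.
    static OverweightFlag precompute(Shipment s) {
        return new OverweightFlag(s.getId(), s.getWeightKg() > LIMIT_KG);
    }

    public static void main(String[] args) {
        Shipment shipment = new Shipment("S-1", 82.5);
        List<Object> toInsert = new ArrayList<Object>();
        toInsert.add(shipment);
        toInsert.add(precompute(shipment));
        for (Object fact : toInsert) {
            // With a real session this would be session.insert(fact).
            System.out.println(fact.getClass().getSimpleName());
        }
    }
}
```

The trade-off is one extra fact per source fact, and the derived fact has
to be kept in step if the source fact changes.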

Is 5.0 better for multi-threaded access as we discussed before?  We've
had to wrap all access to working memory in synchronized blocks when
using 4.x.  That's a pretty big hammer, but it works.  Otherwise fact
insertions/retracts, firing of rules and queries end up getting run at
the same time by different threads and working memory ends up completely
unusable.
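
Roughly, the "big hammer" looks like the sketch below. `Session` is a
hypothetical stand-in interface, not the Drools API, and `CountingSession`
is a toy that is deliberately not thread-safe; the point is only that
every call goes through one shared lock.

```java
import java.util.ArrayList;
import java.util.List;

class SynchronizedSessionSketch {

    // Hypothetical stand-in for the few session operations used here.
    interface Session {
        void insert(Object fact);
        void fireAllRules();
    }

    // Wrapper that funnels every call through a single lock, so inserts
    // and rule firing never overlap across threads.
    static final class LockedSession implements Session {
        private final Session delegate;
        private final Object lock = new Object();

        LockedSession(Session delegate) {
            this.delegate = delegate;
        }

        public void insert(Object fact) {
            synchronized (lock) {
                delegate.insert(fact);
            }
        }

        public void fireAllRules() {
            synchronized (lock) {
                delegate.fireAllRules();
            }
        }
    }

    // Toy session backed by a plain ArrayList (not thread-safe on its own).
    static final class CountingSession implements Session {
        private final List<Object> facts = new ArrayList<Object>();

        public void insert(Object fact) { facts.add(fact); }
        public void fireAllRules() { /* nothing to fire in the toy */ }
        int factCount() { return facts.size(); }
    }

    // Inserts from several threads through the locked wrapper and returns
    // how many facts the toy session ended up holding.
    static int runDemo(int threads, final int perThread) {
        final CountingSession raw = new CountingSession();
        final Session session = new LockedSession(raw);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        session.insert(new Object());
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        session.fireAllRules();
        return raw.factCount();
    }

    public static void main(String[] args) {
        System.out.println("facts held: " + runDemo(4, 1000));
    }
}
```

A single lock serializes everything, which is why it is safe but also why
it does nothing for throughput.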

Maybe I'll take another stab at fixing those rules and give 5.0 another
go.  Any target on a 5.0 release date?  We're looking to go live in
production in about 1 month.

-----Original Message-----
From: rules-users-bounces(a)lists.jboss.org
[mailto:rules-users-bounces@lists.jboss.org] On Behalf Of Mark Proctor
Sent: Monday, June 30, 2008 12:39 PM
To: Rules Users List
Subject: Re: [rules-users] Drools 4 poor performance scaling?

Fenderbosch, Eric wrote:
 We are having a similar problem, although our fact count is much
 higher.
 Performance seems pretty good and consistent until about 400k facts,
 then performance degrades significantly.  Part of the degradation is
 from bigger and more frequent GCs, but not all of it.
    If you have multiple CPUs, there is a JVM option that dedicates a CPU
to GC; that helps somewhat.
 Time to load first 100k facts: ~1 min
 Time to load next 100k facts: ~1 min
 Time to load next 100k facts: ~2 min
 Time to load next 100k facts: ~4 min

 This trend continues: going from 600k to 700k facts takes over 7
 minutes.  We're running 4.0.7 on a 4 CPU box with 12 GB, 64 bit RH
 Linux and 64 bit JRockit 5.  We've allocated a 9 GB heap for the VM
 using large pages, so no memory paging is happening.  JRockit is
 started w/ the -XXaggressive parameter, which enables large pages and
 the more efficient hash function in HashMap which was introduced in
 Java5 update 8.
    Other than the dedicated GC CPU, Drools won't take advantage of
multiple CPUs at the moment.
 http://e-docs.bea.com/jrockit/jrdocs/refman/optionXX.html

 The end state is over 700k facts, with the possibility of nearly 1M
 facts in production.  After end state is reached and we issue a few GC
 requests, it looks like our memory per fact is almost 9k, which seems
 quite high as most of the facts are very simple.  Could that be due to
 our liberal use of insertLogical and TMS?
    It could be related to this, especially if you create a long chain of
logical relationships.
 We've tried performing a "commit" every few hundred fact insertions by
 issuing a fireAllRules periodically, and that seems to have helped
 marginally.
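
The periodic "commit" described above could be sketched as follows.
`Session` is a hypothetical stand-in interface rather than the Drools API,
and the batch size is an assumption to be tuned; the loop fires rules
after every full batch and once more for any leftover facts.

```java
class BatchedInsertSketch {

    // Hypothetical stand-in for the few session operations used here.
    interface Session {
        void insert(Object fact);
        void fireAllRules();
    }

    // Session that does nothing, so the sketch runs without a rule engine.
    static final class NoOpSession implements Session {
        public void insert(Object fact) { }
        public void fireAllRules() { }
    }

    // Inserts every fact, calling fireAllRules after each full batch and
    // once at the end if a partial batch remains. Returns the number of
    // fireAllRules calls made.
    static int load(Session session, Iterable<?> facts, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pending = 0;
        int fires = 0;
        for (Object fact : facts) {
            session.insert(fact);
            pending++;
            if (pending == batchSize) {
                session.fireAllRules();
                fires++;
                pending = 0;
            }
        }
        if (pending > 0) {
            session.fireAllRules();
            fires++;
        }
        return fires;
    }

    public static void main(String[] args) {
        java.util.List<String> facts = java.util.Collections.nCopies(1050, "fact");
        int fires = load(new NoOpSession(), facts, 500);
        System.out.println("fireAllRules calls: " + fires);
    }
}
```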

 I tried disabling shadow proxies and a few of our ~390 test cases fail
 and one loops indefinitely.  I'm pretty sure we could fix those, but
 don't want to bother if this isn't a realistic solution.

 Any thoughts?
    Have you tried this on Drools 5.0? It doesn't need shadow proxies and
implements a new Rete algorithm that is faster for retracts. You can get
a nightly build from here; I'd be interested to find out how broken 5.0
is :)
https://hudson.jboss.org/hudson/job/drools/lastSuccessfulBuild/artifact/trunk/target/

We still have more performance work to do. The items are known and it is
just a matter of time, though not all will make 5.0. The main items
include:
1) bytecode compiled Rete network, instead of interpreted nodes. I'm
hoping this will have a large impact, reducing GC and general
indirection and recursive method call frames.
2) "true modify", instead of a retract+assert, which will also remove the
need for the activation normalisation that we do for TMS and the agenda
event model.
3) range indexing (initially literals, but would like to explore
variables too).

Steve, before he left FedEx, was creating a simulator for this use case
with anything business sensitive removed, so that we could use it
publicly as a benchmark and to help us tune the engine. Are you still
working on this? Steve used to chat to us on IRC; can I ask you to pop on
for a chat?
http://labs.jboss.org/drools/irc.html

mark
 Thanks

 Eric

 -----Original Message-----
 From: rules-users-bounces(a)lists.jboss.org
 [mailto:rules-users-bounces@lists.jboss.org] On Behalf Of Ron Kneusel
 Sent: Thursday, June 26, 2008 12:47 PM
 To: rules-users(a)lists.jboss.org
 Subject: [rules-users] Drools 4 poor performance scaling?

 I am testing Drools 4 for our application and while sequential mode is
 very fast I get very poor scaling when I increase the number of facts
 for stateful or stateless sessions.  I want to make sure I'm not doing
 something foolish before deciding on whether or not to use Drools
 because from what I am reading online it should be fast with the
 number of facts I have.

 The scenario:  I have 1000 rules in a DRL file.  They are all of the
 form:

 rule rule0000
     when 
         Data(type == 0, value > 0.185264);
         Data(type == 3, value < 0.198202);
     then 
         insert(new AlarmRaised(0));
         warnings.setAlarm(0, true);
 end

 where the ranges checked on the values and the types are randomly
 generated.  Then, I create a Stateful session and run in a loop timing
 how long it takes the engine to fire all rules as the number of
 inserted facts increases:

         //  Run 
         for(j=0; j < 100; j+=5) {

             if (j==0) {
                 nfacts = 1;
             } else {
                 nfacts = j;
             }

             System.out.println(nfacts + ":");

             //  Get a working memory
             StatefulSession wm = ruleBase.newStatefulSession();

             //  Global - output
             warnings = new Alarm();
             wm.setGlobal("warnings", warnings);

             //  Add facts
             st = (new Date()).getTime();
             for(i=0; i < nfacts; i++) {
                 wm.insert(new Data(rand.nextInt(4), rand.nextDouble()-0.5));
             }
             en = (new Date()).getTime();
             System.out.println("    facts = " + (en-st));

             //  Now run the rules
             st = (new Date()).getTime();
             wm.fireAllRules();
             en = (new Date()).getTime();
             System.out.println("    rules = " + (en-st));

             //  Clean up
             wm.dispose();

             System.out.println("\n");
         }

 This code is based on the HelloWorldExample.java code from the manual 
 and the setup for the rule base is the same as in the manual.  As the 
 number of facts increases runtime increases dramatically:

 facts -- runtime (ms)
 10 -- 168
 20 -- 166
 30 -- 344
 40 -- 587
 50 -- 1215
 60 -- 1931
 70 -- 2262
 80 -- 3000
 90 -- 4754

 with a maximum memory use of about 428 MB RAM.  By contrast, if I use 
 sequential stateless sessions, everything runs in about 1-5 ms.

 Is there something in my set up that would cause this, or is this how
 one would expect Drools to scale?  I read about people using thousands
 of facts so I suspect I'm setting something up incorrectly.

 Any help appreciated!

 Ron

 _______________________________________________
 rules-users mailing list
 rules-users(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/rules-users
