[jboss-svn-commits] JBL Code SVN: r8338 - labs/jbossrules/trunk/documentation/manual/en/Chapter-Performance_Tuning

Thu Dec 14 22:03:31 EST 2006

Author: woolfel
Date: 2006-12-14 22:03:31 -0500 (Thu, 14 Dec 2006)
New Revision: 8338

Modified:
   labs/jbossrules/trunk/documentation/manual/en/Chapter-Performance_Tuning/Section-Performance.xml
Log:
I've added a section on large rulesets and some strategies for addressing the challenge.
peter

Modified: labs/jbossrules/trunk/documentation/manual/en/Chapter-Performance_Tuning/Section-Performance.xml
===================================================================

--- labs/jbossrules/trunk/documentation/manual/en/Chapter-Performance_Tuning/Section-Performance.xml	2006-12-15 02:44:35 UTC (rev 8337)
+++ labs/jbossrules/trunk/documentation/manual/en/Chapter-Performance_Tuning/Section-Performance.xml	2006-12-15 03:03:31 UTC (rev 8338)
@@ -1,4 +1,4 @@
-<?xml version="1.0" encoding="UTF-8"?>
+<?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE section PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
 "http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
 <section>
@@ -159,4 +159,145 @@
     <para>Some other improvements are being developed for Drools in this area
     and will be documented as they become available in future versions.</para>
   </section>
+  
+  <section>
+    <title>Large Ruleset</title>
+    <para>For this section, ruleset sizes are defined as follows:</para>
+    <itemizedlist>
+    <listitem><para>1-500 rules: small ruleset</para></listitem>
+    <listitem><para>501-2,000 rules: medium ruleset</para></listitem>
+    <listitem><para>More than 2,000 rules: large ruleset</para></listitem>
+    <listitem><para>10,000 or more rules: extremely large ruleset</para></listitem>
+    </itemizedlist>
+    <para>There are some cases where a rule engine has to handle 500,000 or 1 million rules.
+    Those are primarily machine learning and AI systems, where a rule engine produces new
+    rules, terms and facts at execution time. Those topics are beyond the scope of the
+    documentation and aren't covered. The techniques described are focused on business rules.</para>
+    <para>The first thing to do is identify why there are so many rules and whether
+    rewriting the rules can solve the problem. There are four questions to ask:</para>
+    <itemizedlist>
+    <listitem><para>Do the rules have a lot of constant values hard coded in the conditions?</para></listitem>
+    <listitem><para>Is the domain model a huge flat spreadsheet with 100+ columns?</para></listitem>
+    <listitem><para>Do most of the rules share the same conditions?</para></listitem>
+    <listitem><para>Can the logic be divided into stages?</para></listitem>
+    </itemizedlist>
+    <para>If you answer yes to any of the four questions, chances are you can solve the issue by
+    changing the rules. Managing 100,000 rules or even 1,000,000 rules is a huge headache, so
+    try to avoid it. Examine the rules and see whether they match any of the following scenarios.</para>
+    <programlisting>
+If
+  customer.account == "abcd"
+  customer.type == "basic"
+  .....
+Then
+  // do something
+    </programlisting>
+    <para>The basic problem with the sample rule above is that most of its values are hard
+    coded. If the average customer has 50 rules and there are 40 million customers, the system has
+    2 billion rules. Let's use a more concrete example to flesh this out.</para>
+    <programlisting>
+If
+  customer.accountId == "peter"
+  customer.type == "level2"
+  customer.favoriteActor == "jackie chan"
+Then
+  recommend movies with jackie chan
+
+If
+  customer.accountId == "peter"
+  customer.type == "level2"
+  customer.favoriteActor == "jet li"
+Then
+  recommend movies with jet li
+    </programlisting>
+    <para>Looking at the example, the first question to ask is "do these kinds of rules apply
+    to all customers?" If they do, the first condition in the rule, "customer.accountId", is
+    pointless. It's pointless because all rules of this type will have that condition.
+    Although the accountId changes, the rule can effectively ignore it. If we rewrite the rules
+    this way, they apply to any level2 customer who likes jackie chan or jet li.</para>
+    <programlisting>
+If
+  customer.type == "level2"
+  customer.favoriteActor == "jackie chan"
+Then
+  recommend movies with jackie chan
+
+If
+  customer.type == "level2"
+  customer.favoriteActor == "jet li"
+Then
+  recommend movies with jet li
+    </programlisting>
+    <para>The reason we do this is straightforward. Rules reason over data. Having a
+    ton of rules with the customer's accountId hard coded doesn't do any good, because we
+    want the rule engine to evaluate only the active sessions. We don't want to load all
+    the customers into the rule engine. We can take it a step further and make the rule more
+    general.</para>
+    <programlisting>
+If
+  customer.type == "level2"
+  customer.accountId ?id // bind the account id to a variable
+  favorites.accountId ?id // find the list of favorites by the account id
+Then
+  recommend all items in the favorites
+    </programlisting>
+    <para>This change can reduce the number of rules significantly. This is one
+    reason the RETE approach is often called a "data-driven" approach. Let's take this example
+    a bit further and define 10 types of customers from level1 to level10. Say we run a mega
+    online store and customers can define their favorites in each of the categories (books,
+    videos, music, toys, electronics, clothing). What happens if a customer has a different
+    level for each category? Using the hard coded approach, one might have to add more rules.
+    If we change the rule and make it more generalized, the same rule can handle multiple
+    categories.</para>
+    <programlisting>
+If
+  recommendation.level ?lvl // bind the recommendation level to a variable
+  recommendation.category ?rcat // bind the recommendation category
+  customer.accountId ?id // bind the account id to a variable
+  favorites.accountId ?id // find the list of favorites by the account id
+  favorites.category ?rcat // match favorite to recommendation category
+  favorites.level ?lvl // match the favorite level to recommendation level
+Then
+  recommend all items in the favorites
+    </programlisting>
+    <para>So what is the cost of making the rule dynamic and data driven? For a small
+    ruleset, hard coding a rule is usually slightly faster than generalizing it, but the
+    performance delta should be small. As the ruleset grows, the generalized form tends to
+    catch up and then pull ahead. Why is that? Let's look at 2 different types of rule engines:
+    procedural and RETE.</para>
+    <para>In a procedural engine, one can build a decision tree and end the evaluation once
+    the data fails to satisfy the conditions at a given level. As the rule count increases,
+    there are more rules the engine has to evaluate. In a procedural approach, the rules have
+    to be sequenced in the optimal order to get the best results. The limitation of sorting
+    the rules in optimal sequence is that in many cases it's not possible to pre-sort. If we use
+    a RETE rule engine, the hard coded rules result in fewer joins for a small number of rules.
+    As the rule count grows, the single generalized rule will perform better. The inequality below
+    estimates the threshold where the generalized form is faster than hard coding the constants.</para>
+    <para>bn = join nodes, lf = left facts, rf = right facts, ae = average number of
+    evaluations descending from the object type node for a random sample, f = facts,
+    hd = hard coded constants in the rules, general = generalized form using joins</para>
+    <para>general( sum( bn(lf * rf) ) + sum(ae * f) ) &lt; hd( sum( bn(lf * rf) ) + sum(ae * f) )</para>
+    <para>The best way to quantify the threshold is to write rules in both formats and run a
+    series of tests. Given that most projects are under tight schedules, developers don't
+    always have time to do this. The other common problem is using really large flat objects.
+    In a nutshell, using large flat objects leads to the same problem as hard coding the
+    constants in the rules. The solution to the problem is to change the domain objects
+    so that they model the business concepts in a concise manner. That isn't always an
+    option.</para>
+    <para>When most of the rules share the same conditions, there are two solutions. The best
+    solution is to rewrite the rules to use chaining. Identify the common conditions and extract
+    them into a generalized rule. The generalized rule then triggers subsequent rules by asserting
+    a new fact. Often this can reduce the rules by an order of magnitude or more. The second
+    option is to put common conditions at the beginning of the rule. This
+    allows RETE rule engines to share those nodes. When the nodes are shared, it reduces the
+    cost from a memory and performance perspective.</para>
+    <para>If the ruleset can be divided into smaller chunks, it's a good idea to divide it into
+    discrete stages and load each ruleset on a different JVM or server. Depending on the
+    situation, this may not be an option. So what can you do when the ruleset is large and
+    rewriting the rules isn't an option?</para>
+    <para>The only viable option is to scale the hardware and use a different JVM. This means
+    using a 64-bit JVM from Sun, IBM or BEA (JRockit) on a machine with at least 8 GB of RAM. Depending
+    on the ruleset, the system may need more RAM.</para>
+    
+  </section>
 </section>
\ No newline at end of file
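
For anyone reading this commit, here is a rough plain-Java sketch of the generalization the new section describes: one data-driven rule that joins a customer to their favorites by accountId, instead of one hard-coded rule per customer and actor. It is not Drools API or DRL. The class and method names (DataDrivenRuleSketch, Customer, Favorite, recommend) and the sample data are made up for illustration, and a real RETE engine would index the join rather than use a nested loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DataDrivenRuleSketch {

    // Hypothetical fact types; the field names mirror the pseudo-rules in the patch.
    static final class Customer {
        final String accountId;
        final String type;
        Customer(String accountId, String type) {
            this.accountId = accountId;
            this.type = type;
        }
    }

    static final class Favorite {
        final String accountId;
        final String actor;
        Favorite(String accountId, String actor) {
            this.accountId = accountId;
            this.actor = actor;
        }
    }

    // One generalized "rule": for every level2 customer in the active sessions,
    // join on accountId and recommend each matching favorite.
    // No account id or actor name is hard coded here.
    static List<String> recommend(List<Customer> activeCustomers, List<Favorite> favorites) {
        List<String> recommendations = new ArrayList<String>();
        for (Customer customer : activeCustomers) {
            if (!"level2".equals(customer.type)) {
                continue; // constant test, comparable to customer.type == "level2"
            }
            for (Favorite favorite : favorites) {
                // join test, comparable to binding ?id on both patterns
                if (favorite.accountId.equals(customer.accountId)) {
                    recommendations.add(customer.accountId
                            + ": recommend movies with " + favorite.actor);
                }
            }
        }
        return recommendations;
    }

    public static void main(String[] args) {
        List<Customer> active = Arrays.asList(
                new Customer("peter", "level2"),
                new Customer("mary", "basic"));
        List<Favorite> favorites = Arrays.asList(
                new Favorite("peter", "jackie chan"),
                new Favorite("peter", "jet li"),
                new Favorite("mary", "bruce lee"));
        for (String line : recommend(active, favorites)) {
            System.out.println(line);
        }
    }
}
```

Adding a customer or a favorite here only adds data; the rule itself stays the same, which is the point of the rewrite in the section.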