If your 200K rules all follow the same few patterns (equality tests on
1, 2 or 3 attributes), then expanding them into one rule per combination
may not be the way to go.

Can you provide a little more detail about whether the 1/2/3 attributes are
always from the same fact type, and how many of these 1-2-3-combinations
there are?

Also, combinations of rules and (hash) lookup tables might reduce the
number of rules by several orders of magnitude.
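
To make the lookup-table idea concrete, here is a minimal Java sketch,
assuming each rule boils down to up to three optional equality tests on
a Transaction. The class, field and method names (RuleLookupSketch,
RuleRow, match) are made up for illustration and are not Drools API; the
sample values come from the rule quoted below. Each former rule becomes
a data row, rows are hashed on someId, and a handful of generic rules
(or plain Java) could consult the table instead of the engine compiling
200K separate rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: rule conditions kept as data rows in a hash index. */
class RuleLookupSketch {

    /** One former rule as data; a null field means "no constraint on this attribute". */
    static final class RuleRow {
        final String name;
        final Integer someId;
        final String someOtherId;
        final String requiredMember;

        RuleRow(String name, Integer someId, String someOtherId, String requiredMember) {
            this.name = name;
            this.someId = someId;
            this.someOtherId = someOtherId;
            this.requiredMember = requiredMember;
        }

        /** Checks every constraint that this row actually carries. */
        boolean matches(int id, String otherId, Collection<String> members) {
            return (someId == null || someId == id)
                && (someOtherId == null || someOtherId.equals(otherId))
                && (requiredMember == null || members.contains(requiredMember));
        }
    }

    // Rows that constrain someId are hashed on it; the rest go to a fallback list.
    private final Map<Integer, List<RuleRow>> byId = new HashMap<>();
    private final List<RuleRow> withoutId = new ArrayList<>();

    void add(RuleRow row) {
        if (row.someId == null) {
            withoutId.add(row);
        } else {
            byId.computeIfAbsent(row.someId, k -> new ArrayList<>()).add(row);
        }
    }

    /** Returns the names of all rows matching the given transaction attributes. */
    List<String> match(int id, String otherId, Collection<String> members) {
        List<String> hits = new ArrayList<>();
        for (RuleRow row : byId.getOrDefault(id, Collections.<RuleRow>emptyList())) {
            if (row.matches(id, otherId, members)) {
                hits.add(row.name);
            }
        }
        for (RuleRow row : withoutId) {
            if (row.matches(id, otherId, members)) {
                hits.add(row.name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        RuleLookupSketch table = new RuleLookupSketch();
        table.add(new RuleRow("00000005 - random rule",
                35156, "79F81FB8134A129F", "EC3F2A1DCA88"));
        // A second, invented row that constrains only one attribute.
        table.add(new RuleRow("hypothetical one-attribute rule",
                null, "79F81FB8134A129F", null));

        for (String name : table.match(35156, "79F81FB8134A129F",
                Arrays.asList("EC3F2A1DCA88"))) {
            System.out.println("match rule " + name);
        }
    }
}
```

Whether this fits depends on the answers to the questions above, in
particular whether all attributes come from the same fact type.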

-W


2009/9/3 Adam Sussman <adam.sussman@ticketmaster.com>

I am hoping that I am doing something wrong here and that one of you can
point me in the right direction.

Can anyone provide some advice on scaling up the number of rules in a
single KnowledgeBase?  While I have seen all sorts of reports on having
lots of facts, I have not seen anything about having lots of rules.

I need to get to about 200K rules in a single KnowledgeBase, and also
to run several of these side by side in the same system.

The problem:

As the number of rules increases, the time to compile them and load
them into memory skyrockets.  While I realize that the Rete algorithm's
complexity depends on the number of rules, the times I am seeing are
pretty scary.  Also, at about 30k rules, things just fall apart.

On a 64-bit Linux OS with two 2.4 GHz processors, using a 64-bit JVM
from OpenJDK (1.6.1) with 1 GB of memory allocated to the JVM, loading
from .drl files:

1000 rules:
    KnowledgeBuilder.add:                   7 seconds
    KnowledgeBase.addKnowledgePackages:     0.8 seconds

10000 rules:
    KnowledgeBuilder.add:                   79 seconds
    KnowledgeBase.addKnowledgePackages:     23 seconds

15000 rules:
    KnowledgeBuilder.add:                   138 seconds
    KnowledgeBase.addKnowledgePackages:      55 seconds

20000 rules:
    KnowledgeBuilder.add:                   488 seconds
    KnowledgeBase.addKnowledgePackages:     100 seconds

30000 rules:
    KnowledgeBuilder.add:                   out of memory
    KnowledgeBase.addKnowledgePackages:     never runs

At this rate, 200k rules will take 13-14 hours to compile
and 2-3 hours to load into RAM, assuming I can even get
to that many rules.  This just is not usable.
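
One assumption that reproduces these estimates is that both times grow
with the square of the rule count, scaled up from the 20,000-rule
measurement: that gives about 13.6 hours to compile and about 2.8 hours
to load. The sketch below only spells out that arithmetic; the class
and method names are made up for illustration, and the quadratic model
is a guess, not something measured.

```java
/** Hypothetical helper: extrapolates a timing under a quadratic-growth assumption. */
class CompileTimeExtrapolation {

    /** Scales a measured time assuming it grows with the square of the rule count. */
    static double quadraticEstimateSeconds(double measuredSeconds,
                                           int measuredRules,
                                           int targetRules) {
        double ratio = (double) targetRules / measuredRules;
        return measuredSeconds * ratio * ratio;
    }

    public static void main(String[] args) {
        // Inputs are the 20,000-rule measurements listed above.
        double compile = quadraticEstimateSeconds(488, 20000, 200000);
        double load = quadraticEstimateSeconds(100, 20000, 200000);
        System.out.println("compile estimate (hours): " + compile / 3600.0);
        System.out.println("load estimate (hours):    " + load / 3600.0);
    }
}
```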

Time to fire all rules is negligible (fortunately!).

The rules I am testing with are very simple: a when clause with 1-3
equality tests and a then clause that is a single System.out.println.

The benchmark code I am running is as follows:

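    KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase();
    KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();

    // Compile step: parse and compile the .drl file (timed as KnowledgeBuilder.add)
    kbuilder.add( ResourceFactory.newClassPathResource( drlFile, RuleRunner.class ), ResourceType.DRL );

    // Load step: add the compiled packages to the knowledge base
    // (timed as KnowledgeBase.addKnowledgePackages)
    Collection<KnowledgePackage> pkgs = kbuilder.getKnowledgePackages();
    kbase.addKnowledgePackages( pkgs );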


Sample rule:

    rule "00000005 - random rule"
            when
                Transaction(someId == 35156 && someOtherId == '79F81FB8134A129F' && someCollection contains 'EC3F2A1DCA88')
            then
                System.out.println("match rule 00000005 - random rule");
    end

Any help would be appreciated.

Regards,

Adam Sussman

_______________________________________________
rules-users mailing list
rules-users@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users