[rules-users] optimization on a lot of simple rules

nesta nesta.fdb at 163.com
Mon Jul 20 21:53:28 EDT 2009


Sorry for my confusing question. Let me clarify it again.

In our system, the number of products keeps growing over time. Each time
there is a new product, we add a new rule, so eventually there will be a lot of
rules. I therefore want to test performance with a large number of rules.
Our performance requirements are strict.

I now realize that the rules and the data model are both very important in a
rule engine. If we build the object model in an OO style, there will be several
classes, such as Product, Subscriber, Service and Decision. With that many
classes, if the "when" part of a rule contains patterns like Product(),
Decision(), Subscriber() and so on, there will be a "cartesian join".
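
The multiplication behind a "cartesian join" can be illustrated in plain Java. This is only a sketch of the combinatorics, not Drools code, and it says nothing about how the Rete network is actually implemented; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CartesianJoinSketch {

    /** Enumerates every (subscriber, service, product) combination, as an unconstrained join would. */
    static List<String> combinations(List<String> subscribers,
                                     List<String> services,
                                     List<Integer> products) {
        List<String> matches = new ArrayList<>();
        for (String subscriber : subscribers) {
            for (String service : services) {
                for (Integer product : products) {
                    matches.add(subscriber + "/" + service + "/" + product);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // One instance of each type: a single combination.
        System.out.println(combinations(
                Arrays.asList("s1"), Arrays.asList("ftp"), Arrays.asList(1)).size());
        // Two subscribers, two services, three products: 2 * 2 * 3 combinations.
        System.out.println(combinations(
                Arrays.asList("s1", "s2"), Arrays.asList("ftp", "http"), Arrays.asList(1, 2, 3)).size());
    }
}
```

With exactly one instance per type the product is 1, so the number of combinations only becomes a concern once more than one instance of a type is in working memory.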

If I define all attributes in one class, such as WholeFact, there is no such
issue. Is that a good practice?
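
A minimal sketch of what such a flattened fact class could look like. The property names follow the commented-out setters in the test code further down (setProduct_id, setUsage, setService_id, setSubscriber_id); the getters and the int types are assumptions.

```java
class WholeFact {
    // All attributes that were spread over Product, Service and Subscriber
    // live in one object, so a rule needs only a single pattern.
    private int product_id;
    private int usage;
    private int service_id;
    private int subscriber_id;

    public int getProduct_id() { return product_id; }
    public void setProduct_id(int product_id) { this.product_id = product_id; }

    public int getUsage() { return usage; }
    public void setUsage(int usage) { this.usage = usage; }

    public int getService_id() { return service_id; }
    public void setService_id(int service_id) { this.service_id = service_id; }

    public int getSubscriber_id() { return subscriber_id; }
    public void setSubscriber_id(int subscriber_id) { this.subscriber_id = subscriber_id; }

    public static void main(String[] args) {
        WholeFact fact = new WholeFact();
        fact.setProduct_id(1);
        fact.setUsage(1);
        fact.setService_id(1);
        fact.setSubscriber_id(1);
        System.out.println(fact.getProduct_id() + " " + fact.getUsage() + " "
                + fact.getService_id() + " " + fact.getSubscriber_id());
    }
}
```

The trade-off is that the separate domain classes disappear from the rules: every constraint is written against this one type, and the class must change whenever any of the original classes gains an attribute.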

In our current system, I am sure there is only one instance each of Subscriber,
Decision and Product. These facts are inserted into working memory and matched
against a lot of rules.

My test code is below; it shows that there is a performance difference.

If I define the rule like:
rule x
when
    Decision()
    Subscriber()
    Product()
then
end

Does that mean that the object model or the rule itself is not good enough?


I use a Drools template to generate the rules. The template is as follows:

template header
id_field
id_value
usage_field
usage_value
service_id_field
service_id_value
subscriber_id_field
subscriber_id_value

package product;

import ttemplate.Product;
import ttemplate.Service;
import ttemplate.Subscriber;

global ttemplate.RatingDecision decision;

template product

rule "product_@{row.rowNumber}"
	when
		Product(@{id_field} == @{id_value}, @{usage_field} == @{usage_value})
		Service(@{service_id_field} == @{service_id_value} )
		Subscriber(@{subscriber_id_field} == @{subscriber_id_value} )	
	then
		decision.setPrice(@{row.rowNumber});
end

end template

Test code:
It tests 10, 200 and 500 rules with the "cartesian join" you mentioned.

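// Imports assumed for Drools 5.x. ParamSet, Product, Service, Subscriber,
// WholeFact and RatingDecision are the poster's own classes (not shown here).
import java.io.InputStream;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.drools.KnowledgeBase;
import org.drools.KnowledgeBaseConfiguration;
import org.drools.KnowledgeBaseFactory;
import org.drools.builder.KnowledgeBuilder;
import org.drools.builder.KnowledgeBuilderFactory;
import org.drools.builder.ResourceType;
import org.drools.io.ResourceFactory;
import org.drools.runtime.StatefulKnowledgeSession;
import org.drools.runtime.rule.FactHandle;
import org.drools.template.ObjectDataCompiler;

public class TemplateExecutor {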
	
	private String drl;
	
	private void expander(int ruleCount) {
		Collection<ParamSet> params = new ArrayList<ParamSet>();
		for (int i = 0; i < ruleCount; i++) {
			ParamSet ps = new ParamSet();
			ps.setId_field("id");
			ps.setId_value(i);
			ps.setUsage_field("usage");
			ps.setUsage_value(i);
			ps.setService_id_field("id");
			ps.setService_id_value(i % 2);
			ps.setSubscriber_id_field("id");
			ps.setSubscriber_id_value(i % 2);
			params.add(ps);
		}
		InputStream is = TemplateExecutor.class.getResourceAsStream("/template.drl.tpl");
		ObjectDataCompiler compiler = new ObjectDataCompiler();
		this.drl = compiler.compile(params, is);
//		System.out.println(this.drl);
	}
	
	public void doExecute(int ruleCount, int runTimes) {
		this.expander(ruleCount);
		
		KnowledgeBuilder builder = KnowledgeBuilderFactory.newKnowledgeBuilder();
		StringReader reader = new StringReader(this.drl);
		builder.add(ResourceFactory.newReaderResource(reader), ResourceType.DRL);
		if (builder.hasErrors()) {
			System.out.println(builder.getErrors());
		}
		KnowledgeBaseConfiguration kbc = KnowledgeBaseFactory.newKnowledgeBaseConfiguration();
		KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase(kbc);
		kbase.addKnowledgePackages(builder.getKnowledgePackages());
		
		long start = System.currentTimeMillis();
		long ssall = 0;
		long sdall = 0;
		long rall = 0;
		
		long sstart = System.currentTimeMillis();
		StatefulKnowledgeSession session = kbase.newStatefulKnowledgeSession();
		ssall += (System.currentTimeMillis() - sstart);
		
		for (int i = 0; i < runTimes; i++) {
			List<FactHandle> handles = new ArrayList<FactHandle>();
		
			
			long rstart = System.currentTimeMillis();
			Product product = new Product(1, 1);
			Service service = new Service(1);
			Subscriber subscriber = new Subscriber(1);
//			WholeFact wfact = new WholeFact();
//			wfact.setProduct_id(1);
//			wfact.setUsage(1);
//			wfact.setService_id(1);
//			wfact.setSubscriber_id(1);
			RatingDecision rd = new RatingDecision();
//			handles.add(session.insert(rd));
			session.setGlobal("decision", rd);
			handles.add(session.insert(product));
			handles.add(session.insert(service));
			handles.add(session.insert(subscriber));
//			handles.add(session.insert(wfact));
			
			session.fireAllRules();
			if (rd.getPrice() != 1) {
				throw new RuntimeException("wrong result");
			}
			rall += (System.currentTimeMillis() - rstart);
			
			for (FactHandle fh : handles) {
				session.retract(fh);
			}
		}
		
		long sdstart = System.currentTimeMillis();
		session.dispose();
		sdall += (System.currentTimeMillis() - sdstart);
		
		System.out.println("rule count -> " + ruleCount 
				+ " elapse -> " + (System.currentTimeMillis() - start) 
				+ " create session -> " + ssall
				+ " dispose session -> " + sdall
				+ " rule fire -> " + rall);
		
	}
	
	/**
	 * @param args
	 */
	public static void main(String[] args) {
		new TemplateExecutor().doExecute(10, 1000);
		System.gc();
		System.out.println("--------- start ----------");
		new TemplateExecutor().doExecute(200, 1000);
		System.gc();
		new TemplateExecutor().doExecute(10, 1000);
		System.out.println("--------------------------");
		System.gc();
		new TemplateExecutor().doExecute(500, 1000);
	}

}
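
For comparison, the Map<ProductKey,Handler> lookup that Greg Barton suggested earlier in this thread (quoted below) could be sketched in plain Java roughly as follows. Only the Handler interface comes from the thread; the shapes of Product, Decision and ProductKey, and the register/decide helpers, are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class HandlerLookupSketch {

    // Simplified stand-ins for the thread's Product and Decision classes (assumed shapes).
    static final class Product {
        final int id;
        final int usage;
        Product(int id, int usage) { this.id = id; this.usage = usage; }
    }

    static final class Decision {
        private int value;
        void setValue(int value) { this.value = value; }
        int getValue() { return value; }
    }

    // Key built from the properties that decide how a Product is handled.
    static final class ProductKey {
        final int id;
        final int usage;
        ProductKey(int id, int usage) { this.id = id; this.usage = usage; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof ProductKey)) return false;
            ProductKey other = (ProductKey) o;
            return id == other.id && usage == other.usage;
        }

        @Override public int hashCode() { return Objects.hash(id, usage); }
    }

    interface Handler {
        void handle(Product product, Decision decision);
    }

    private final Map<ProductKey, Handler> handlers = new HashMap<>();

    void register(int id, int usage, Handler handler) {
        handlers.put(new ProductKey(id, usage), handler);
    }

    /** Looks up the handler for the product's key; returns false if none is registered. */
    boolean decide(Product product, Decision decision) {
        Handler handler = handlers.get(new ProductKey(product.id, product.usage));
        if (handler == null) {
            return false;
        }
        handler.handle(product, decision);
        return true;
    }

    public static void main(String[] args) {
        HandlerLookupSketch lookup = new HandlerLookupSketch();
        // One entry per product, mirroring "rule 1" .. "rule 1000" from the original question.
        for (int id = 1; id <= 1000; id++) {
            final int value = id;
            lookup.register(id, 1, (product, decision) -> decision.setValue(value));
        }
        Decision decision = new Decision();
        lookup.decide(new Product(5, 1), decision);
        System.out.println(decision.getValue());
    }
}
```

A hash lookup like this costs about the same whether there are 10 or 1000 entries, which is why it was suggested for the simple one-product-one-result case. It stops being a good fit once the decision depends on several facts together, which is the case the later messages in the thread move toward.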




Greg Barton wrote:
> 
> 
> 1) Yes, if you eliminate joins in rules, you will have no joins in the
> rete.  This is self evident.
> 
> 2) The way you have the rules structured, there is no relationship between
> the joined objects.  This will cause what's called a "cartesian join"
> where all combinations of all instances of each object type are
> instantiated.  This can be very expensive, memory and CPU wise.  You've
> stated that there is only one instance of each object type in working
> memory, but are you absolutely sure of that?  Cartesian joins can easily
> cause performance problems quite quickly.
> 
> For instance, say you've got these objects in working memory:
> 
> Subscriber(gender="male")
> Subscriber(gender="female")
> Service(name="ftp")
> Service(name="http")
> Product(id=1)
> Product(id=2)
> Product(id=3)
> 
> After inserting a Decision into working memory, the rule will fire 2*2*3
> times.  (#Subscribers * #Services * #Products) This is by design.  Is this
> what you want?
> 
> 3) Do you really need the 'Subscriber(gender == "male" or "female")' term? 
> Why not just 'Subscriber()'?  Are you classifying transgendered or
> nonhuman subscribers in your system?
> 
> --- On Mon, 7/20/09, nesta <nesta.fdb at 163.com> wrote:
> 
>> From: nesta <nesta.fdb at 163.com>
>> Subject: Re: [rules-users] optimization on a lot of simple rules
>> To: rules-users at lists.jboss.org
>> Date: Monday, July 20, 2009, 10:22 AM
>> 
>> I want to test the matching performance of Drools. As I mentioned, there
>> are a lot of rules, and each rule looks like:
>> rule 1
>>     when
>>         Decision()
>>         Subscriber(gender == "male" or "female")
>>         Service(name == "ftp" or "http")
>>         Product(id == 1)
>>         ......
>>     then
>> end
>> 
>> In my tests, the more condition elements there are under "when", the
>> more time the test needs to execute.
>> For example:
>> Location ( location == "home" or "office")
>> and so on.
>> So I worry about the matching performance with drools.
>> 
>> I found that a lot of JoinNodes are executed at runtime. I mean, if
>> there are 1000 rules, there will be a lot of JoinNodes (at least
>> 1000 JoinNodes between Decision and Product), and this
>> does affect execution performance.
>> 
>> As you know, Decision, Product, Service and so on are plain Java classes. If
>> I define all the attributes of the above classes in one class named WholeFact,
>> so that there is only one Java type, the issue mentioned above does not occur.
>> 
>> With WholeFact class, the rule will be changed as follows:
>> rule 1
>>     when
>>         WholeFact(
>>             subscriberGender == "male" or "female",
>>             serviceName == "ftp" or "http",
>>             productId == 1 or 2 or 3 ...
>>         )
>>     then
>> end
>> 
>> 
>> Greg Barton wrote:
>> > 
>> > 
>> > Now this finally rises to something that needs rules.
>> :)  In all of the
>> > previous examples you've given you could just have a
>> > Map<ProductKey,Handler> where the Handler looks
>> like this:
>> > 
>> > interface Handler {
>> >   void handle(Product product, Decision decision);
>> > }
>> > 
>> > ...and the ProductKey consists of properties that uniquely identify how
>> > the Product is handled.  So, on its own, that functionality did not
>> > require rules.
>> > 
>> > However, now that you've introduced more complex
>> decisions, with varying
>> > data, to affect the Decision for each Property type,
>> rules are more
>> > appropriate.
>> > 
>> > Is there any reason why you only have one of each
>> object type in memory at
>> > one time?  Maybe if you state more of the problem
>> requirements we can help
>> > you better.
>> > 
>> > --- On Mon, 7/20/09, nesta <nesta.fdb at 163.com>
>> wrote:
>> > 
>> >> From: nesta <nesta.fdb at 163.com>
>> >> Subject: Re: [rules-users] optimization on a lot
>> of simple rules
>> >> To: rules-users at lists.jboss.org
>> >> Date: Monday, July 20, 2009, 4:14 AM
>> >> 
>> >> Thanks very much.
>> >> But what if every rule has its own algorithm or discount, meaning that the
>> >> result is not related to the Product's id and usage? Then I can't merge all
>> >> rules into one rule. Also, besides the Product and Decision fact types,
>> >> there are more fact types.
>> >> For example:
>> >> rule 1
>> >>     when
>> >>          Decision()
>> >>          Subscriber(gender == "male" or "female")
>> >>          Service(name == "ftp" or "http")
>> >>          Product(id == 1)
>> >>          ......
>> >>     then
>> >>          ......
>> >> end
>> >> rule 2
>> >>     when
>> >>          Decision()
>> >>          Subscriber(gender == "male" or "female")
>> >>          Service(name == "ftp" or "http")
>> >>          Product(id == 2)
>> >>          ......
>> >>     then
>> >>          ......
>> >> end
>> >> 
>> >> .....
>> >> .....
>> >> 
>> >> In this scenario, if there are 1000 rules, there will be a lot of JoinNodes.
>> >> But at runtime, there is only one Decision instance, one Subscriber instance
>> >> and one Service instance.
>> >>
>> >> If I define all data in one fact type, I think that there will not be a lot of
>> >> JoinNodes.
>> >> 
>> >> Is there any other method?
>> >> 
>> >> 
>> >> 
>> >> Wolfgang Laun-2 wrote:
>> >> > 
>> >> > Well, what is the relation between id, usage and the result that's to be
>> >> > stored in a Decision or a global?
>> >> > 
>> >> > Typically, such rules could be written as
>> >> > 
>> >> > rule x
>> >> > no-loop true
>> >> > when
>> >> >     $d : Decision()
>> >> >     $p : Product( id == 1, $usage : usage )
>> >> > then
>> >> >     compute/store value, depending on the formula for id == 1 (using usage)
>> >> > end
>> >> > // similar rule for id == 2,3,...
>> >> > 
>> >> > If value is a straightforward function of id
>> (and
>> >> usage), then implement a
>> >> > function compValue and use a single rule,
>> e.g.:
>> >> > 
>> >> > rule x
>> >> > no-loop true
>> >> > when
>> >> >     $d : Decision()
>> >> >     Product( $id : id, $usage : usage )
>> >> > then
>> >> >     modify $d value to compValue( $id, $usage )
>> >> > end
>> >> > 
>> >> > Distinguishing all individual combinations of
>> id and
>> >> usage on the LHS
>> >> > seems
>> >> > excessive.
>> >> > 
>> >> > The ordering of CEs also affects execution
>> times.
>> >> > 
>> >> > -W
>> >> > 
>> >> > On 7/20/09, nesta <nesta.fdb at 163.com>
>> >> wrote:
>> >> >>
>> >> >>
>> >> >> In this scenario, there are 1000
>> products,
>> >> different product has
>> >> >> different
>> >> >> price, besides this, the price is
>> affected by
>> >> usage. I want to use
>> >> >> Product.id to match the rules.
>> >> >>
>> >> >> As you mentioned "The crude duplication
>> of rules
>> >> where only the constant
>> >> >> to
>> >> >> be matched with
>> >> >> Product.id varies can, most likely, be
>> avoided."
>> >> >>
>> >> >> How to avoid it in this scenario?
>> >> >>
>> >> >>
>> >> >> Wolfgang Laun-2 wrote:
>> >> >> >
>> >> >> > It's difficult to suggest an
>> optimized form
>> >> for your rules 1-infinity,
>> >> >> > since
>> >> >> > we do not know what you want to
>> achieve.
>> >> >> >
>> >> >> > The crude duplication of rules where
>> only the
>> >> constant to be matched
>> >> >> with
>> >> >> > Product.id varies can, most likely,
>> be
>> >> avoided.
>> >> >> >
>> >> >> > -W
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Sun, Jul 19, 2009 at 3:15 PM,
>> nesta <nesta.fdb at 163.com>
>> >> wrote:
>> >> >> >
>> >> >> >>
>> >> >> >> Hi,
>> >> >> >>
>> >> >> >> I am a newbie in drools. There
>> are a lot
>> >> of simple rules in a
>> >> >> scenario.
>> >> >> >> For example
>> >> >> >> rule 1
>> >> >> >>    when
>> >> >> >>        Product( id == 1, usage == 1)
>> >> >> >>        $decision : Decision()
>> >> >> >>    then
>> >> >> >>        $decision.setValue(1);
>> >> >> >> end
>> >> >> >>
>> >> >> >> rule 2
>> >> >> >> when Product( id == 2, usage == 1)
>> >> >> >>  $decision : Decision()
>> >> >> >> rule 3
>> >> >> >> when Product( id == 3, usage == 1)
>> >> >> >>  $decision : Decision()
>> >> >> >> rule 4
>> >> >> >> when Product( id == 4, usage == 1)
>> >> >> >>  $decision : Decision()
>> >> >> >> rule 5
>> >> >> >> when Product( id == 5, usage == 1)
>> >> >> >>  $decision : Decision()
>> >> >> >> ......
>> >> >> >>
>> >> >> >> I have a Product fact whose id =
>> 5 and
>> >> usage = 1, in my first
>> >> >> thinking,
>> >> >> >> only
>> >> >> >> rule 5 is matched, there should
>> be not
>> >> much more different between 1
>> >> >> rule
>> >> >> >> and a lot of rules in runtime.
>> >> >> >>
>> >> >> >> But the result shows that they
>> are
>> >> different. More rules will cost
>> >> >> more
>> >> >> >> time. If there are 1 thousand
>> rules, some
>> >> Node and Sink will execute 1
>> >> >> >> thousand times.
>> >> >> >>
>> >> >> >> My question is how to optimize
>> this
>> >> scenario?
>> >> >> >> --
>> >> >> >> View this message in context:
>> >> >> >>
>> >> >>
>> >>
>> http://www.nabble.com/optimization-on-a-lot-of-simple-rules-tp24556724p24556724.html
>> >> >> >> Sent from the drools - user
>> mailing list
>> >> archive at Nabble.com.
>> >> >> >>
>> >> >> >>
>> >> _______________________________________________
>> >> >> >> rules-users mailing list
>> >> >> >> rules-users at lists.jboss.org
>> >> >> >> https://lists.jboss.org/mailman/listinfo/rules-users
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >>
>> >> >> --
>> >> >> View this message in context:
>> >> >>
>> >>
>> http://www.nabble.com/optimization-on-a-lot-of-simple-rules-tp24556724p24563725.html
>> >> >> Sent from the drools - user mailing list
>> archive
>> >> at Nabble.com.
>> >> >>
>> >> >>
>> >> >>
>> >> > 
>> >> >
>> >> > 
>> >> > 
>> >> 
>> >> -- 
>> >> View this message in context:
>> >>
>> http://www.nabble.com/optimization-on-a-lot-of-simple-rules-tp24556724p24566350.html
>> >> Sent from the drools - user mailing list archive
>> at
>> >> Nabble.com.
>> >> 
>> >> 
>> > 
>> > 
>> >       
>> > 
>> > 
>> > 
>> 
>> -- 
>> View this message in context:
>> http://www.nabble.com/optimization-on-a-lot-of-simple-rules-tp24556724p24571875.html
>> Sent from the drools - user mailing list archive at
>> Nabble.com.
>> 
>> 
>> 
> 
> 
>       
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/optimization-on-a-lot-of-simple-rules-tp24556724p24580497.html
Sent from the drools - user mailing list archive at Nabble.com.




