[rules-users] rules-users Digest, Vol 63, Issue 60

Welsh, Armand AWelsh at StateStreet.com
Mon Feb 13 20:57:27 EST 2012


It almost seems as though the collect acts as a kind of sub-rule that fires on every insert and builds a temporary fact collection from the results. Even though you insert all the facts before calling execute on the stateless session, each individual fact would then be evaluated by the collect sub-rule to rebuild the collection, and the resulting collection would be fed into the actual rule as a temporary fact unique to that rule. Perhaps this is the source of the n-squared behavior.

Have you tried a more literal approach like this, based on your comment that you are looking for the minimum?
(This assumes Post has a property called value whose minimum you want.)

rule "minimum"
when
	Post($v: value)
	not Post(value < $v)
then
	System.out.println("Minimum value: " + $v);
end

rule "maximum"
when
	Post($v: value)
	not Post(value > $v)
then
	System.out.println("Maximum value: " + $v);
end


Of course, this has no direct relation to what you are trying to do; rather, I am focusing on your comment that you are trying to compute the minimum of a large collection of objects.

I wonder if accumulate might do a better job of solving your original problem, since accumulate lets you calculate min, max, average, count, and sum, and return collections of data as a List or Set.

Take a look at section 5.8.3.6.4.1 of the Drools Expert manual for details on use, and 5.8.3.6.4.2 for details on creating your own functions which could do all of these in one, which would greatly reduce the number of iterations through your source collection.
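
To illustrate the single-pass idea in plain Java (outside Drools), here is a minimal sketch. The PostStats class and its method names are hypothetical and do not implement the Drools accumulate function API; the sketch only shows that min, max, count, sum, and average can all come out of one iteration over the values.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical single-pass accumulator; NOT the Drools accumulate function API. */
final class PostStats {
    private long count;
    private long sum;
    private int min = Integer.MAX_VALUE;
    private int max = Integer.MIN_VALUE;

    /** Folds one value into the running totals. */
    void accumulate(int value) {
        count++;
        sum += value;
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    long count() { return count; }

    long sum() { return sum; }

    int min() {
        requireNonEmpty();
        return min;
    }

    int max() {
        requireNonEmpty();
        return max;
    }

    double average() {
        requireNonEmpty();
        return (double) sum / count;
    }

    private void requireNonEmpty() {
        if (count == 0) {
            throw new IllegalStateException("no values accumulated");
        }
    }

    /** Builds the statistics with exactly one iteration over the input. */
    static PostStats of(List<Integer> values) {
        PostStats stats = new PostStats();
        for (int v : values) {
            stats.accumulate(v);
        }
        return stats;
    }

    public static void main(String[] args) {
        PostStats s = PostStats.of(Arrays.asList(7, 3, 9, 3, 12));
        System.out.println("min=" + s.min() + " max=" + s.max()
                + " count=" + s.count() + " sum=" + s.sum()
                + " avg=" + s.average());
    }
}
```

A custom accumulate function would presumably keep the same kind of running state, so the source collection is walked once instead of once per statistic.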

-----Original Message-----
From: rules-users-bounces at lists.jboss.org [mailto:rules-users-bounces at lists.jboss.org] On Behalf Of Shur, Bob
Sent: Monday, February 13, 2012 3:06 PM
To: rules-users at lists.jboss.org
Subject: Re: [rules-users] rules-users Digest, Vol 63, Issue 60

I don't understand either of these answers. All of the facts are in a list of facts passed into ksession.execute, like this:

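	        // The 'break' below implies an enclosing loop that is not shown here.
	        final StatelessKnowledgeSession ksession = kbase.newStatelessKnowledgeSession();
	        List<Object> facts = createPosts();
	        if (facts.isEmpty()) break;
	        long t0 = System.currentTimeMillis();
	        ksession.execute(facts);   // inserts every fact, then fires all rules
	        long t1 = System.currentTimeMillis();
	        // Report elapsed time in tenths of a second.
	        System.out.println("Elapsed time: " + (t1 - t0) / 100);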

There are no more facts getting created during rules processing. The whole drl file is:

rule "collect"
when
	$a : ArrayList(size > 0) from collect(Post());
then
	System.out.println("Number of posts: " + $a.size());
	System.out.println("DONE");
end

Why is the collect happening more than once? Actually I don't think it is, else I would be seeing more than one printout.

I agree that I could probably find a way to do without the list, but I would still like to understand why this is n-squared. I was actually planning to explore the performance of various ways to compute the minimum of a large number of objects. I was surprised to find that I was already up to n-squared just creating the list.
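
As a rough check of the quadratic claim, the timings reported in the original post (4, 14, 56, 220, 852 tenths of a second for 4000 to 64000 posts) can be turned into a growth exponent: if the time grows like n^k and n doubles, then k is log2 of the ratio of consecutive timings. The GrowthCheck class below is a hypothetical helper written for this illustration only.

```java
/** Hypothetical helper: estimates k in t ~ n^k from timings taken at n and 2n. */
final class GrowthCheck {

    /** Returns log2(tLarge / tSmall), the exponent implied by one doubling of n. */
    static double exponentForDoubling(double tSmall, double tLarge) {
        return Math.log(tLarge / tSmall) / Math.log(2.0);
    }

    public static void main(String[] args) {
        // Figures copied from the original post; each batch is twice the previous one.
        int[] posts = {4000, 8000, 16000, 32000, 64000};
        double[] tenthsOfSecond = {4, 14, 56, 220, 852};

        for (int i = 1; i < tenthsOfSecond.length; i++) {
            double ratio = tenthsOfSecond[i] / tenthsOfSecond[i - 1];
            double k = exponentForDoubling(tenthsOfSecond[i - 1], tenthsOfSecond[i]);
            System.out.printf("%d -> %d posts: ratio %.2f, exponent %.2f%n",
                    posts[i - 1], posts[i], ratio, k);
        }
    }
}
```

The exponents come out between roughly 1.8 and 2.0, which is consistent with quadratic growth, although five coarse timings are not enough to rule out other explanations.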

> ------------------------------
> 
> Message: 3
> Date: Mon, 13 Feb 2012 20:59:20 +0100
> From: Wolfgang Laun <wolfgang.laun at gmail.com>
> Subject: Re: [rules-users] Performance of collect seems to be n-squared
> To: Rules Users List <rules-users at lists.jboss.org>
> Message-ID:
> 	<CANaj1LeAnD-kYngQqTzo1XoOh4V8tnqo5A+9CkVR6xPmK4gxMA at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
> 
> Repeatedly creating this ArrayList is bound to be O(n^2).
> 
> Why do you want the Post facts as a List? There may be better ways for
> achieving your goal.
> 
> -W
> 
> 
> On 13 February 2012 20:49, Shur, Bob <robert.shur at hp.com> wrote:
> 
> > I have a simple class Post { int id; ... }. I insert some number of them
> > as facts and then call ksession.execute(facts);
> >
> > The whole drl file (except for package and import statements) is:
> >
> > rule "collect"
> > when
> >        $a : ArrayList(size > 0) from collect(Post());
> > then
> >        System.out.println("Number of posts: " + $a.size());
> >        System.out.println("DONE");
> > end
> >
> > The time it takes to run is n-squared in the number of posts. For 4000,
> > 8000, 16000, 32000, 64000 posts the time is (in 1/10 seconds):
> >
> > 4, 14, 56, 220, 852
> >
> > Is this expected? Is there a better way for me to do the collect?
> >
> >
> > _______________________________________________
> > rules-users mailing list
> > rules-users at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/rules-users
> >
> 
> ------------------------------
> 
> Message: 4
> Date: Mon, 13 Feb 2012 15:12:12 -0500
> From: Edson Tirelli <ed.tirelli at gmail.com>
> Subject: Re: [rules-users] Performance of collect seems to be n-squared
> To: Rules Users List <rules-users at lists.jboss.org>
> Message-ID:
> 	<CAD7AJnf9mZcm77o3uf+EbigEVZzpgo5UJkKeyR0M0r1f3io1gw at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
> 
>    Just FYI, the array list is created once and reused with incremental
> addition/removal of elements.
> 
>    The problem is that the engine does not know when you are done inserting
> all the Post facts, so each post inserted is propagated down the network
> with the cancellation and reactivation of the rule.
> 
>    This is one of those situations where a use of a control fact might be
> recommended, if you are inserting Post facts in batches. E.g.:
> 
> rule X
> when
>      exists( AllPostsInserted() )
>      $a : ArrayList(size > 0) from collect(Post());
> then
> ...
> end
> 
>    Edson
> 
> 
**************************




