[rules-users] Object size affect on session insertion performance

Mark Proctor mproctor at codehaus.org
Sun Feb 23 11:25:50 EST 2014


We need to check where the 4999 are coming from. It makes sense that there are 5000 hashCode calls on object insertion, but unless i’m missing something there should be just one hashCode call for the accumulate node.

Line 692 should be called just once, per left input propagation for the accumulate per fireAllRules call:
https://github.com/droolsjbpm/drools/blob/master/drools-core/src/main/java/org/drools/core/phreak/PhreakAccumulateNode.java#L692

If line 692 is called just once, a profiler will highlight where else hashCode is being called. We can then investigate if it’s necessary, or if it can be avoided.

Mark

On 23 Feb 2014, at 15:03, Wolfgang Laun <wolfgang.laun at gmail.com> wrote:

> Using OPs rule:
> 
> 5.5.0
> 5000 facts, hashCode computed 12507500 times
> 
> 6.0.0
> 5000 facts, hashCode computed 10000 times
> 
> After 5000 insertions and before fireAllRules, there were 5000
> computations of hashCode. After the Engine is started, another 5000
> computations occur.
> This might bear investigation, although O(n) ist much better than O(n^2).
> 
> -W
> 
> 
> On 23/02/2014, Mark Proctor <mproctor at codehaus.org> wrote:
>> Please do report back to the list if 6.0 alleviates this problem, without
>> needing a blocking fact.
>> 
>> Mark
>> On 23 Feb 2014, at 13:30, Mark Proctor <mproctor at codehaus.org> wrote:
>> 
>>> "I have seen a few related posts on what I have found, the #hashCode of
>>> my
>>> ComplexClass is taking nearly all of the time.
>>> Upon further investigation, I found that this is happening because every
>>> insert to the session is causing the `result` portion of the accumulate
>>> to
>>> "recalculate".  "
>>> 
>>> There isn't much we can do about your has code performance time. But if
>>> you upgrade to 6.0, the batch oriented propagation will reduce how often
>>> this is done, and hopefully minimise the difference. You won't then need a
>>> blocking fact.
>>> 
>>> Mark
>>> On 23 Feb 2014, at 03:38, mikerod <mjr4184 at gmail.com> wrote:
>>> 
>>>> @laune
>>>> 
>>>> You are correct that I actually put an incorrect time up before.  Thanks
>>>> for
>>>> pointing that out and sorry for the confusion.
>>>> The behavioral difference I have found was actually much large between
>>>> the 2
>>>> classes, SimpleClass and ComplexClass, than I originally
>>>> thought.
>>>> 
>>>> The SimpleClass accumulate is very quick, around ~300 ms.  The
>>>> ComplexClass
>>>> accumulate (with the exact same rule beyond the object type)
>>>> spikes to around ~140 *seconds*.  In both cases this is with either 5K
>>>> objects of one type inserted.
>>>> 
>>>> @Mark
>>>> 
>>>> I am sure that it is not the object creation time.  I create all of the
>>>> objects before the timers and they are not lazy in any initialization.
>>>> However, you were right that I needed to run some profiling on this to
>>>> dig
>>>> into the real issue.
>>>> 
>>>> To start off, the culprit for this issue is the accumulate.  A rule
>>>> without
>>>> it like:
>>>> ```
>>>> rule "not collecting"
>>>> 	when
>>>> 	ComplexClass() ; swap for SimpleClass on another run
>>>> 	then
>>>> 	System.out.println("Done");
>>>> end
>>>> 
>>>> ```
>>>> runs about the same, no matter if it has a SimpleClass or ComplexClass.
>>>> 
>>>> Also, I'd like to just clarify, a SimpleClass here is just a class with
>>>> 2
>>>> Integer fields (for this example).
>>>> However, the ComplexClass has around 15 fields and about half of these
>>>> are
>>>> Collections (aggregate) types with more nested classes underneath.
>>>> This is the difference I mean between "simple" and "complex" in a class;
>>>> if
>>>> that wasn't clear before.
>>>> 
>>>> Furthermore, there is only a single rule with only this very simple,
>>>> contrived LHS logic in my example.  Drools is not needing
>>>> to traverse any of the objects and no additional work is done.  This is
>>>> purely just a single rule being evaluated during an insert.
>>>> This is Drools v5.5.0.Final in this specific example (sorry for not
>>>> mentioning that before).
>>>> 
>>>> ---
>>>> 
>>>> I have seen a few related posts on what I have found, the #hashCode of
>>>> my
>>>> ComplexClass is taking nearly all of the time.
>>>> Upon further investigation, I found that this is happening because every
>>>> insert to the session is causing the `result` portion of the accumulate
>>>> to
>>>> "recalculate".  During this step, the AccumulateContext `result`
>>>> RightTuple
>>>> is having its FactHandle reset to the newly calculated result.
>>>> This calls the #hashCode of the Collection that is holding all of the
>>>> current ComplexClass object instances; and Collection calls the #hashCode
>>>> of
>>>> each of these (in j.u.Collection impl's such as j.u.AbstractList and
>>>> j.u.AbstractSet).
>>>> 
>>>> So, I have a Collection, that is increasingly growing with ComplexClass
>>>> object instances, and each time it grows by one, the #hashCode of the
>>>> entire
>>>> Collection of ComplexClass objects is being calculated.
>>>> 
>>>> The ComplexClass #hashCode is an aggregate of a recursive walk along
>>>> across
>>>> all of the objects' #hashCode it reaches through its fields, just like
>>>> many
>>>> aggregate types.  I think I can see that this could be expensive if this
>>>> is
>>>> being calculated for nearly 5K objects as each of the final objects are
>>>> inserted causing the `result` recalculation.
>>>> 
>>>> ---
>>>> 
>>>> I do realize that one potential workaround would be to put a blocking
>>>> constraint above the accumulate:
>>>> ```
>>>> rule "collecting"
>>>> 	when
>>>> 	BlockingFactClass()
>>>> 	$res : Collection() from
>>>> 			accumulate( $s : ComplexClass(),
>>>> 				init( Collection c = new ArrayList(); ),
>>>> 				action( c.add($s); ),
>>>> 				result( c ))
>>>> 				
>>>> 	then
>>>> 	System.out.println("Done");
>>>> end
>>>> ```
>>>> where the BlockingFactClass is not inserted until *after* all of the
>>>> ComplexClass objects.  This speeds up the performance significantly; the
>>>> time is nearly the same as the SimpleClass run actually.
>>>> 
>>>> ---
>>>> 
>>>> I found that this was an interesting discovery and I did not expect this
>>>> behavior.
>>>> 
>>>> So @Mark it does seem (to me) that a deeply nested ComplexClass can hurt
>>>> performance on an AccumulateNode when the `result` can be repeatedly
>>>> calculated; even when the `result` is not "doing" anything besides
>>>> returning
>>>> what has been accumulated/collected.  I understand this is probably just
>>>> a
>>>> "gotcha" that I have to deal with.  This behavior is also the same for
>>>> the
>>>> Drools `collect` functionality, which I think just uses accumulate in
>>>> the
>>>> impl anyways (perhaps I'm incorrect).  Also, I note that this isn't
>>>> necessary a direct "Object size affecting session insertion performance",
>>>> as
>>>> I originally titled this thread.
>>>> 
>>>> I also think that the new Phreak-based impl for Drools in v6.x may not
>>>> behave like this anymore, since it is more lazy and delays work more
>>>> until
>>>> firing rules (an assumption here; haven't tested that).
>>>> 
>>>> With that said, I'm open to anymore suggestions about how to avoid this
>>>> issue in pre-Phreak Drools (v6.x)
>>>> (I am not sure how long until I am able to make that jump in version.).
>>>> 
>>>> 
>>>> Also, I'm open to be corrected if my findings are incorrect/incomplete.
>>>> :)
>>>> 
>>>> Thanks again for the feedback!  It is helpful.
>>>> 
>>>> 
>>>> 
>>>> --
>>>> View this message in context:
>>>> http://drools.46999.n3.nabble.com/Object-size-impact-on-session-insertion-performance-tp4028244p4028251.html
>>>> Sent from the Drools: User forum mailing list archive at Nabble.com.
>>>> _______________________________________________
>>>> rules-users mailing list
>>>> rules-users at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/rules-users
>>> 
>> 
>> 
>> _______________________________________________
>> rules-users mailing list
>> rules-users at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/rules-users
>> 
> _______________________________________________
> rules-users mailing list
> rules-users at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/rules-users




More information about the rules-users mailing list