@Mark: Note that I've been counting the FactClass::hashCode calls
only, not hashCode calls of all Objects. So you're hunting 5000 calls
but that extra one
won't spoil your day ;-)
-W
On 23/02/2014, Mark Proctor <mproctor(a)codehaus.org> wrote:
We need to check where the 4999 are coming from. It makes sense that
there
are 5000 hashCode calls on object insertion, but unless i'm missing
something there should be just one hashCode call for the accumulate node.
Line 692 should be called just once, per left input propagation for the
accumulate per fireAllRules call:
https://github.com/droolsjbpm/drools/blob/master/drools-core/src/main/jav...
If line 692 is called just once, a profiler will highlight where else
hashCode is being called. We can then investigate if it's necessary, or if
it can be avoided.
Mark
On 23 Feb 2014, at 15:03, Wolfgang Laun <wolfgang.laun(a)gmail.com> wrote:
> Using OPs rule:
>
> 5.5.0
> 5000 facts, hashCode computed 12507500 times
>
> 6.0.0
> 5000 facts, hashCode computed 10000 times
>
> After 5000 insertions and before fireAllRules, there were 5000
> computations of hashCode. After the Engine is started, another 5000
> computations occur.
> This might bear investigation, although O(n) ist much better than O(n^2).
>
> -W
>
>
> On 23/02/2014, Mark Proctor <mproctor(a)codehaus.org> wrote:
>> Please do report back to the list if 6.0 alleviates this problem,
>> without
>> needing a blocking fact.
>>
>> Mark
>> On 23 Feb 2014, at 13:30, Mark Proctor <mproctor(a)codehaus.org> wrote:
>>
>>> "I have seen a few related posts on what I have found, the #hashCode of
>>> my
>>> ComplexClass is taking nearly all of the time.
>>> Upon further investigation, I found that this is happening because
>>> every
>>> insert to the session is causing the `result` portion of the accumulate
>>> to
>>> "recalculate". "
>>>
>>> There isn't much we can do about your has code performance time. But if
>>> you upgrade to 6.0, the batch oriented propagation will reduce how
>>> often
>>> this is done, and hopefully minimise the difference. You won't then need
>>> a
>>> blocking fact.
>>>
>>> Mark
>>> On 23 Feb 2014, at 03:38, mikerod <mjr4184(a)gmail.com> wrote:
>>>
>>>> @laune
>>>>
>>>> You are correct that I actually put an incorrect time up before.
>>>> Thanks
>>>> for
>>>> pointing that out and sorry for the confusion.
>>>> The behavioral difference I have found was actually much large between
>>>> the 2
>>>> classes, SimpleClass and ComplexClass, than I originally
>>>> thought.
>>>>
>>>> The SimpleClass accumulate is very quick, around ~300 ms. The
>>>> ComplexClass
>>>> accumulate (with the exact same rule beyond the object type)
>>>> spikes to around ~140 *seconds*. In both cases this is with either 5K
>>>> objects of one type inserted.
>>>>
>>>> @Mark
>>>>
>>>> I am sure that it is not the object creation time. I create all of
>>>> the
>>>> objects before the timers and they are not lazy in any initialization.
>>>> However, you were right that I needed to run some profiling on this to
>>>> dig
>>>> into the real issue.
>>>>
>>>> To start off, the culprit for this issue is the accumulate. A rule
>>>> without
>>>> it like:
>>>> ```
>>>> rule "not collecting"
>>>> when
>>>> ComplexClass() ; swap for SimpleClass on another run
>>>> then
>>>> System.out.println("Done");
>>>> end
>>>>
>>>> ```
>>>> runs about the same, no matter if it has a SimpleClass or
>>>> ComplexClass.
>>>>
>>>> Also, I'd like to just clarify, a SimpleClass here is just a class
>>>> with
>>>> 2
>>>> Integer fields (for this example).
>>>> However, the ComplexClass has around 15 fields and about half of these
>>>> are
>>>> Collections (aggregate) types with more nested classes underneath.
>>>> This is the difference I mean between "simple" and
"complex" in a
>>>> class;
>>>> if
>>>> that wasn't clear before.
>>>>
>>>> Furthermore, there is only a single rule with only this very simple,
>>>> contrived LHS logic in my example. Drools is not needing
>>>> to traverse any of the objects and no additional work is done. This
>>>> is
>>>> purely just a single rule being evaluated during an insert.
>>>> This is Drools v5.5.0.Final in this specific example (sorry for not
>>>> mentioning that before).
>>>>
>>>> ---
>>>>
>>>> I have seen a few related posts on what I have found, the #hashCode of
>>>> my
>>>> ComplexClass is taking nearly all of the time.
>>>> Upon further investigation, I found that this is happening because
>>>> every
>>>> insert to the session is causing the `result` portion of the
>>>> accumulate
>>>> to
>>>> "recalculate". During this step, the AccumulateContext
`result`
>>>> RightTuple
>>>> is having its FactHandle reset to the newly calculated result.
>>>> This calls the #hashCode of the Collection that is holding all of the
>>>> current ComplexClass object instances; and Collection calls the
>>>> #hashCode
>>>> of
>>>> each of these (in j.u.Collection impl's such as j.u.AbstractList and
>>>> j.u.AbstractSet).
>>>>
>>>> So, I have a Collection, that is increasingly growing with
>>>> ComplexClass
>>>> object instances, and each time it grows by one, the #hashCode of the
>>>> entire
>>>> Collection of ComplexClass objects is being calculated.
>>>>
>>>> The ComplexClass #hashCode is an aggregate of a recursive walk along
>>>> across
>>>> all of the objects' #hashCode it reaches through its fields, just
like
>>>> many
>>>> aggregate types. I think I can see that this could be expensive if
>>>> this
>>>> is
>>>> being calculated for nearly 5K objects as each of the final objects
>>>> are
>>>> inserted causing the `result` recalculation.
>>>>
>>>> ---
>>>>
>>>> I do realize that one potential workaround would be to put a blocking
>>>> constraint above the accumulate:
>>>> ```
>>>> rule "collecting"
>>>> when
>>>> BlockingFactClass()
>>>> $res : Collection() from
>>>> accumulate( $s : ComplexClass(),
>>>> init( Collection c = new ArrayList(); ),
>>>> action( c.add($s); ),
>>>> result( c ))
>>>>
>>>> then
>>>> System.out.println("Done");
>>>> end
>>>> ```
>>>> where the BlockingFactClass is not inserted until *after* all of the
>>>> ComplexClass objects. This speeds up the performance significantly;
>>>> the
>>>> time is nearly the same as the SimpleClass run actually.
>>>>
>>>> ---
>>>>
>>>> I found that this was an interesting discovery and I did not expect
>>>> this
>>>> behavior.
>>>>
>>>> So @Mark it does seem (to me) that a deeply nested ComplexClass can
>>>> hurt
>>>> performance on an AccumulateNode when the `result` can be repeatedly
>>>> calculated; even when the `result` is not "doing" anything
besides
>>>> returning
>>>> what has been accumulated/collected. I understand this is probably
>>>> just
>>>> a
>>>> "gotcha" that I have to deal with. This behavior is also the
same for
>>>> the
>>>> Drools `collect` functionality, which I think just uses accumulate in
>>>> the
>>>> impl anyways (perhaps I'm incorrect). Also, I note that this
isn't
>>>> necessary a direct "Object size affecting session insertion
>>>> performance",
>>>> as
>>>> I originally titled this thread.
>>>>
>>>> I also think that the new Phreak-based impl for Drools in v6.x may not
>>>> behave like this anymore, since it is more lazy and delays work more
>>>> until
>>>> firing rules (an assumption here; haven't tested that).
>>>>
>>>> With that said, I'm open to anymore suggestions about how to avoid
>>>> this
>>>> issue in pre-Phreak Drools (v6.x)
>>>> (I am not sure how long until I am able to make that jump in
>>>> version.).
>>>>
>>>>
>>>> Also, I'm open to be corrected if my findings are
>>>> incorrect/incomplete.
>>>> :)
>>>>
>>>> Thanks again for the feedback! It is helpful.
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>>
http://drools.46999.n3.nabble.com/Object-size-impact-on-session-insertion...
>>>> Sent from the Drools: User forum mailing list archive at
Nabble.com.
>>>> _______________________________________________
>>>> rules-users mailing list
>>>> rules-users(a)lists.jboss.org
>>>>
https://lists.jboss.org/mailman/listinfo/rules-users
>>>
>>
>>
>> _______________________________________________
>> rules-users mailing list
>> rules-users(a)lists.jboss.org
>>
https://lists.jboss.org/mailman/listinfo/rules-users
>>
> _______________________________________________
> rules-users mailing list
> rules-users(a)lists.jboss.org
>
https://lists.jboss.org/mailman/listinfo/rules-users
_______________________________________________
rules-users mailing list
rules-users(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users