[rules-users] Improving Drools Memory Performance

Wed Jul 21 23:05:48 EDT 2010

Hi Mark,

> EventTrigger ( text.onInput == onInput ) - is the text here the bound
> variable text, or a field on EventTrigger? The logic isn't very clear. What
> you have written in Java would look like
>     eventTrigger.getText().getOnInput().equals( eventTrigger )
>
> Is that what you were expecting? That's why we often use the $ prefix
> to differentiate fields from variables.
>

EventTrigger doesn't have a field "text", so I think it uses the bound
variable instead, and the constraint is actually evaluated as:

text.getOnInput().equals( eventTrigger );

I use variables very sparingly in my rules, but I didn't realise that using
'$' as a prefix was optional. So using '$onInput' rather than 'onInput'
would make no difference, except for readability?
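
To make that reading concrete, here is a minimal plain-Java sketch of what the constraint would check if "text" resolves to the bound variable. The two classes are hypothetical stand-ins for the EMF-generated types, and this only illustrates the interpretation above; it does not show how Drools resolves names internally.

```java
import java.util.Objects;

// Hypothetical stand-in for the EMF-generated EventTrigger class.
class EventTrigger { }

// Hypothetical stand-in for InputTextField, exposing only the onInput reference.
class InputTextField {
    private final EventTrigger onInput;

    InputTextField(EventTrigger onInput) {
        this.onInput = onInput;
    }

    EventTrigger getOnInput() {
        return onInput;
    }
}

class BindingSketch {
    // Plain-Java reading of the pattern
    //     onInput : EventTrigger ( text.onInput == onInput )
    // when "text" is the variable bound by an earlier pattern:
    // the candidate EventTrigger must be the one referenced by text.
    static boolean matches(InputTextField text, EventTrigger candidate) {
        return Objects.equals(text.getOnInput(), candidate);
    }
}
```

Under the other reading (a field "text" on EventTrigger), the check would start from the candidate itself, which is the ambiguity a '$' prefix on bound variables is meant to remove for the reader.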

Jevon

2010/7/22 Mark Proctor <mproctor at codehaus.org>

>  On 22/07/2010 03:28, Jevon Wright wrote:
>
> Hi Mark and Wolfgang,
>
> Thank you for your replies! Comments below.
>
> A bit of background: I am using Drools to take a given EMF model instance,
> and insert new EObjects into the instance, according to the given rules. I
> try to perform inference top-down, so there is more than one iteration of
> insertion - as objects are inserted, the rules need to be re-evaluated. If I
> understand correctly, this means that I can't use a stateless session or the
> sequential option, because the working memory is changing with inserted
> facts.
>
> The rules don't appear to insert directly, because I insert new objects
> into a queue instead (queue.add(object, drools)) - once rule evaluation is
> complete, I insert the contents of the queue into the existing working
> memory and fire all the rules again. I try to prevent the rules modifying
> the working memory directly. This is also why all the rules are of the
> format (x, ..., y, not z => insert z).
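
The queue-then-flush cycle described above can be sketched in plain Java, without any Drools API. Facts are plain strings here, and the names QueuedCompletion, Rule and maxPasses are assumptions made for this illustration. The pass limit is a simplification of the sketch, not the loop detection described in the paper.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.Set;

class QueuedCompletion {
    /** Hypothetical rule shape: read the facts, queue anything that is missing. */
    interface Rule {
        void apply(Set<String> facts, Queue<String> queue);
    }

    /**
     * Repeats "evaluate all rules, then flush the queue" until a pass adds
     * nothing new. Returns the number of passes, or throws once maxPasses
     * is exceeded, which is one crude way to flag a possible infinite loop.
     */
    static int run(Set<String> workingMemory, List<Rule> rules, int maxPasses) {
        for (int pass = 1; pass <= maxPasses; pass++) {
            Queue<String> queue = new ArrayDeque<>();
            Set<String> readOnly = Collections.unmodifiableSet(workingMemory);
            for (Rule rule : rules) {
                rule.apply(readOnly, queue); // rules never modify working memory
            }
            // Flush only after every rule has been evaluated.
            boolean changed = workingMemory.addAll(queue);
            if (!changed) {
                return pass;
            }
        }
        throw new IllegalStateException("no fixpoint after " + maxPasses + " passes");
    }
}
```

Because every rule in a pass sees the same read-only snapshot, the order in which the rules are evaluated does not change what ends up in the queue.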
>
> This approach has a number of benefits. It finds inconsistencies in the
> rules and means rules have no order, because inserted facts don't affect the
> working memory immediately. It also allows me to detect infinite loops,
> without restricting the number of times a rule can fire. This was described
> in our 2010 paper [1].
>
> I don't think my implementation of this approach is causing the memory
> problem, but I could be wrong.
>
>  detail : DetailWire ( (from == source && to == target) || (from == target
>> && to == source) )
>> The above is effectively turned into an MVEL statement; you might get
>> better performance with a ConditionalElement 'or', as long as the
>> two are mutually exclusive:
>>
>>  ( DetailWire (from == source, to == target ) or
>>    DetailWire (from == target, to == source) )
>>
>
> I thought this was the case. However in this case, you can't bind the
> variable "detail" (the Drools compiler won't accept the syntax), is this
> correct? I think one solution is to split the rule into two separate rules
> for each "or" part (thus a DSL) - I don't want to have to expand these rules
> by hand.
>
> ( $d : DetailWire (from == source, to == target ) or
>    $d : DetailWire (from == target, to == source) )
>
> is valid
>
>
> And then I'm not sure what it is you are doing in the second two rules, but
>> it looks wrong.
>> text : InputTextField ( eContainer == form, eval
>> (functions.getAutocompleteInputName(attribute).equals(name)) )
>> onInput : EventTrigger ( text.onInput == onInput )
>> currentInput : Property ( text.currentInput == currentInput )
>>
>
> The point of this rule is to select something like the following (from an
> EMF instance):
>
> <child name="form">
>   <child xsi:type="InputTextField" name="...">
>     <onInput xsi:type="EventTrigger" ... />
>     <currentInput xsi:type="Property" ... />
>     <events xsi:type="EventTrigger" ... />
>     <properties xsi:type="Property" ... />
>   </child>
> </child>
>
> EventTrigger ( text.onInput == onInput ) - is the text here the bound
> variable text, or a field on EventTrigger? The logic isn't very clear. What
> you have written in Java would look like
> eventTrigger.getText().getOnInput().equals( eventTrigger )
>
> Is that what you were expecting? That's why we often use the $ prefix to
> differentiate fields from variables.
>
>
> I can't use 'eContainer', because 'text' can also contain EventTriggers
> in 'text.events'. These bound variables are then supposed to be used later
> within the rule, either to select other variables, or as part of the created
> element.
>
> I am going to try to remove unused bound variables, though. I think I will
> try to write a script that analyses the exported XML for the rules
> automatically (I have 264 rules written by hand).
>
> Thanks
> Jevon
>
> [1]: J. Wright and J. Dietrich, "Non-Monotonic Model Completion in Web
> Application Engineering," in Proceedings of the 21st Australian Software
> Engineering Conference (ASWEC 2010) <http://aswec2010.massey.ac.nz/>,
> Auckland, New Zealand, 2010. http://openiaml.org/#completion
>
>  2010/7/16 Mark Proctor <mproctor at codehaus.org>
>
>>  detail : DetailWire ( (from == source && to == target) || (from == target && to == source) )
>> The above is effectively turned into an MVEL statement; you might get better performance with a ConditionalElement 'or', as long as the
>> two are mutually exclusive:
>>
>>  ( DetailWire (from == source, to == target ) or
>>    DetailWire (from == target, to == source) )
>>
>> I saw you did this:
>> not ( form : InputForm ( eContainer == container, name == iterator.name ) )
>>
>> The 'form' is not accessible outside the 'not', and that rule does not need it.
>>
>> Is this not a bug? You bind "text". And then I'm not sure what it is you are doing in the second two rules, but it looks wrong.
>> text : InputTextField ( eContainer == form, eval (functions.getAutocompleteInputName(attribute).equals(name)) )
>> onInput : EventTrigger ( text.onInput == onInput )
>> currentInput : Property ( text.currentInput == currentInput )
>>
>> It doesn't look like you are updating the session with facts, i.e. it's a stateless session. See if this helps
>>
>> KnowledgeBaseConfiguration kconf = KnowledgeBaseFactory.newKnowledgeBaseConfiguration();
>> kconf.setOption( SequentialOption.YES );
>>
>> KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase( kconf );
>> final StatelessKnowledgeSession ksession = kbase.newStatelessKnowledgeSession();
>> ksession.execute(....);
>>
>> In the execute you can provide it with a batch of commands to execute, or just a list of objects, up to you. See stateless session for
>> more details.
>>
>> The SequentialOption may help memory, a small amount, if you aren't doing any working memory modifications (insert/modify/update/retract).
>>
>> Mark
>>
>>
>> On 16/07/2010 04:16, Jevon Wright wrote:
>>
>> Hi again,
>>
>> By removing all of the simple eval()s from my rules, I have cut heap usage
>> by at least an order of magnitude. However this still isn't enough.
>>
>> Since I am trying to reduce the cross-product size (as in SQL), I recall
>> that most SQL implementations have a "DESCRIBE SELECT" query which provides
>> real-time information about the complexity of a given SQL query - i.e. the
>> size of the tables, indexes used, and so on. Is there any such tool
>> available for Drools? Are there any tools which can provide clues as to
>> which rules are using the most memory?
>>
>> Alternatively, I am wondering what kind of benefit I could expect from
>> using materialized views to create summary tables; that is, deriving and
>> inserting additional facts. This would allow Drools to rewrite queries that
>> currently use eval(), but would increase the size of working memory, so
>> would this actually save heap size?
>>
>> To what extent does Drools rewrite queries? Is there any documentation
>> describing the approaches used?
>>
>> Any other ideas on how to reduce heap memory usage? I'd appreciate any
>> ideas :)
>>
>> Thanks
>> Jevon
>>
>>
>> On Mon, Jul 12, 2010 at 5:56 PM, Jevon Wright <jevon at jevon.org> wrote:
>>
>>> Hi Wolfgang and Mark,
>>>
>>> Thank you for your replies! You were correct: my eval() functions
>>> could generally be rewritten into Drools directly.
>>>
>>> I had one function "connectsDetail" that was constraining
>>> unidirectional edges, and could be rewritten from:
>>>   detail : DetailWire ( )
>>>  eval ( functions.connectsDetail(detail, source, target) )
>>>
>>>  to:
>>>  detail : DetailWire ( from == source, to == target )
>>>
>>> Another function, "connects", was constraining bidirectional edges,
>>> and could be rewritten from:
>>>  sync : SyncWire( )
>>>  eval ( functions.connects(sync, source, target) )
>>>
>>> to:
>>>  sync : SyncWire( (from == source && to == target) || (from == target
>>> && to == source) )
>>>
>>> Finally, the "veto" function could be rewritten from:
>>>  detail : DetailWire ( )
>>>  eval ( handler.veto(detail) )
>>>
>>> to:
>>>  detail : DetailWire ( overridden == false )
>>>
>>> I took each of these three changes, and evaluated them separately [1].
>>> I found that:
>>>
>>> 1. Inlining 'connectsDetail' made a huge difference - 10-30% faster
>>> execution and 50-60% less allocated heap.
>>> 2. Inlining 'connects' made very little difference - 10-30% faster
>>> execution, but 0-20% more allocated heap.
>>> 3. Inlining 'veto' made no difference - no significant change in
>>> execution speed or allocated heap.
>>>
>>> I think I understand why inlining 'connects' would increase heap usage
>>> - because the rules essentially have more conditionals?
>>>
>>> I also understand why 'veto' made no difference - for most of my test
>>> models, "overridden" was never true, so adding this conditional was
>>> not making the cross product set any smaller.
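
As a rough illustration of the cross-product effect discussed here, the plain-Java sketch below counts how many candidate tuples are built when a wire is matched against every (source, target) pair and filtered afterwards, compared with using the wire's own from/to values to pick its endpoints. The class and method names are assumptions for this sketch, nodes are reduced to integers, and the counts are an analogy for join sizes, not a model of how the Rete network actually stores partial matches.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CrossProductSketch {
    /** Hypothetical directed edge between two node ids. */
    static final class Wire {
        final int from;
        final int to;

        Wire(int from, int to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Late filter: build every (source, target, wire) tuple, then test it. */
    static long[] lateFilter(List<Integer> nodes, List<Wire> wires) {
        long built = 0;
        long matched = 0;
        for (int source : nodes) {
            for (int target : nodes) {
                for (Wire wire : wires) {
                    built++;
                    if (wire.from == source && wire.to == target) {
                        matched++;
                    }
                }
            }
        }
        return new long[] { built, matched };
    }

    /** Join constraint: each wire looks up its own endpoints directly. */
    static long[] joinConstraint(List<Integer> nodes, List<Wire> wires) {
        Set<Integer> known = new HashSet<>(nodes);
        long built = 0;
        long matched = 0;
        for (Wire wire : wires) {
            if (known.contains(wire.from) && known.contains(wire.to)) {
                built++;
                matched++;
            }
        }
        return new long[] { built, matched };
    }
}
```

Both methods find the same matches when node ids are distinct; only the number of intermediate tuples differs. A filter that is almost never false, like the 'veto' check on these test models, removes very few tuples either way.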
>>>
>>> Finally, I also tested simply joining all of the rules together into
>>> one file. This happily made no difference at all (although made it
>>> more difficult to edit).
>>>
>>> So I think I can safely conclude that eval() should be used as little
>>> as possible - however, this means that the final rules are made more
>>> complicated and less human-readable, so a DSL may be best for my
>>> common rule patterns in the future.
>>>
>>> Thanks again!
>>> Jevon
>>>
>>> [1]: http://www.jevon.org/wiki/Improving_Drools_Memory_Performance
>>>
>>> On Sat, Jul 10, 2010 at 12:28 AM, Wolfgang Laun <wolfgang.laun at gmail.com>
>>> wrote:
>>> > On 9 July 2010 14:14, Mark Proctor <mproctor at codehaus.org> wrote:
>>> >>  You have many objects there that are not constrained;
>>> >
>>> > I have an inkling that the functions.*() are hiding just these constraints.
>>> > It's certainly the wrong way, starting with oodles of node pairs, just to
>>> > pick out connected ones by fishing for the connecting edge. And this
>>> > is worsened by trying to find two such pairs which meet at some
>>> > DomainSource.
>>> >
>>> > Guesswork, hopefully educated ;-)
>>> >
>>> > -W
>>> >
>>> >
>>> >> if there are
>>> >> multiple versions of those objects you are going to get massive amounts
>>> >> of cross products. Think in terms of SQL, each pattern you add is like
>>> >> an SQL join.
>>> >>
>>> >> Mark
>>> >> On 09/07/2010 09:20, Jevon Wright wrote:
>>> >>> Hi everyone,
>>> >>>
>>> >>> I am working on what appears to be a fairly complex rule base based on
>>> >>> EMF. The rules aren't operating over a huge number of facts (less than
>>> >>> 10,000 EObjects) and there aren't too many rules (less than 300), but
>>> >>> I am having a problem with running out of Java heap space (set at ~400
>>> >>> MB).
>>> >>>
>>> >>> Through investigation, I came to the conclusion that this is due to
>>> >>> the design of the rules, rather than the number of facts. The engine
>>> >>> uses less memory inserting many facts that use simple rules, compared
>>> >>> with inserting few facts that use many rules.
>>> >>>
>>> >>> Can anybody suggest some tips for reducing heap memory usage in
>>> >>> Drools? I don't have a time constraint, only a heap/memory constraint.
>>> >>> A sample rule in my project looks like this:
>>> >>>
>>> >>>    rule "Create QueryParameter for target container of DetailWire"
>>> >>>      when
>>> >>>        container : Frame( )
>>> >>>        schema : DomainSchema ( )
>>> >>>        domainSource : DomainSource ( )
>>> >>>        instance : DomainIterator( )
>>> >>>        selectEdge : SelectEdge ( eval ( functions.connectsSelect(selectEdge, instance, domainSource )) )
>>> >>>        schemaEdge : SchemaEdge ( eval ( functions.connectsSchema(schemaEdge, domainSource, schema )) )
>>> >>>        source : VisibleThing ( eContainer == container )
>>> >>>        target : Frame ( )
>>> >>>        instanceSet : SetWire ( eval(functions.connectsSet(instanceSet, instance, source )) )
>>> >>>        detail : DetailWire ( )
>>> >>>        eval ( functions.connectsDetail(detail, source, target ))
>>> >>>        pk : DomainAttribute ( eContainer == schema, primaryKey == true )
>>> >>>        not ( queryPk : QueryParameter ( eContainer == target, name == pk.name ) )
>>> >>>        eval ( handler.veto( detail ))
>>> >>>
>>> >>>      then
>>> >>>        QueryParameter qp = handler.generatedQueryParameter(detail, target);
>>> >>>        handler.setName(qp, pk.getName());
>>> >>>        queue.add(qp, drools); // wraps insert(...)
>>> >>>
>>> >>>    end
>>> >>>
>>> >>> I try to order the select statements in an order that will reduce the
>>> >>> size of the cross-product (in theory), but I also try to keep the
>>> >>> rules fairly human-readable. I try to avoid comparison operators like
>>> >>> < and >. Analysing a heap dump shows that most of the memory is being
>>> >>> used in StatefulSession.nodeMemories > PrimitiveLongMap.
>>> >>>
>>> >>> I am using a StatefulSession; if I understand correctly, I can't use a
>>> >>> StatelessSession with sequential mode since I am inserting facts as
>>> >>> part of the rules. If I also understand correctly, I'd like the Rete
>>> >>> graph to be tall, rather than wide.
>>> >>>
>>> >>> Some ideas I have thought of include the following:
>>> >>> 1. Creating a separate intermediary meta-model to split up the sizes
>>> >>> of the rules. e.g. instead of (if A and B and C then insert D), using
>>> >>> (if A and B then insert E; if E and C then insert D).
>>> >>> 2. Moving eval() statements directly into the Type(...) selectors.
>>> >>> 3. Removing eval() statements. Would this allow for better indexing by
>>> >>> the Rete algorithm?
>>> >>> 4. Reducing the height, or the width, of the class hierarchy of the
>>> >>> facts. e.g. Removing interfaces or abstract classes to reduce the
>>> >>> possible matches. Would this make a difference?
>>> >>> 5. Conversely, increasing the height, or the width, of the class
>>> >>> hierarchy. e.g. Adding interfaces or abstract classes to reduce field
>>> >>> accessors.
>>> >>> 6. Instead of using EObject.eContainer, creating an explicit
>>> >>> containment property in all of my EObjects.
>>> >>> 7. Creating a DSL that is human-readable, but allows for the
>>> >>> automation of some of these approaches.
>>> >>> 8. Moving all rules into one rule file, or splitting up rules into
>>> >>> smaller files.
>>> >>>
>>> >>> Is there any kind of profiler for Drools that will let me see the size (or
>>> >>> the memory usage) of particular rules, or of the memory used after
>>> >>> inference? Ideally I'd use this to profile any changes.
>>> >>>
>>> >>> Thanks for any thoughts or tips! :-)
>>> >>>
>>> >>> Jevon
>>> >>> _______________________________________________
>>> >>> rules-users mailing list
>>> >>> rules-users at lists.jboss.org
>>> >>> https://lists.jboss.org/mailman/listinfo/rules-users