[rules-users] Improving Drools Memory Performance

Thu Jul 15 23:33:55 EDT 2010

  On 16/07/2010 04:16, Jevon Wright wrote:
> Hi again,
>
> By removing all of the simple eval()s from my rules, I have cut heap
> usage by at least an order of magnitude. However, this still isn't enough.
>
> Since I am trying to reduce the cross-product size (as in SQL), I 
> recall that most SQL implementations have a "DESCRIBE SELECT" query 
> which provides real-time information about the complexity of a given 
> SQL query - i.e. the size of the tables, indexes used, and so on. Is 
> there any such tool available for Drools? Are there any tools which 
> can provide clues as to which rules are using the most memory?
No, but it sounds like a nice side project :)
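
As a rough sketch of what such a tool might report (this is not an existing Drools feature; the class name, method name and fact counts below are all made up for illustration), one could start from the worst case: if no constraint removes any combination, the number of partial matches after each pattern is the running product of the fact counts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Drools: worst-case join sizes for one rule. */
final class JoinSizeEstimate {

    /**
     * Returns, for each pattern position, the worst-case number of partial
     * matches held after joining that pattern, assuming no constraint
     * removes any combination (a pure cross product).
     */
    static List<Long> worstCase(List<String> patternTypes, Map<String, Long> factCounts) {
        List<Long> sizes = new ArrayList<>();
        long running = 1;
        for (String type : patternTypes) {
            long count = factCounts.getOrDefault(type, 0L);
            running = Math.multiplyExact(running, count); // throws on overflow
            sizes.add(running);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Illustrative fact counts only; not measured from any real model.
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("Frame", 50L);
        counts.put("DomainSchema", 10L);
        counts.put("DetailWire", 200L);

        List<String> patterns = List.of("Frame", "DomainSchema", "DetailWire");
        System.out.println(worstCase(patterns, counts)); // prints [50, 500, 100000]
    }
}
```

This is only an upper bound. Real constraints shrink each step, so a useful tool would also need the observed selectivity of every join.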
>
> Alternatively, I am wondering what kind of benefit I could expect from 
> using materialized views to create summary tables; that is, deriving 
> and inserting additional facts. This would allow Drools to rewrite 
> queries that currently use eval(), but would increase the size of 
> working memory, so would this actually save heap size?
>
> To what extent does Drools rewrite queries? Is there any documentation 
> describing the approaches used?
The only rewriting we do is on literal constraints: the == ones are
moved to the front. That's it. Also make sure you are using trunk, which
is more memory-efficient for modifies:
https://hudson.jboss.org/hudson/job/drools/lastSuccessfulBuild/artifact/trunk/target/
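
To illustrate the kind of reordering described above, here is a minimal sketch of a stable "equality first" sort over literal constraints. It is not the Drools implementation; the Constraint class, the reorder method and the "length" field are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; these are not Drools classes. */
final class EqualityFirst {

    /** One literal constraint, such as primaryKey == true. */
    static final class Constraint {
        final String field;
        final String operator;
        final String literal;

        Constraint(String field, String operator, String literal) {
            this.field = field;
            this.operator = operator;
            this.literal = literal;
        }

        @Override
        public String toString() {
            return field + " " + operator + " " + literal;
        }
    }

    /** Stable reorder: "==" constraints first, the rest after, relative order kept. */
    static List<Constraint> reorder(List<Constraint> constraints) {
        List<Constraint> equalities = new ArrayList<>();
        List<Constraint> others = new ArrayList<>();
        for (Constraint c : constraints) {
            if ("==".equals(c.operator)) {
                equalities.add(c);
            } else {
                others.add(c);
            }
        }
        equalities.addAll(others);
        return equalities;
    }

    public static void main(String[] args) {
        List<Constraint> input = List.of(
                new Constraint("length", ">", "10"),          // hypothetical field
                new Constraint("primaryKey", "==", "true"),
                new Constraint("overridden", "==", "false"));
        System.out.println(reorder(input));
        // prints [primaryKey == true, overridden == false, length > 10]
    }
}
```

Since nothing beyond this is rewritten, the order of patterns and of non-literal constraints stays whatever the rule author wrote.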

Mark
>
> Any other ideas on how to reduce heap memory usage? I'd appreciate any 
> ideas :)
>
> Thanks
> Jevon
>
>
> On Mon, Jul 12, 2010 at 5:56 PM, Jevon Wright <jevon at jevon.org> wrote:
>
>     Hi Wolfgang and Mark,
>
>     Thank you for your replies! You were correct: my eval() functions
>     could generally be rewritten into Drools directly.
>
>     I had one function "connectsDetail" that was constraining
>     unidirectional edges, and could be rewritten from:
>      detail : DetailWire ( )
>      eval ( functions.connectsDetail(detail, source, target) )
>
>     to:
>      detail : DetailWire ( from == source, to == target )
>
>     Another function, "connects", was constraining bidirectional edges,
>     and could be rewritten from:
>      sync : SyncWire( )
>      eval ( functions.connects(sync, source, target) )
>
>     to:
>      sync : SyncWire( (from == source && to == target) || (from == target && to == source) )
>
>     Finally, the "veto" function could be rewritten from:
>      detail : DetailWire ( )
>      eval ( handler.veto(detail) )
>
>     to:
>      detail : DetailWire ( overridden == false )
>
>     I took each of these three changes, and evaluated them separately [1].
>     I found that:
>
>     1. Inlining 'connectsDetail' made a huge difference - 10-30% faster
>     execution and 50-60% less allocated heap.
>     2. Inlining 'connects' made very little difference - 10-30% faster
>     execution, but 0-20% more allocated heap.
>     3. Inlining 'veto' made no difference - no significant change in
>     execution speed or allocated heap.
>
>     I think I understand why inlining 'connects' would increase heap usage
>     - because the rules essentially have more conditionals?
>
>     I also understand why 'veto' made no difference - for most of my test
>     models, "overridden" was never true, so adding this conditional was
>     not making the cross product set any smaller.
>
>     Finally, I also tested simply joining all of the rules together into
>     one file. This happily made no difference at all (although it made the
>     rules more difficult to edit).
>
>     So I think I can safely conclude that eval() should be used as little
>     as possible - however, this means that the final rules are made more
>     complicated and less human-readable, so a DSL may be best for my
>     common rule patterns in the future.
>
>     Thanks again!
>     Jevon
>
>     [1]: http://www.jevon.org/wiki/Improving_Drools_Memory_Performance
>
>     On Sat, Jul 10, 2010 at 12:28 AM, Wolfgang Laun <wolfgang.laun at gmail.com> wrote:
>     > On 9 July 2010 14:14, Mark Proctor <mproctor at codehaus.org> wrote:
>     >>  You have many objects there that are not constrained;
>     >
>     > I have an inkling that the functions.*() are hiding just these constraints.
>     > It's certainly the wrong way, starting with oodles of node pairs, just to
>     > pick out connected ones by fishing for the connecting edge. And this
>     > is worsened by trying to find two such pairs which meet at some
>     > DomainSource.
>     >
>     > Guesswork, hopefully educated ;-)
>     >
>     > -W
>     >
>     >
>     >> if there are
>     >> multiple versions of those objects you are going to get massive amounts
>     >> of cross products. Think in terms of SQL: each pattern you add is like
>     >> an SQL join.
>     >>
>     >> Mark
>     >> On 09/07/2010 09:20, Jevon Wright wrote:
>     >>> Hi everyone,
>     >>>
>     >>> I am working on what appears to be a fairly complex rule base based on
>     >>> EMF. The rules aren't operating over a huge number of facts (less than
>     >>> 10,000 EObjects) and there aren't too many rules (less than 300), but
>     >>> I am having a problem with running out of Java heap space (set at ~400
>     >>> MB).
>     >>>
>     >>> Through investigation, I came to the conclusion that this is due to
>     >>> the design of the rules, rather than the number of facts. The engine
>     >>> uses less memory inserting many facts that use simple rules, compared
>     >>> with inserting few facts that use many rules.
>     >>>
>     >>> Can anybody suggest some tips for reducing heap memory usage in
>     >>> Drools? I don't have a time constraint, only a heap/memory constraint.
>     >>> A sample rule in my project looks like this:
>     >>>
>     >>>    rule "Create QueryParameter for target container of DetailWire"
>     >>>      when
>     >>>        container : Frame( )
>     >>>        schema : DomainSchema ( )
>     >>>        domainSource : DomainSource ( )
>     >>>        instance : DomainIterator( )
>     >>>        selectEdge : SelectEdge ( eval ( functions.connectsSelect(selectEdge, instance, domainSource )) )
>     >>>        schemaEdge : SchemaEdge ( eval ( functions.connectsSchema(schemaEdge, domainSource, schema )) )
>     >>>        source : VisibleThing ( eContainer == container )
>     >>>        target : Frame ( )
>     >>>        instanceSet : SetWire ( eval(functions.connectsSet(instanceSet, instance, source )) )
>     >>>        detail : DetailWire ( )
>     >>>        eval ( functions.connectsDetail(detail, source, target ))
>     >>>        pk : DomainAttribute ( eContainer == schema, primaryKey == true )
>     >>>        not ( queryPk : QueryParameter ( eContainer == target, name == pk.name ) )
>     >>>        eval ( handler.veto( detail ))
>     >>>
>     >>>      then
>     >>>        QueryParameter qp = handler.generatedQueryParameter(detail, target);
>     >>>        handler.setName(qp, pk.getName());
>     >>>        queue.add(qp, drools); // wraps insert(...)
>     >>>
>     >>>    end
>     >>>
>     >>> I try to order the select statements in an order that will reduce the
>     >>> size of the cross-product (in theory), but I also try to keep the
>     >>> rules fairly human readable. I try to avoid comparison operators like
>     >>> < and >. Analysing a heap dump shows that most of the memory is being
>     >>> used in StatefulSession.nodeMemories > PrimitiveLongMap.
>     >>>
>     >>> I am using a StatefulSession; if I understand correctly, I can't use a
>     >>> StatelessSession with sequential mode since I am inserting facts as
>     >>> part of the rules. If I also understand correctly, I'd like the Rete
>     >>> graph to be tall, rather than wide.
>     >>>
>     >>> Some ideas I have thought of include the following:
>     >>> 1. Creating a separate intermediary meta-model to split up the sizes
>     >>> of the rules. e.g. instead of (if A and B and C then insert D), using
>     >>> (if A and B then insert E; if E and C then insert D).
>     >>> 2. Moving eval() statements directly into the Type(...) selectors.
>     >>> 3. Removing eval() statements. Would this allow for better indexing by
>     >>> the Rete algorithm?
>     >>> 4. Reducing the height, or the width, of the class hierarchy of the
>     >>> facts. e.g. Removing interfaces or abstract classes to reduce the
>     >>> possible matches. Would this make a difference?
>     >>> 5. Conversely, increasing the height, or the width, of the class
>     >>> hierarchy. e.g. Adding interfaces or abstract classes to reduce field
>     >>> accessors.
>     >>> 6. Instead of using EObject.eContainer, creating an explicit
>     >>> containment property in all of my EObjects.
>     >>> 7. Creating a DSL that is human-readable, but allows for the
>     >>> automation of some of these approaches.
>     >>> 8. Moving all rules into one rule file, or splitting up rules into
>     >>> smaller files.
>     >>>
>     >>> Is there any kind of profiler for Drools that will let me see the size (or
>     >>> the memory usage) of particular rules, or of the memory used after
>     >>> inference? Ideally I'd use this to profile any changes.
>     >>>
>     >>> Thanks for any thoughts or tips! :-)
>     >>>
>     >>> Jevon
>     >>> _______________________________________________
>     >>> rules-users mailing list
>     >>> rules-users at lists.jboss.org
>     >>> https://lists.jboss.org/mailman/listinfo/rules-users
