[rules-users] Improving Drools Memory Performance

Mon Jul 12 02:51:42 EDT 2010

For #2 try a custom operator for "connects".

GreG

On Jul 12, 2010, at 12:56 AM, Jevon Wright <jevon at jevon.org> wrote:

Hi Wolfgang and Mark,

Thank you for your replies! You were correct: my eval() functions
could generally be rewritten into Drools directly.

I had one function "connectsDetail" that was constraining
unidirectional edges, and could be rewritten from:
 detail : DetailWire ( )
 eval ( functions.connectsDetail(detail, source, target) )

to:
 detail : DetailWire ( from == source, to == target )

Another function, "connects", was constraining bidirectional edges,
and could be rewritten from:
 sync : SyncWire( )
 eval ( functions.connects(sync, source, target) )

to:
 sync : SyncWire( (from == source && to == target) || (from == target && to == source) )
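
The bodies of connectsDetail and connects were not posted in this thread, so the following is only a guess at their logic, written in plain Java to show what the two inlined constraints above check. The class name ConnectsSketch and the nested Wire type are hypothetical stand-ins for DetailWire and SyncWire, and Objects.equals is an assumption about how the endpoints are compared.

```java
import java.util.Objects;

// Plain-Java sketch; none of these names come from the original project.
class ConnectsSketch {

    // Hypothetical stand-in for DetailWire / SyncWire: an edge with two endpoints.
    static final class Wire {
        final Object from;
        final Object to;

        Wire(Object from, Object to) {
            this.from = from;
            this.to = to;
        }
    }

    // Unidirectional check, equivalent to: Wire( from == source, to == target )
    static boolean connectsDetail(Wire w, Object source, Object target) {
        return Objects.equals(w.from, source) && Objects.equals(w.to, target);
    }

    // Bidirectional check, equivalent to:
    // Wire( (from == source && to == target) || (from == target && to == source) )
    static boolean connects(Wire w, Object source, Object target) {
        return connectsDetail(w, source, target) || connectsDetail(w, target, source);
    }

    public static void main(String[] args) {
        Wire ab = new Wire("a", "b");
        System.out.println("detail a->b: " + connectsDetail(ab, "a", "b"));
        System.out.println("detail b->a: " + connectsDetail(ab, "b", "a"));
        System.out.println("sync   b->a: " + connects(ab, "b", "a"));
    }
}
```

The unidirectional form is a plain conjunction of two equality tests, while the bidirectional form is a disjunction of two such conjunctions.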

Finally, the "veto" function could be rewritten from:
 detail : DetailWire ( )
 eval ( handler.veto(detail) )

to:
 detail : DetailWire ( overridden == false )

I took each of these three changes, and evaluated them separately [1].
I found that:

1. Inlining 'connectsDetail' made a huge difference - 10-30% faster
execution and 50-60% less allocated heap.
2. Inlining 'connects' made very little difference - 10-30% faster
execution, but 0-20% more allocated heap.
3. Inlining 'veto' made no difference - no significant change in
execution speed or allocated heap.

I think I understand why inlining 'connects' would increase heap usage
- because the rules essentially have more conditionals?

I also understand why 'veto' made no difference - for most of my test
models, "overridden" was never true, so adding this conditional was
not making the cross product set any smaller.

Finally, I also tested simply joining all of the rules together into
one file. This happily made no difference at all (although it made the
rules more difficult to edit).

So I think I can safely conclude that eval() should be used as little
as possible - however, this means that the final rules are made more
complicated and less human-readable, so a DSL may be best for my
common rule patterns in the future.

Thanks again!
Jevon

[1]: http://www.jevon.org/wiki/Improving_Drools_Memory_Performance

On Sat, Jul 10, 2010 at 12:28 AM, Wolfgang Laun <wolfgang.laun at gmail.com> wrote:
On 9 July 2010 14:14, Mark Proctor <mproctor at codehaus.org> wrote:
 You have many objects there that are not constrained;

I have an inkling that the functions.*() calls are hiding just these constraints.
It's certainly the wrong way, starting with oodles of node pairs, just to
pick out connected ones by fishing for the connecting edge. And this
is worsened by trying to find two such pairs which meet at some
DomainSource.

Guesswork, hopefully educated ;-)

-W

if there are multiple versions of those objects you are going to get massive
amounts of cross products. Think in terms of SQL: each pattern you add is like
an SQL join.
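
To make the join analogy concrete, here is a minimal sketch in plain Java. It does not use or model the Drools API; JoinSketch, Wire, sampleNodes and sampleChain are hypothetical names, and the sizes (100 nodes, 50 wires) are made up for illustration. It counts how many combinations are formed when unconstrained patterns are filtered afterwards, compared with a join that looks wires up by an equality key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration only; it does not use or model the Drools API.
class JoinSketch {

    // Hypothetical stand-in for an edge fact such as DetailWire.
    static final class Wire {
        final String from;
        final String to;

        Wire(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    // Unconstrained patterns followed by a filter: every (source, target, wire)
    // combination is formed before the test runs. Returns {formed, matched}.
    static long[] crossProductThenFilter(List<String> nodes, List<Wire> wires) {
        long formed = 0, matched = 0;
        for (String source : nodes) {
            for (String target : nodes) {
                for (Wire w : wires) {
                    formed++;
                    if (w.from.equals(source) && w.to.equals(target)) matched++;
                }
            }
        }
        return new long[] { formed, matched };
    }

    // Join on an equality key: wires are looked up by their 'from' value, so only
    // combinations that can still match are formed. Assumes node names are unique.
    // Returns {formed, matched}.
    static long[] indexedJoin(List<String> nodes, List<Wire> wires) {
        Map<String, List<Wire>> wiresByFrom = new HashMap<>();
        for (Wire w : wires) {
            wiresByFrom.computeIfAbsent(w.from, k -> new ArrayList<>()).add(w);
        }
        Set<String> nodeSet = new HashSet<>(nodes);
        long formed = 0, matched = 0;
        for (String source : nodes) {
            for (Wire w : wiresByFrom.getOrDefault(source, Collections.<Wire>emptyList())) {
                formed++;
                if (nodeSet.contains(w.to)) matched++;
            }
        }
        return new long[] { formed, matched };
    }

    // Builds the node names n0 .. n(count-1).
    static List<String> sampleNodes(int count) {
        List<String> nodes = new ArrayList<>();
        for (int i = 0; i < count; i++) nodes.add("n" + i);
        return nodes;
    }

    // Builds a chain of wires n0->n1, n1->n2, ..., one per index below count.
    static List<Wire> sampleChain(int count) {
        List<Wire> wires = new ArrayList<>();
        for (int i = 0; i < count; i++) wires.add(new Wire("n" + i, "n" + (i + 1)));
        return wires;
    }

    public static void main(String[] args) {
        List<String> nodes = sampleNodes(100);
        List<Wire> wires = sampleChain(50);
        long[] cross = crossProductThenFilter(nodes, wires);
        long[] indexed = indexedJoin(nodes, wires);
        System.out.println("cross product: formed=" + cross[0] + " matched=" + cross[1]);
        System.out.println("indexed join:  formed=" + indexed[0] + " matched=" + indexed[1]);
    }
}
```

Both methods find the same matches, but the first forms 100 x 100 x 50 combinations to get there, while the second forms only one per wire. How closely this mirrors what the engine keeps in memory depends on the Rete network built for the actual rules, so treat the counts as an analogy only.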

Mark
On 09/07/2010 09:20, Jevon Wright wrote:
Hi everyone,

I am working on what appears to be a fairly complex rule base based on
EMF. The rules aren't operating over a huge number of facts (fewer than
10,000 EObjects) and there aren't too many rules (fewer than 300), but
I am having a problem with running out of Java heap space (set at ~400
MB).

Through investigation, I came to the conclusion that this is due to
the design of the rules, rather than the number of facts. The engine
uses less memory inserting many facts that use simple rules, compared
with inserting few facts that use many rules.

Can anybody suggest some tips for reducing heap memory usage in
Drools? I don't have a time constraint, only a heap/memory constraint.
A sample rule in my project looks like this:

   rule "Create QueryParameter for target container of DetailWire"
     when
       container : Frame( )
       schema : DomainSchema ( )
       domainSource : DomainSource ( )
       instance : DomainIterator( )
       selectEdge : SelectEdge ( eval ( functions.connectsSelect(selectEdge, instance, domainSource )) )
       schemaEdge : SchemaEdge ( eval ( functions.connectsSchema(schemaEdge, domainSource, schema )) )
       source : VisibleThing ( eContainer == container )
       target : Frame ( )
       instanceSet : SetWire ( eval(functions.connectsSet(instanceSet, instance, source )) )
       detail : DetailWire ( )
       eval ( functions.connectsDetail(detail, source, target ))
       pk : DomainAttribute ( eContainer == schema, primaryKey == true )
       not ( queryPk : QueryParameter ( eContainer == target, name == pk.name ) )
       eval ( handler.veto( detail ))

     then
       QueryParameter qp = handler.generatedQueryParameter(detail, target);
       handler.setName(qp, pk.getName());
       queue.add(qp, drools); // wraps insert(...)

   end

I try to order the select statements in an order that will reduce the
size of the cross-product (in theory), but I also try and keep the
rules fairly human readable. I try to avoid comparison operators like
< and >. Analysing a heap dump shows that most of the memory is being
used in StatefulSession.nodeMemories > PrimitiveLongMap.

I am using a StatefulSession; if I understand correctly, I can't use a
StatelessSession with sequential mode since I am inserting facts as
part of the rules. If I also understand correctly, I'd like the Rete
graph to be tall, rather than wide.

Some ideas I have thought of include the following:
1. Creating a separate intermediary meta-model to split up the sizes
of the rules. e.g. instead of (if A and B and C then insert D), using
(if A and B then insert E; if E and C then insert D).
2. Moving eval() statements directly into the Type(...) selectors.
3. Removing eval() statements. Would this allow for better indexing by
the Rete algorithm?
4. Reducing the height, or the width, of the class hierarchy of the
facts. e.g. Removing interfaces or abstract classes to reduce the
possible matches. Would this make a difference?
5. Conversely, increasing the height, or the width, of the class
hierarchy. e.g. Adding interfaces or abstract classes to reduce field
accessors.
6. Instead of using EObject.eContainer, creating an explicit
containment property in all of my EObjects.
7. Creating a DSL that is human-readable, but allows for the
automation of some of these approaches.
8. Moving all rules into one rule file, or splitting up rules into
smaller files.

Is there any kind of profiler for Drools that will let me see the size (or
the memory usage) of particular rules, or of the memory used after
inference? Ideally I'd use this to profile any changes.

Thanks for any thoughts or tips! :-)

Jevon
_______________________________________________
rules-users mailing list
rules-users at lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users
