[rules-users] fact granularity, performance, and other questions

Thu May 28 01:25:07 EDT 2009

I also think that the approach proposed by Greg, with a Message fact
type and various Segment fact types, would let you write efficient and
maintainable rules. (The extreme splitting into Field facts has several
drawbacks, as has been pointed out; in addition, 'value' is always a string,
so you lose type checking.) One thing I could not see is the effect of
the structural layer of transactions. Another question: are there several
(sub)types of Segment? If so, it will be possible to write rules against
(abstract) base types, e.g.

abstract class Segment
class DebitSeg extends Segment
class CreditSeg extends Segment

rule notPositive
when
    $s : Segment( amount <= 0 )
then
    weird( $s );
end

Also, Segment subtypes might implement interfaces, and it's
also possible to use those interfaces as fact types in patterns.
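
To make the idea a bit more concrete, here is a minimal plain-Java sketch of such a fact model. It only mimics, with instanceof checks, what a pattern on a base type or an interface would select; it does not use the Drools API. The 'amount' type, the Reversible interface, and the PatternSketch helper are assumptions made up for illustration, not part of your object model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface, introduced only for this illustration.
interface Reversible {
    boolean isReversal();
}

// Base type from the example above; the long 'amount' field is an assumption.
abstract class Segment {
    private final long amount;

    protected Segment(long amount) {
        this.amount = amount;
    }

    public long getAmount() {
        return amount;
    }
}

class DebitSeg extends Segment implements Reversible {
    private final boolean reversal;

    DebitSeg(long amount, boolean reversal) {
        super(amount);
        this.reversal = reversal;
    }

    @Override
    public boolean isReversal() {
        return reversal;
    }
}

class CreditSeg extends Segment {
    CreditSeg(long amount) {
        super(amount);
    }
}

final class PatternSketch {
    private PatternSketch() {}

    // Plain-Java analogue of the pattern Segment( amount <= 0 ):
    // any fact whose class extends Segment is a candidate.
    static List<Segment> notPositive(List<?> facts) {
        List<Segment> matches = new ArrayList<>();
        for (Object fact : facts) {
            if (fact instanceof Segment && ((Segment) fact).getAmount() <= 0) {
                matches.add((Segment) fact);
            }
        }
        return matches;
    }

    // Analogue of a pattern written against the interface, which would
    // select every fact implementing Reversible, whatever its class.
    static List<Reversible> reversals(List<?> facts) {
        List<Reversible> matches = new ArrayList<>();
        for (Object fact : facts) {
            if (fact instanceof Reversible && ((Reversible) fact).isReversal()) {
                matches.add((Reversible) fact);
            }
        }
        return matches;
    }
}
```

The point is that a single rule against Segment covers both DebitSeg and CreditSeg, while a rule against the interface cuts across the class hierarchy.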

-W

On Thu, May 28, 2009 at 2:17 AM, David Zeigler <dzeigler at gmail.com> wrote:

> Hi,
> I could use some experienced guidance.  I'm in the process of
> evaluating Drools for use in a real-time transactional
> environment to process about 3000 messages/second.  I realize a lot of
> this depends on the type and quantity of rules, hardware, etc.  I'm
> curious what steps others have taken to improve performance and if
> there are any recommendations for my case detailed below.
>
> A few specific questions I have are:
>  - Should each field in a message be a fact? (more info on my message
> below)  What fact granularity have you settled on in your usage and
> why?
>  - Does the order of the conditions in a rule affect performance, the
> execution order, or the structure of the Rete network?
>  - Does the order the facts are inserted into a stateless session (as
> a list via the CommandFactory.newInsertElements) affect performance at
> all?
>
> The message is an EDI format and will typically have anywhere from 80
> to 200 fields, potentially more.  The message is divided into
> transactions, then segments, then fields.  We have an object model
> that represents the message.  We're using a stateless session and most
> of the rules will modify fields or add fields and segments based on
> the values of other fields.  Currently, I flatten the object model
> into a List containing the message, transactions, segments, and each
> field, and then insert the list into the stateless session and fire.
> I'm avoiding sequential mode for now until we have a better idea of
> our requirements.
>
> Here's a simplified example of what I'm doing now (using json-esque
> syntax instead of the EDI format).
> message {
>  segment {
>    SG:header
>    A0:agents
>  }
>  segment {
>    SG:agent
>    A1:000
>    A2:JAMES
>    A3:BOND
>  }
>  segment {
>    SG:agent
>    A1:86
>    A2:MAXWELL
>    A3:SMART
>  }
> }
>
> Each field has an id and a value.  For this message, I would insert 14
> facts:  the message object, 3 segment objects, and 10 field objects.
>
> rule "set James Bond's A1 to 007"
> when
>  $a1 : Field(id == "A1",  value != "007")
>  Field(id == "A2",  value == "JAMES")
>  Field(id == "A3",  value == "BOND")
> then
>  $a1.setValue("007");
>  update($a1);
> end
>
> A relatively more complicated, but typical rule would be "set James
> Bond's A1 to 007 iff he's the second agent in the message and one of
> the other agents' first names does not contain AUSTIN and an agency
> segment exists"
>
> The above example assumes each segment and field is a fact and I think
> it's a clean and flexible approach, but I'm concerned about the
> overhead of inserting potentially 250 facts for each message.  The
> only other alternatives I can think of seem to have their own set of
> problems:
> 1. restrict the fields that rules can be written against to a small
> subset, which may not be feasible depending on how the requirements
> evolve, and only assert those as facts.  Doing this seems to double
> the number of transactions per second in my nonscientific benchmark.
> 2. insert the Message object as a single fact and then write a slew of
> accessor methods in that object to get at all possible fields in the
> tree: getA1FromSecondAgentSegmentInFirstTransaction().  This seems
> like it might perform well, but could be very messy.
> 3. provide an api in the message object model to find various
> occurrences of fields in the messages, then use eval() in the rule,
> like eval(msg.findSegment("agent", secondOccurrence).getField("A1")).
> I've read that this would be less efficient once the ruleset grows.
>
> I'm sure many of you have dealt with this type of scenario before.  What
> did you determine the best approach to be?
>
> Thanks,
> David