[rules-users] Parallel processing of large batches of facts

Michael Anstis michael.anstis at gmail.com
Wed Oct 13 03:57:07 EDT 2010


Hi Tim,

I don't believe the Drools internals provide multi-threading under "usual"
operation.

If you start to use Timers and Calendars in your rules I have a strong
suspicion that it will; however this is an edge case.

CEP has the benefit of automatically garbage collecting facts that can no
longer match patterns; so if your rules reason over a sliding window, some of
your 100Ks of facts may have been purged from WM (working memory), keeping
resource usage lower than perhaps with a stateless session. You could use your
"timestamp" field as the "@timestamp" meta annotation.

I can't say whether this would deliver performance improvements as I feel
any benefit could depend upon the shape of your data; e.g. is the
aggregation over a relatively small range of timestamps vs the whole range
of the batch (i.e. the batch represents 6 hours but your rules look for groups
within 5-minute windows). Ordering the facts by timestamp before insertion
would theoretically preclude the need to have them all in WM at once.
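
To make that last point concrete, here is a rough plain-Java sketch (not
Drools code, and not how Drools implements its windows). It only shows that
with timestamp-ordered input, a window-based aggregation needs to hold just
the facts inside the current window. The class name SlidingWindowSketch, the
method maxHeldInWindow and the use of millisecond timestamps are my own
assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; none of these names come from Drools.
class SlidingWindowSketch {

    /**
     * Walks timestamps (milliseconds, sorted ascending) and returns the largest
     * number of entries that had to be held at any one time for the given window.
     */
    static int maxHeldInWindow(long[] sortedTimestamps, long windowMillis) {
        Deque<Long> window = new ArrayDeque<>();
        int maxHeld = 0;
        for (long ts : sortedTimestamps) {
            // Because input is ordered, anything older than the window can
            // never match a later entry, so it can be dropped immediately.
            while (!window.isEmpty() && ts - window.peekFirst() > windowMillis) {
                window.pollFirst();
            }
            window.addLast(ts);
            maxHeld = Math.max(maxHeld, window.size());
        }
        return maxHeld;
    }
}
```

If the batch spans hours but the window is minutes, the number held at once
should be a small fraction of the batch; if the aggregation spans the whole
batch, nothing is saved.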

Edson is the CEP expert; however, he's at Rules Fest at the moment so I doubt
he'll be able to answer until he comes back.

Possibly helpful, possibly the rant of somebody misinformed.

With kind regards,

Mike

2010/10/13 Tim Jones <jones.tim36 at gmail.com>

> Interesting suggestions. Couple of questions:
>
> Is Drools not internally able to multithread a single execution of
> fireAllRules?
>
> Does CEP offer performance benefits, or is it just a different way of
> structuring the problem?
>
>
> Cheers,
> Tim
>
> 2010/10/12 Michael Anstis <michael.anstis at gmail.com>
>
>> Can the aggregation or timestamp range be used to partition your data?
>>
>> e.g. if you're looking for a data pattern where a fact matches X and Y and
>> Z can the most coarse constraint, say X, not be used to partition?
>>
>> So you may have pre-processing (to partition the data) before hitting other
>> finer-grained rules?
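
A rough plain-Java sketch of that pre-partitioning idea, in case it helps.
It assumes the rules never need to match facts across partitions, which may
well not hold for your data. PartitionSketch, processPartitions, coarseKey
and ruleRun are hypothetical names; ruleRun stands in for whatever evaluates
one partition (for instance a separate rule session per partition) and is
not a Drools API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names come from Drools.
class PartitionSketch {

    /**
     * Groups facts by a coarse key, evaluates each group on a worker thread,
     * and returns one result per group (in no particular order).
     */
    static <F, K, R> List<R> processPartitions(List<F> facts,
                                               Function<F, K> coarseKey,
                                               Function<List<F>, R> ruleRun,
                                               int threads) {
        // Pre-processing step: split the batch on the coarsest constraint.
        Map<K, List<F>> partitions =
                facts.stream().collect(Collectors.groupingBy(coarseKey));

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (List<F> partition : partitions.values()) {
                // Each partition is evaluated independently of the others.
                Callable<R> task = () -> ruleRun.apply(partition);
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get()); // blocks until that partition is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a partition", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a partition failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```
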
>>
>> Would CEP in stream mode provide an opportunity too?
>>
>> 2010/10/12 Wolfgang Laun <wolfgang.laun at gmail.com>
>>
>>> If you have to use a stateful session, with new objects being generated
>>> in RHS code and triggering more rules, then you've had it (since there is no
>>> way to split the 100k facts).
>>>
>>> If you don't create new facts in RHS code, you should investigate
>>> stateless sessions. It should be more efficient.
>>>
>>> Also, think about what your RHS code does. Can this processing be
>>> delegated to another thread? After a rule has fired, the "fate" of this
>>> activation is firmly established; rather than executing (time-consuming I/O?)
>>> operations inline, queue the collected data to a processor thread and let
>>> this one crunch it.
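
One possible shape for that hand-off, sketched in plain Java. A
single-thread executor is used here as the queue plus processor thread;
that is my choice for brevity, not necessarily what Wolfgang had in mind.
RhsOffloadSketch, enqueue and finish are hypothetical names, and the
"handled:" string is just a stand-in for the slow work. A rule's RHS would
call enqueue(...) and return straight away.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from Drools.
class RhsOffloadSketch {

    // One worker thread with an internal FIFO queue of pending tasks.
    private final ExecutorService processor = Executors.newSingleThreadExecutor();
    private final List<String> handled =
            Collections.synchronizedList(new ArrayList<>());

    /** Called from a rule consequence: hands the data over and returns at once. */
    void enqueue(String collectedData) {
        // Stand-in for the time-consuming operation done off the rule thread.
        processor.execute(() -> handled.add("handled:" + collectedData));
    }

    /** Called after the batch: waits for the queue to drain, then returns results. */
    List<String> finish(long timeoutSeconds) {
        processor.shutdown(); // accept no new work, finish what is queued
        try {
            if (!processor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                throw new IllegalStateException("processor did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining the queue", e);
        }
        return handled;
    }
}
```
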
>>>
>>> (I've only thought about this for a few minutes, so there may be other
>>> options.)
>>>
>>> -W
>>>
>>> 2010/10/12 Tim Jones <jones.tim36 at gmail.com>
>>>
>>>>   Hello,
>>>>
>>>>
>>>>
>>>> I’m working on a project that needs a high performance rules system for
>>>> processing batches of objects. Typically I’ll have a dozen or so rules, the
>>>> most complex of which will aggregate several objects based on timestamps and
>>>> specified data patterns. The objects will come in batches of a few 100ks.
>>>> The system is reset back to the starting point after each batch is
>>>> processed.
>>>>
>>>>
>>>>
>>>> My guess at doing this with Drools is that you load up all the rules and
>>>> enter all the objects as “facts”. You then hit fireAllRules and sit back and
>>>> wait. Doing this, I only get so much performance and I can see that it's only
>>>> using a single thread. Is there a way to process the whole lot in a
>>>> parallel or multithreaded way? Unfortunately there's no natural way to
>>>> partition the objects that would make things easier.
>>>>
>>>>
>>>>
>>>> Cheers,
>>>>
>>>> Tim
>>>>
>>>> _______________________________________________
>>>> rules-users mailing list
>>>> rules-users at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/rules-users