[rules-users] Parallel processing of large batches of facts

Tim Jones jones.tim36 at gmail.com
Wed Oct 13 02:41:01 EDT 2010


Interesting suggestions. Couple of questions:

Is Drools not able to multithread a single execution of fireAllRules()
internally?

Does CEP offer performance benefits, or is it just a different way of
structuring the problem?


Cheers,
Tim

2010/10/12 Michael Anstis <michael.anstis at gmail.com>

> Can the aggregation or timestamp range be used to partition your data?
>
> e.g. if you're looking for a data pattern where a fact matches X and Y and
> Z, can the coarsest constraint, say X, not be used to partition?
>
> So you may have pre-processing (to partition the data) before hitting other
> finer-grained rules?
>
> Would CEP in stream mode provide an opportunity as well?
>
> 2010/10/12 Wolfgang Laun <wolfgang.laun at gmail.com>
>
>> If you have to use a stateful session, with new objects being generated
>> in RHS code and triggering more rules, then you've had it (since there is no
>> way to split the 100k facts).
>>
>> If you don't create new facts in RHS code, you should investigate
>> stateless sessions. It should be more efficient.
>>
>> Also, think about what your RHS code does. Can this processing be delegated
>> to another thread? After a rule has fired, the "fate" of this activation is
>> firmly established; rather than executing (time-consuming I/O?) operations
>> inline, queue the collected data to a processor thread and let this one
>> crunch it.
>>
>> (I've only thought about this for a few minutes, so there may be other
>> options.)
>>
>> -W
>>
>> 2010/10/12 Tim Jones <jones.tim36 at gmail.com>
>>
>>>   Hello,
>>>
>>>
>>>
>>> I’m working on a project that needs a high-performance rules system for
>>> processing batches of objects. Typically I’ll have a dozen or so rules, the
>>> most complex of which will aggregate several objects based on timestamps and
>>> specified data patterns. The objects will come in batches of a few hundred
>>> thousand. The system is reset back to the starting point after each batch is
>>> processed.
>>>
>>>
>>>
>>> My guess at doing this with Drools is that you load up all the rules and
>>> enter all the objects as “facts”. You then hit fireAllRules() and sit back and
>>> wait. Doing this, I only get so much performance and I can see that it's only
>>> using a single thread. Is there a way to process the whole lot in a
>>> parallel or multithreaded way? Unfortunately there's no natural way to
>>> partition the objects that would make things easier.
>>>
>>>
>>>
>>> Cheers,
>>>
>>> Tim
>>>
>>> _______________________________________________
>>> rules-users mailing list
>>> rules-users at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/rules-users
>>>
>>>
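
For reference, a minimal sketch of the hand-off Wolfgang describes above
(queue the collected data to a processor thread rather than doing the slow
work inline in the RHS). It is plain Java with no Drools dependency. The
class name RhsHandOff and the methods submit() and finish() are made up for
illustration, and the "handled:" string is only a stand-in for whatever the
real processing would be. How the rule consequences get hold of such an
object (for example as a session global) is an assumption about the setup,
not something stated in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: rule consequences call submit() and return at once,
// while a single worker thread does the slow processing in the background.
public class RhsHandOff {
    // Sentinel telling the worker that no more items will arrive.
    private static final Object POISON = new Object();

    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
    private final List<String> processed = new ArrayList<>();
    private final Thread worker;

    public RhsHandOff() {
        worker = new Thread(() -> {
            try {
                while (true) {
                    Object item = queue.take();
                    if (item == POISON) {
                        return;
                    }
                    // Stand-in for the time-consuming work (I/O etc.).
                    processed.add("handled:" + item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
    }

    // What a rule's RHS would call instead of doing the work inline.
    public void submit(String collectedData) {
        queue.add(collectedData);
    }

    // Call once the batch has been fired: waits for the worker to drain
    // the queue, then returns everything it handled, in submission order.
    public List<String> finish() {
        queue.add(POISON);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed;
    }

    public static void main(String[] args) {
        RhsHandOff handOff = new RhsHandOff();
        for (int i = 0; i < 3; i++) {
            handOff.submit("match-" + i);
        }
        System.out.println(handOff.finish());
    }
}
```

This only moves the consequence work off the rule-firing thread; the rule
matching itself still runs on one thread. With more than one worker the
order of the results would no longer be guaranteed.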