[rules-users] Parallel processing of large batches of facts

Michael Anstis michael.anstis at gmail.com
Wed Oct 13 06:24:17 EDT 2010


Fusion uses exactly the same principles as the core engine; however, the
"life-cycle" of events (facts of type "event") is managed by the engine. You'd
still need to partition to get parallelism; otherwise dependencies between
rules/facts/events in each process can still arise.
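
A minimal sketch of what partitioning could look like, assuming the facts can
be grouped by some coarse key so that no rule has to match across groups. The
names here (PartitionedBatch, Fact, processPartition) are hypothetical and not
part of Drools; processPartition only stands in for "create a session, insert
this partition's facts, fire all rules", so the sketch runs without Drools on
the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;

public class PartitionedBatch {

    // Hypothetical fact type; "key" stands for the coarse constraint used to partition.
    static final class Fact {
        final String key;
        final long timestamp;

        Fact(String key, long timestamp) {
            this.key = key;
            this.timestamp = timestamp;
        }
    }

    // Placeholder for the per-partition rule run (assumption: one independent
    // session per partition). It only counts facts so the sketch is runnable.
    static int processPartition(List<Fact> partition) {
        return partition.size();
    }

    static int processInParallel(List<Fact> batch, int threads) throws Exception {
        // Facts sharing a key land in the same partition.
        Map<String, List<Fact>> partitions =
                batch.stream().collect(Collectors.groupingBy(f -> f.key));

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (List<Fact> partition : partitions.values()) {
                tasks.add(() -> processPartition(partition));
            }
            int total = 0;
            // invokeAll blocks until every partition has been processed.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Fact> batch = List.of(
                new Fact("X1", 1L), new Fact("X2", 2L), new Fact("X1", 3L));
        System.out.println(processInParallel(batch, 2)); // prints 3
    }
}
```

This only helps if the partitions really are independent; a rule that needs
facts from two partitions would silently miss matches.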

2010/10/13 Tim Jones <jones.tim36 at gmail.com>

> I wasn't sure what was possible as I didn't know the inner workings of the
> algorithm. Thanks, you've cleared up my understanding quite a bit.
>
> Do you happen to know if Fusion is any different? Or is it still a case of
> having to partition the data/streams to get parallelism?
>
>
> Cheers,
> Tim
>
> 2010/10/13 Swindells, Thomas <TSwindells at nds.com>
>
>> How would you expect Drools to multithread?
>>
>>
>>
>> From what I understand drools operates in two steps:
>>
>> 1. When facts are inserted, Drools constructs and updates the Rete graph,
>> generating an ordered list of ‘activations’ of rules (and the corresponding
>> data) which can be fired.
>>
>> There is a possibility here that inserts from multiple threads could be
>> made thread safe (if they aren’t already).
>>
>>
>>
>> 2. When fireAllRules is called, it loops over the activation list,
>> executing the ‘then’ part of each rule. This execution may insert or update
>> facts, causing activations to be added or removed.
>>
>>
>>
>> Because rules can alter the activations of other rules, there isn’t any easy
>> way to perform multithreading without introducing race conditions and
>> errors. You can’t execute two activations concurrently because the first
>> (higher priority) activation may perform an update which invalidates the
>> other activation or its preconditions, which would almost
>> certainly cause null pointer exceptions and the like.
>>
>>
>>
>> Of course, if your actions don’t have side effects (as far as the rules are
>> concerned) then there is nothing stopping your actions from dispatching
>> tasks to a worker thread pool which could then perform the actions
>> concurrently.
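
The worker-pool idea above can be sketched roughly as follows. This is an
illustration in plain Java, not Drools code: OffloadedActions, slowAction and
fireAndOffload are hypothetical names, and the loop merely plays the role of
the single-threaded firing loop. It assumes the offloaded work never touches
working memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class OffloadedActions {

    // Stand-in for slow work with no side effects on the rules' facts
    // (for example, formatting a result and writing it somewhere).
    static String slowAction(String matchedData) {
        return "processed:" + matchedData;
    }

    static List<String> fireAndOffload(List<String> activations, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            // Each "consequence" only hands its matched data to the pool and
            // returns at once, so the firing loop itself stays sequential.
            for (String matchedData : activations) {
                pending.add(pool.submit(() -> slowAction(matchedData)));
            }
            // Collect results in submission order once the workers finish.
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // prints [processed:a, processed:b, processed:c]
        System.out.println(fireAndOffload(List.of("a", "b", "c"), 2));
    }
}
```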
>>
>>
>>
>> Thomas
>>
>>
>>
>> *From:* rules-users-bounces at lists.jboss.org [mailto:
>> rules-users-bounces at lists.jboss.org] *On Behalf Of *Tim Jones
>> *Sent:* 13 October 2010 07:41
>> *To:* Rules Users List
>> *Subject:* Re: [rules-users] Parallel processing of large batches of
>> facts
>>
>>
>>
>> Interesting suggestions. Couple of questions:
>>
>>
>>
>> Is Drools not internally able to multithread a single execution of
>> fireAllRules?
>>
>>
>>
>> Does CEP offer performance benefits, or is it just a different way of
>> structuring the problem?
>>
>>
>>
>>
>>
>> Cheers,
>>
>> Tim
>>
>> 2010/10/12 Michael Anstis <michael.anstis at gmail.com>
>>
>> Can the aggregation or timestamp range be used to partition your data?
>>
>> e.g. if you're looking for a data pattern where a fact matches X and Y and
>> Z, can the coarsest constraint, say X, not be used to partition?
>>
>> So you may have pre-processing (to partition the data) before hitting other
>> finer-grained rules?
>>
>> Would CEP in stream mode provide an opportunity too?
>>
>> 2010/10/12 Wolfgang Laun <wolfgang.laun at gmail.com>
>>
>>
>>
>> If you have to use a stateful session, with new objects being generated in
>> RHS code and triggering more rules, then you've had it (since there is no
>> way to split the 100k facts).
>>
>> If you don't create new facts in RHS code, you should investigate
>> stateless sessions. It should be more efficient.
>>
>> Also, think about what your RHS code does. Can this processing be delegated
>> to another thread? After a rule has fired, the "fate" of this activation is
>> firmly established; rather than executing (time-consuming I/O?) operations
>> inline, queue the collected data to a processor thread and let that thread
>> crunch it.
>>
>> (I've only thought about this for a few minutes, so there may be other
>> options.)
>>
>> -W
>>
>> 2010/10/12 Tim Jones <jones.tim36 at gmail.com>
>>
>> Hello,
>>
>>
>>
>> I’m working on a project that needs a high-performance rules system for
>> processing batches of objects. Typically I’ll have a dozen or so rules, the
>> most complex of which will aggregate several objects based on timestamps and
>> specified data patterns. The objects will come in batches of a few hundred
>> thousand. The system is reset back to the starting point after each batch is
>> processed.
>>
>>
>>
>> My guess at doing this with Drools is that you load up all the rules and
>> enter all the objects as “facts”. You then hit fireAllRules and sit back and
>> wait. Doing this, I only get so much performance and I can see that it's only
>> using a single thread. Is there a way to process the whole lot in a
>> parallel or multithreaded way? Unfortunately there's no natural way to
>> partition the objects that would make things easier.
>>
>>
>>
>>
>>
>> Cheers,
>>
>> Tim
>>
>>
>>
>> _______________________________________________
>> rules-users mailing list
>> rules-users at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/rules-users