Fusion uses exactly the same principles as the core engine; however, the "life-cycle" of events (facts of type "event") is managed by the engine. You'd still need to partition to get parallelism; otherwise the possibility of dependencies between rules/facts/events in each process still arises.
I wasn't sure what was possible as I didn't know the inner workings of the algorithm. Thanks, you've cleared up my understanding quite a bit.
Do you happen to know if Fusion is any different? Or is it still a case of having to partition the data/streams to get parallelism?
Cheers,
Tim
2010/10/13 Swindells, Thomas <TSwindells@nds.com>
How would you expect Drools to multithread?
From what I understand, Drools operates in two steps:
1. When facts are inserted, Drools constructs and updates the Rete graph, generating an ordered list of ‘activations’ of rules (and the corresponding data) which can be fired.
There is a possibility here that inserts from multiple threads could be made thread safe (if it isn’t already).
2. When fireAllRules is called it loops over the activation list, executing the then part of the rule. This execution may insert or update facts causing activations to be added or removed.
Because rules can alter the activation of other rules there isn’t any easy way to perform multithreading without introducing race conditions and errors. You can’t execute two activations concurrently because the first (higher priority) activation may perform an update which invalidates the other activation or invalidates the preconditions which would almost certainly cause null pointer exceptions and the like.
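To make the invalidation problem concrete, here is a toy model of an activation list. It is only a sketch: the class ToyAgenda, the rule names and the fact name are invented for illustration and say nothing about how Drools implements its agenda internally.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: hypothetical names, NOT the Drools implementation.
class ToyAgenda {
    // An activation pairs a rule name with the fact it matched.
    static final class Activation {
        final String rule;
        final String fact;
        Activation(String rule, String fact) { this.rule = rule; this.fact = fact; }
    }

    // Fires activations in order and returns the names of the rules that fired.
    static List<String> fireAll(Set<String> facts, Deque<Activation> agenda) {
        List<String> fired = new ArrayList<>();
        while (!agenda.isEmpty()) {
            Activation a = agenda.poll();
            fired.add(a.rule);
            if (a.rule.equals("retractOrder")) {
                // The consequence retracts the fact ...
                facts.remove(a.fact);
                // ... so every pending activation on that fact is cancelled.
                agenda.removeIf(other -> other.fact.equals(a.fact));
            }
        }
        return fired;
    }

    static List<String> demo() {
        Set<String> facts = new HashSet<>();
        facts.add("order-1");
        Deque<Activation> agenda = new ArrayDeque<>();
        agenda.add(new Activation("retractOrder", "order-1")); // higher priority
        agenda.add(new Activation("shipOrder", "order-1"));    // depends on order-1
        return fireAll(facts, agenda);
    }
}
```

Fired sequentially, only retractOrder runs, because shipOrder is cancelled before its turn comes. Had both been started at the same time, shipOrder could have run against a fact that no longer exists.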
Of course, if your actions don’t have side effects (as far as the rules are concerned) then there is nothing stopping your actions from dispatching tasks to a worker thread pool which could then perform the actions concurrently.
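A minimal sketch of that worker-pool idea, using only java.util.concurrent. ActionDispatcher, dispatch, slowAction and awaitResults are hypothetical names, not part of the Drools API; the assumption is that the dispatcher is made available to the rules (for example as a global) and that the dispatched work never touches working memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: nothing here is part of the Drools API.
class ActionDispatcher {
    private final ExecutorService pool;
    private final List<Future<String>> pending = new ArrayList<>();

    ActionDispatcher(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Intended to be called from a rule's "then" part. It returns at once;
    // the slow work runs on a pool thread and must not touch working memory.
    void dispatch(String payload) {
        pending.add(pool.submit(() -> slowAction(payload)));
    }

    // Stand-in for slow, side-effect-free work (I/O, formatting, ...).
    static String slowAction(String payload) {
        return "processed:" + payload;
    }

    // Call after fireAllRules() has returned: waits for all dispatched work
    // and returns the results in the order they were dispatched.
    List<String> awaitResults() {
        List<String> results = new ArrayList<>();
        try {
            for (Future<String> f : pending) {
                results.add(f.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

The rule engine itself stays single-threaded here; only the work that follows a firing is spread over the pool.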
Thomas
From: rules-users-bounces@lists.jboss.org [mailto:rules-users-bounces@lists.jboss.org] On Behalf Of Tim Jones
Sent: 13 October 2010 07:41
To: Rules Users List
Subject: Re: [rules-users] Parallel processing of large batches of facts
Interesting suggestions. Couple of questions:
Is Drools not internally able to multithread a single execution of fireAllRules?
Does CEP offer performance benefits, or is it just a different way of structuring the problem?
Cheers,
Tim
2010/10/12 Michael Anstis <michael.anstis@gmail.com>
Can the aggregation or timestamp range be used to partition your data?
e.g. if you're looking for a data pattern where a fact matches X and Y and Z, can the coarsest constraint, say X, not be used to partition?
So you may have pre-processing (to partition the data) before hitting other finer-grained rules?
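As a rough illustration of such a pre-processing step, the sketch below groups facts by a coarse key and evaluates each group independently. The Reading type, its region field and the evaluate method are assumptions made up for the example; evaluate merely sums values as a stand-in for "insert this partition into its own session and fire the rules".

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: none of these names come from Drools or from the thread.
class PartitionSketch {
    // Hypothetical fact type: "region" plays the role of the coarse constraint X.
    static final class Reading {
        final String region;
        final int value;
        Reading(String region, int value) { this.region = region; this.value = value; }
    }

    // Pre-processing step: group the facts by the coarse key.
    static Map<String, List<Reading>> partition(List<Reading> facts) {
        return facts.stream().collect(Collectors.groupingBy(r -> r.region));
    }

    // Stand-in for running the finer-grained rules over one partition.
    static int evaluate(List<Reading> partition) {
        return partition.stream().mapToInt(r -> r.value).sum();
    }

    // Partitions share no facts, so they can be evaluated on separate threads.
    static Map<String, Integer> evaluateAll(List<Reading> facts) {
        return partition(facts).entrySet().parallelStream()
                .collect(Collectors.toConcurrentMap(Map.Entry::getKey, e -> evaluate(e.getValue())));
    }
}
```

This only helps if no rule needs to match facts across two partitions, which is exactly the property the coarse constraint would have to guarantee.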
Would CEP in stream mode provide an opportunity too?
2010/10/12 Wolfgang Laun <wolfgang.laun@gmail.com>
If you have to use a stateful session, with new objects being generated in RHS code and triggering more rules, then you've had it (since there is no way to split the 100k facts).
If you don't create new facts in RHS code, you should investigate stateless sessions. It should be more efficient.
Also, think about what your RHS code does. Can this processing be delegated to another thread? After a rule has fired, the "fate" of this activation is firmly established; rather than executing (time-consuming I/O?) operations inline, queue the collected data to a processor thread and let this one crunch it.
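One possible shape for that hand-off, sketched with a BlockingQueue and a single processor thread. QueuedProcessor, enqueue and finish are hypothetical names, and the "done:" prefix is just a placeholder for whatever slow processing the RHS would otherwise do inline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: not part of the Drools API.
class QueuedProcessor {
    // Distinct String instance used as an end-of-input marker (compared by identity).
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> processed = new ArrayList<>();
    private final Thread worker = new Thread(this::drain, "rhs-processor");

    QueuedProcessor() {
        worker.start();
    }

    // Intended to be called from the RHS: hand over the data and return immediately.
    void enqueue(String data) {
        queue.add(data);
    }

    // Runs on the processor thread until the STOP marker arrives.
    private void drain() {
        try {
            for (String item = queue.take(); item != STOP; item = queue.take()) {
                processed.add("done:" + item); // stand-in for the slow work
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Call once after fireAllRules() has returned: waits for the queue to drain.
    List<String> finish() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed;
    }
}
```

With a single consumer the items are processed in the order the rules queued them; several consumers would give more throughput at the cost of that ordering.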
(I've only thought about this for a few minutes, so there may be other options.)
-W
2010/10/12 Tim Jones <jones.tim36@gmail.com>
Hello,
I’m working on a project that needs a high-performance rules system for processing batches of objects. Typically I’ll have a dozen or so rules, the most complex of which will aggregate several objects based on timestamps and specified data patterns. The objects will come in batches of a few hundred thousand. The system is reset back to the starting point after each batch is processed.
My guess at doing this with Drools is that you load up all the rules and enter all the objects as “facts”. You then hit fireAllRules and sit back and wait. Doing this, I only get so much performance and I can see that it's only using a single thread. Is there a way to process the whole lot in a parallel or multithreaded way? Unfortunately there's no natural way to partition the objects that would make things easier.
Cheers,
Tim
_______________________________________________
rules-users mailing list
rules-users@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users