[rules-users] Drools Expert scalability

Swindells, Thomas TSwindells at nds.com
Thu Aug 12 08:00:17 EDT 2010


Why not send the whole batch file in one go? That is likely to be much more efficient. If your criteria are contained within a single batch file, this also solves the interdependent fact problem.

If you have interdependent facts and want to run them one fact at a time, but don't know how your facts are interdependent, then you have a problem: the only solution is a totally linear process.

Thomas

From: rules-users-bounces at lists.jboss.org [mailto:rules-users-bounces at lists.jboss.org] On Behalf Of Dieter D'haeyere
Sent: 12 August 2010 12:47
To: Rules Users List
Subject: Re: [rules-users] Drools Expert scalability

Well, the batch file wouldn't be delivered as a whole to the rule engine. The app would parse the file and send cases one at a time to the rule engine(s).

To dispatch to the rule engine instances, one would need to know which cases are being handled by the running engine(s) before deciding which engine a new case can be sent to.
The criterion on actor id was just an example; it could change over time. That's why I think it is hard to say that scalability is achieved simply by adding more server instances.

Load (for one application; several are using the rule engine): roughly 100K requests per hour.

Of course, if the engine is fast enough to evaluate everything in a single instance, there is no problem at all :)
Buy some faster hardware? :)
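
To put the load figure in perspective, here is a rough calculation. It assumes the roughly 100K requests per hour are spread evenly and handled strictly one after another by a single engine instance (both are assumptions, not stated in the thread):

```latex
\frac{100\,000\ \text{requests}}{3600\ \text{s}} \approx 27.8\ \text{requests/s},
\qquad
\frac{3600\ \text{s}}{100\,000\ \text{requests}} = 0.036\ \text{s} = 36\ \text{ms per request}
```

So a single sequential engine would have an average budget of about 36 ms per case for this one application. Bursts would tighten that budget.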



2010/8/12 Wolfgang Laun <wolfgang.laun at gmail.com>
Where is the limit for mutual case dependency? OK, you can't split a batch,
but will the "multiple batch files per day" be independent of each other?

If "actor" is the only criterion for dependencies, then you could split
cases to different rule engines (on different CPUs) by actor id (odd-even)
or some similar thing. Preprocessing to determine the server target could
also be done by rules :-)
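
The split by actor id could be sketched as follows, as a minimal illustration rather than anything from Drools itself. The `Case` class, its fields and `ActorPartitioner` are hypothetical names, and the sketch assumes the actor id is numeric and is the only dependency criterion. With two engines the modulo gives exactly the odd-even split.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: routes parsed cases to rule engine partitions by actor id.
final class ActorPartitioner {

    // Hypothetical case record; only the actor id matters for routing.
    static final class Case {
        public final long actorId;
        public final String payload;

        Case(long actorId, String payload) {
            this.actorId = actorId;
            this.payload = payload;
        }
    }

    // Maps an actor id to a partition in [0, engineCount); odd/even when engineCount == 2.
    static int partitionFor(long actorId, int engineCount) {
        return (int) Math.floorMod(actorId, (long) engineCount);
    }

    // Splits a batch into one list per engine, keeping the original order inside each list.
    static List<List<Case>> split(List<Case> batch, int engineCount) {
        List<List<Case>> partitions = new ArrayList<>();
        for (int i = 0; i < engineCount; i++) {
            partitions.add(new ArrayList<>());
        }
        for (Case c : batch) {
            partitions.get(partitionFor(c.actorId, engineCount)).add(c);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Case> batch = Arrays.asList(
                new Case(7L, "case-a"), new Case(8L, "case-b"), new Case(7L, "case-c"));
        List<List<Case>> partitions = split(batch, 2);
        for (int i = 0; i < partitions.size(); i++) {
            System.out.println("engine " + i + ": " + partitions.get(i).size() + " case(s)");
        }
    }
}
```

All cases for one actor end up in the same partition, so each engine can process its list independently of the others. If the dependency criterion changes, only `partitionFor` would need to change.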

I'd rather not try and go for engine pipelining. Transferring data from
one stage to the next causes overhead, not only for the transport but also
for WME bookkeeping and re-evaluation of patterns that are the same
in stage N and N+1, and which could be factored into a single Rete.

How many cases per day are there? How much delay is acceptable
between starting a batch and obtaining the results? The 50k alone
aren't sufficient to say that there is "an issue".

-W
2010/8/12 Dieter D'haeyere <dieter.dhaeyere at gmail.com>
I have a question regarding the scalability of Drools Expert.

Situation is as follows:
- Individual cases have to be evaluated by the rule engine (the result is a tariff that could be refunded).
- The application receives the cases in flat files of 50K cases each.  The app parses them and sends them one by one to the rule engine.
- The application receives multiple batch files per day.
- Separate cases can be dependent on each other.  E.g. if at most 1 refund can be made to an actor, two cases about the same actor cannot run in parallel.
- Loads will be huge, so scalability is an issue.

So, what I see now:
- Drools Server can be run as a server.
- It is possible to have multiple instances of Drools Server, which would allow cases to be evaluated in parallel.  But ... this can cause problems (as stated before): you can't run just any two cases in parallel.  Preprocessing could be done by the application (e.g. determining the order in which to present the cases to the rule engine), but over time extra constraints can appear, so the preprocessing would have to be maintained continuously.
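
One way to keep dependent cases from running in parallel, while still evaluating unrelated cases concurrently, is to give each partition its own single-threaded lane. The sketch below uses only `java.util.concurrent`. `SerializedLanes` is a hypothetical name, the `Runnable` stands in for whatever call actually hands a case to a rule engine, and keying on a numeric actor id is an assumption taken from the example constraint above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical dispatcher: one single-threaded executor ("lane") per engine instance.
final class SerializedLanes {
    private final List<ExecutorService> lanes = new ArrayList<>();

    SerializedLanes(int laneCount) {
        for (int i = 0; i < laneCount; i++) {
            lanes.add(Executors.newSingleThreadExecutor());
        }
    }

    // Cases for the same actor always go to the same lane, so they run one
    // after another in submission order; different lanes run in parallel.
    void submit(long actorId, Runnable evaluation) {
        int lane = (int) Math.floorMod(actorId, (long) lanes.size());
        lanes.get(lane).execute(evaluation);
    }

    // Stops accepting work and waits for queued cases; returns false on timeout or interrupt.
    boolean drain(long timeoutSeconds) {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
        try {
            boolean done = true;
            for (ExecutorService lane : lanes) {
                done &= lane.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
            }
            return done;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        SerializedLanes lanes = new SerializedLanes(2);
        lanes.submit(7L, () -> System.out.println("actor 7, first case"));
        lanes.submit(8L, () -> System.out.println("actor 8, only case"));
        lanes.submit(7L, () -> System.out.println("actor 7, second case"));
        lanes.drain(5);
    }
}
```

This does not remove the maintenance concern raised above: if new constraints appear, the lane key has to change with them.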

So, this is an issue :)

My questions are:
- Is it possible, e.g., to organise rules so that different rules run on different rule engines?  That way you could have some kind of pipeline.
Maybe this could be defined with ruleflow?  Compare it to the pipelines in CPUs (for executing machine code).  Could you, e.g., relate one server instance to one set of rules and another server instance to the next set of rules in the sequence?  How would you configure that (performance is of course important)?
- Or is it possible to split the rules not as a pipeline but in parallel, like a 'fork .. join'?  Again, could this be configured with ruleflow?
- What do you see as the best way to solve this issue?


Any help is welcome,
Dieter D'haeyere.

_______________________________________________
rules-users mailing list
rules-users at lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users



