Vlad,
It is common to compare SQL engines and rule engines. In a certain
way they do the same thing: select elements matching patterns. But the
algorithms and technologies used are different. In terms of performance,
forward-chaining rule engines (Drools, for instance) benefit greatly
from 2 restrictions that databases don't have.
1. Rule engines know all rules (queries) in advance.
You are probably aware of the performance difference between sending a
sequence of ad hoc queries to a database and creating a prepared
statement and using it for the same sequence of queries, right? Why
does it happen? Because the prepared statement is compiled (and
optimized) in advance! That means it can be much faster than compiling
(or interpreting) each ad hoc query again and again as it arrives. What
happens with rule engines is that they not only know the rules in
advance, but they know ALL rules in advance. So, when you load (compile)
a rulebase with your 400 rules, the engine optimizes the execution of
all 400 at once.
It means that if, for instance, 200 of your rules use Account( number <
9000 ), the engine will evaluate that constraint only once and use the
result for all 200 rules (sharing the constraint between your rules).
In the case of a database, you will have each query optimized, but you
will not have all 400 optimized together, which means you risk
evaluating the same constraint 200 times. (We know that caching,
indexing and the like help, but it is not the same.)
2. Forward-chaining rule engines usually have all data in memory.
There are ways to pull data as needed, but the general use case is
that all the objects you want to apply rules to are already loaded
into memory. It means much, much faster execution (and usually a much
smaller dataset) than what you have in a database. Rule engines use
this restriction to implement optimizations that a database can't,
since a database usually manages data that does not fit into memory at
once.
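To make the constraint sharing from point 1 concrete, here is a rough
Java sketch. It is only an illustration of the idea, not how the Drools
network is actually implemented: the class SharedConstraintSketch, the
Account stand-in and the two methods are names made up for this example.
It counts how many times the constraint number < 9000 is evaluated when
each of 200 rules checks it separately, versus when it is checked once
per account and the result is reused.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; this is NOT the engine's real implementation.
class SharedConstraintSketch {

    // Minimal stand-in for the Account fact discussed in this thread.
    static final class Account {
        final int number;
        Account(int number) { this.number = number; }
    }

    // The constraint "number < 9000", instrumented to count its evaluations.
    static final class CountingConstraint implements Predicate<Account> {
        int evaluations = 0;
        @Override
        public boolean test(Account a) {
            evaluations++;
            return a.number < 9000;
        }
    }

    // One "query" per rule: every rule re-evaluates the constraint on every account.
    static int evaluatePerRule(List<Account> accounts, int ruleCount) {
        CountingConstraint constraint = new CountingConstraint();
        for (int rule = 0; rule < ruleCount; rule++) {
            for (Account a : accounts) {
                constraint.test(a);
            }
        }
        return constraint.evaluations;
    }

    // Shared evaluation: the constraint runs once per account and the
    // matching accounts are handed to all rules.
    static int evaluateShared(List<Account> accounts, int ruleCount) {
        CountingConstraint constraint = new CountingConstraint();
        List<Account> matches = new ArrayList<>();
        for (Account a : accounts) {
            if (constraint.test(a)) {
                matches.add(a);
            }
        }
        int handedOver = 0;
        for (int rule = 0; rule < ruleCount; rule++) {
            // Each rule would continue with its own remaining constraints here.
            handedOver += matches.size();
        }
        return constraint.evaluations;
    }

    public static void main(String[] args) {
        List<Account> accounts = new ArrayList<>();
        for (int n = 0; n < 10000; n += 1000) {
            accounts.add(new Account(n)); // 10 made-up accounts
        }
        System.out.println("per rule: " + evaluatePerRule(accounts, 200));
        System.out.println("shared:   " + evaluateShared(accounts, 200));
    }
}
```

With the 10 made-up accounts in main, the first approach evaluates the
constraint 2000 times and the second only 10 times.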
In this regard, what you usually see is databases and rule engines
working together. Databases take care of data while rule engines take
care of the pattern matching and rules execution.
Having said that, I have no doubt that your 400 rules will run orders
of magnitude faster in the engine than 400 queries run against the
database. But
remember that each technology will excel in its own field. So, keep the
rules at the engine and the data at the database. :)
Regarding books, there is a good list at the JBoss Rules page:
http://labs.jboss.com/portal/jbossrules/docs
Regarding rules optimization, just as with SQL, there is some general
advice that always applies, but the last mile always depends on the
product you are using. The JBoss Rules documentation has some tips, and
people are always helpful here on the list. But rest assured that if
you do a PoC comparing manually implemented rules (be it SQL, XPath or
any other similar technology) with a full rules engine
implementation... well, there is no doubt what you will choose for
performance.
[]s
Edson
Olenin, Vladimir (MOH) wrote:
Hi, Edson.
Thanks A LOT for the explanations - that significantly cleared things up.
So, would it be correct to say that DROOLS is a complete equivalent
of SQL? How much more/less optimized is it in comparison with a relational DB
implementation (eg, in the example of the embedded select statement you gave)? I
know it might sound like trying to compare apples & oranges, but provided
that I'd have to
- either run 400 similar SQL queries with different combinations of field
constraints
- or process 400 rules in the rule engine (the same constraints would
apply)
how might that compare?
Is there any good book on formulating / optimizing rules that you can
recommend?
Thanks.
Vlad
-----Original Message-----
From: rules-users-bounces(a)lists.jboss.org
[mailto:rules-users-bounces@lists.jboss.org] On Behalf Of Edson Tirelli
Sent: 02 February 2007 17:09
To: Rules Users List
Subject: Re: [rules-users] advice is needed: rule based processing of
interconnected facts
Vlad,
That's what the engine does... it's like SQL. Imagine you have an
"Account" table that has a "number" column. You could write a SQL
query like:
select * from account a
where (number % 10) < 5
and number < 9000
and 0 = (select count(*) from account b where b.number =
(a.number+1000) )
I'm writing it from memory, so there may be syntax errors... :) but
I think you get the idea.
You don't write an algorithm saying:
"for each record in account table..."
The SQL engine iterates over the table for you.
The same happens with Rule Engines. In the case of JBoss Rules,
instead of tables, you have classes (Account). Instead of columns, you
have class attributes (number).
If you write a rule like this:
rule "missing accounts"
when
$a : Account( $number : number -> ( $number % 10 < 5 ), number < 9000 )
not Account( number == ( $number + 1000 ) )
then
// $a does not have a matching primary account
end
You are telling the engine to iterate over all Account instances and,
for each of them, bind the variable and apply the constraints; when a
full match is found, the consequence is executed.
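For comparison, this is roughly what you would have to write by hand in
plain Java to get the same result as the rule above. It is only a
sketch: MissingAccountsSketch and missingAccounts are hypothetical
names, accounts are reduced to plain integers, and the numbers in main
are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hand-written equivalent of the "missing accounts" rule (illustration only).
class MissingAccountsSketch {

    // Returns every account number that satisfies the first pattern
    // (number % 10 < 5 and number < 9000) but has no account with number + 1000.
    static List<Integer> missingAccounts(List<Integer> accountNumbers) {
        // Index all known numbers so the "not Account(...)" check is a lookup.
        Set<Integer> known = new HashSet<>(accountNumbers);
        List<Integer> unmatched = new ArrayList<>();
        // This explicit loop is what the engine does for you.
        for (int number : accountNumbers) {
            boolean candidate = number % 10 < 5 && number < 9000;
            if (candidate && !known.contains(number + 1000)) {
                unmatched.add(number); // the rule's consequence would fire here
            }
        }
        return unmatched;
    }

    public static void main(String[] args) {
        // Made-up account numbers, not the data set from this thread.
        List<Integer> numbers = Arrays.asList(1, 1001, 2, 7, 9500);
        System.out.println(missingAccounts(numbers));
    }
}
```

With a rule engine, the loop and the lookup disappear from your code;
you only state the patterns.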
[]s
Edson
Olenin, Vladimir (MOH) wrote:
>Hi, Edson,
>
>I was going through your rule snippets and I couldn't understand one
>thing very well:
>
>-------------
>when
> $a : Account( $number : number -> ( number % 10 < 5 ), number < 9000 )
> not Account( number == ( $number + 1000 ) ) then
>-------------
>
>The 'number' variable refers to the 'fact' in the working memory, correct?
>Basically it means I have only one particular number to compare ALL accounts
>(from the data sheet) with?
>
>If so, it's not what I actually need to achieve. I need to compare all
>accounts with 'each other', all of them coming from the same data sheet. So,
>I guess it has to be an iteration through all the facts, comparing each fact
>with every other one.
>
>Or does the snippet above do exactly that? Ie, iterating through all the
>facts?
>
>In other words, I'd be initializing working memory ONLY with the facts
>below:
>
> for (Iterator it = accountsFromDataSheet.iterator(); it.hasNext(); ) {
> Account account = (Account)it.next();
> workingMemory.assertObject(account);
> }
>
>After which the rules must operate on the facts loaded...
>
>Thanks.
>
>Vlad
>
>
>-----Original Message-----
>From: rules-users-bounces(a)lists.jboss.org
>[mailto:rules-users-bounces@lists.jboss.org] On Behalf Of Edson Tirelli
>Sent: 02 February 2007 11:13
>To: Rules Users List
>Subject: Re: [rules-users] advice is needed: rule based processing of
>interconnected facts
>
> Hi Vlad,
>
> This is a case where you can apply business rules with good results.
> In the end, it all depends on how you model your Business Objects,
>but let's look at some examples:
>
>>1) for all primary accounts 'zxxy' where y < 5, there should be a matching
>>primary account '(z+1)xxy'
>> - [this one is true for the dataset above]
>>
> My understanding is that you are validating your accounting plan, so
>you may have an Account object in your model. So, if you want to report
>inconsistencies, you can do something like:
>
>rule "missing accounts"
>when
> $a : Account( $number : number -> ( number % 10 < 5 ), number < 9000 )
> not Account( number == ( $number + 1000 ) )
>then
> // $a does not have a matching primary account
>end
>
> Please, note that the "formulas" I used above may not be the best way
>to do it... they are only a possible representation of what you said.
>
>>2) sumOfDebit(primary + matching_primary + secondary_account) -
>>sumOfCredit(primary + matching_primary + secondary_account) must be = 0
>> - [this one is also true]
>>
> Here, it seems you are referring to a set of transactions, so you
>might have a set of transaction objects to represent the transactions in
>your sample. So, a possible representation would be:
>
>rule "transaction consistency"
>when
>    Transaction( $id : id )
>    $credits: Double( )
>        from accumulate( TransactionEntry( id == $id, operation == "credit", $amount : amount ),
>                         init( double balance = 0 ),
>                         action( balance += $amount ),
>                         result( new Double( balance ) ) );
>    $debits: Double( )
>        from accumulate( TransactionEntry( id == $id, operation == "debit", $amount : amount ),
>                         init( double balance = 0 ),
>                         action( balance += $amount ),
>                         result( new Double( balance ) ) );
>    eval( ! $credits.equals( $debits ) )
>then
>    // inconsistency for transaction $id
>end
>
> Again, this is not the only way or the best way... it is just an example.
> Also, for the above examples, I used syntax/features of the jbrules
>3.1 version.
>
> Hope it helps.
>
> []s
> Edson
>
>
>Olenin, Vladimir (MOH) wrote:
>
>>Ok, approx data set:
>>
>>Primary Account | Secondary Account | Operation | Amount | Type | Owner
>>------------------------------------------------------------------------
>>0001 | | debit | 100 | A | M
>>1001 | | credit | 80 | A | F
>>1001 | | credit | 20 | X | F
>>0002 | 2002 | debit | 50 | B | M
>>2002 | | debit | 20 | B | M
>>1002 | | credit | 70 | C | M
>>
>>Rules:
>>
>>1) for all primary accounts 'zxxy' where y < 5, there should be a matching
>>primary account '(z+1)xxy'
>> - [this one is true for the dataset above]
>>2) sumOfDebit(primary + matching_primary + secondary_account) -
>>sumOfCredit(primary + matching_primary + secondary_account) must be = 0
>> - [this one is also true]
>>3) OwnerOf (primary_account, matching_primary, secondary_account) must be of
>>the same gender
>> - [this one is false - 0001 owner must be 'F']
>>
>>.... The kind of the rules above... The dataset is more complex and the
>>rules are a bit more involved, but this should give an idea.
>>
>>Thanks for all suggestions!
>>
>>Vlad
>>
>>-----Original Message-----
>>From: rules-users-bounces(a)lists.jboss.org
>>[mailto:rules-users-bounces@lists.jboss.org] On Behalf Of Michael Rhoden
>>Sent: 01 February 2007 17:49
>>To: 'Rules Users List'
>>Subject: RE: [rules-users] advice is needed: rule based processing
>>of interconnected facts
>>
>>Can you post a couple of example conditions with a dataset you want to
>>check?
>>
>>-Michael
>>
>>-----Original Message-----
>>From: rules-users-bounces(a)lists.jboss.org
>>[mailto:rules-users-bounces@lists.jboss.org] On Behalf Of Olenin, Vladimir
>>(MOH)
>>Sent: Thursday, February 01, 2007 4:04 PM
>>To: rules-users(a)lists.jboss.org
>>Subject: [rules-users] advice is needed: rule based processing
>>of interconnected facts
>>
>>Hi,
>>
>>
>>
>>I need some pointers as to where to start with the problem below.
>>
>>
>>
>>The system should be able to validate the balancing data based on around 400
>>different rules. To simplify the task, the data facts are essentially the
>>debit/credit transactions on different accounts. The rules describe the
>>correlation between different facts:
>>
>>- eg, all debit transactions minus all credit transactions must be
>>equal 0
>>
>>- if one account got credited, there should be another account
>>(within the same dataset) which was debited
>>
>>- if there are accounts starting with some letter combination,
>>there also should be
>>
>>- etc
>>
>>
>>
>>In other words, each rule describes
>>
>>- the subset of facts to be analyzed
>>
>>- the rules to be checked against this subset
>>
>>
>>
>>It seems basically like each fact is a set of Account Number, Transaction
>>Type, Transaction Amount information, Secondary Account Number (which
>>sometimes needs to be validated against some other account number within the
>>same data set). But I couldn't find a way to relate between multiple data
>>facts.
>>
>>
>>
>>On one hand a rule engine seems to be a good solution here, but I'm not
>>sure how to deal with 'dynamic selection' of the subset of facts. Can this
>>kind of task be reformulated somehow?
>>
>>
>>
>>Any pointers into the DROOLS documentation or hints on a general approach
>>would be greatly appreciated!
>>
>>
>>
>>Thanks.
>>
>>
>>
>>Vlad
>>
>>_______________________________________________
>>rules-users mailing list
>>rules-users(a)lists.jboss.org
>>https://lists.jboss.org/mailman/listinfo/rules-users
>>
--
Edson Tirelli
Software Engineer - JBoss Rules Core Developer
Office: +55 11 3124-6000
Mobile: +55 11 9218-4151
JBoss, a division of Red Hat @
www.jboss.com