How to improve performance with a large number of rules and facts?
by virajn
Hi all,
In my project I'm loading 18,000 rules (a 3 MB file) to reason over 11,000
facts. I recorded the following durations:
- Rule loading time - 3 min 6 seconds
- Time to create the stateful knowledge session - 5 min 52 seconds
- Fact insertion time - 5 min 31 seconds
- Time to fire all rules - 8 seconds
I want to reduce these times. I'm new to Drools and am using the standard
rule evaluation code (inserting facts with insert() in a loop, creating a
knowledge session, and calling fireAllRules()). Is there a way (e.g. a
batch operation) to improve performance in any of the areas above? Any
direction would be helpful. (My project requirements do not allow using a
stateless session.)
Thanks!
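[Editor's note] A minimal sketch of one common mitigation, assuming the Drools 5.x knowledge API; "rules.drl" and the facts list are placeholders. The expensive steps (compiling the DRL and building the KnowledgeBase) should run once per JVM; sessions created afterwards from the same KnowledgeBase are comparatively cheap:

```java
import java.util.Collections;
import java.util.List;

import org.drools.KnowledgeBase;
import org.drools.KnowledgeBaseFactory;
import org.drools.builder.KnowledgeBuilder;
import org.drools.builder.KnowledgeBuilderFactory;
import org.drools.builder.ResourceType;
import org.drools.io.ResourceFactory;
import org.drools.runtime.StatefulKnowledgeSession;

public class RuleRunner {
    public static void main(String[] args) {
        // Compile the 3 MB DRL file once (this is the slow part).
        KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
        kbuilder.add(ResourceFactory.newClassPathResource("rules.drl"), ResourceType.DRL);
        if (kbuilder.hasErrors()) {
            throw new IllegalStateException(kbuilder.getErrors().toString());
        }

        // Build the KnowledgeBase once and keep it for the lifetime of the JVM.
        KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase();
        kbase.addKnowledgePackages(kbuilder.getKnowledgePackages());

        // Placeholder for the 11,000 facts; replace with real loading code.
        List<Object> myFacts = Collections.emptyList();

        // Only this part should run per evaluation; sessions from an existing
        // KnowledgeBase are cheap compared to building the base itself.
        StatefulKnowledgeSession ksession = kbase.newStatefulKnowledgeSession();
        for (Object fact : myFacts) {
            ksession.insert(fact);
        }
        ksession.fireAllRules();
        ksession.dispose();
    }
}
```

The compiled KnowledgeBase can also be serialized to disk (drools-core ships stream utilities for this) so that later JVM runs skip the multi-minute compile entirely.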
--
View this message in context: http://drools.46999.n3.nabble.com/How-to-improve-performance-with-large-n...
Sent from the Drools: User forum mailing list archive at Nabble.com.
12 years, 9 months
Rule set execution performance and memory consumption issues
by Joe Ammann
Hi all (sorry for the lengthy post; I'm trying to describe a basic pattern)
I have a set of similar rules which work OK, but due to the increasing
number of facts they are starting to create performance headaches. They all
share a basic pattern, and I wanted to ask whether you think my approach
is feasible, or whether I should choose a totally different approach.
The basic pattern is that I have a set of facts of one type, where every
fact "owns" a collection of facts of another type. The association is
not a simple one, rather it has some attributes on its own. The ruleset
is triggered when a new file delivery from a financial data provider
arrives, and we want to check the quality of the new data.
Examples:
FinancialInstrument -> Recommendation/Rating (where the recommendation
has a date, a rating code, etc.)
InstrumentList -> FinancialInstrumentAssignment (where the assignment
has a priority, a validFrom date, etc.)
So we have existing assignments in our database (say 100
InstrumentLists, 20'000 Instruments, 200'000 individual list
assignments), and a new file arrives. The data in this file is converted
and loaded into staging tables, and ends up in special Java objects we
call "SourcingXXX". So an "Instrument" always has a corresponding
"SourcingInstrument" class.
The rules must validate the incoming data and modify (insert, update,
delete) the existing objects. My current basic rule pattern is that for
every such "association", I have 3 or more rules like:
// the mappings that already exist
rule "R1110: MatchExistingAssignment"
when
    sa : SourcingAssignment( )
    a : Assignment( instrumentId == sa.instrumentId,
                    instrumentListId == sa.instrumentListId )
then
    // apply the logic
    retract(a);
    retract(sa);
end

// new mappings must be created (note: no binding inside "not")
rule "R1111: MatchNewAssignment"
when
    sa : SourcingAssignment( )
    not Assignment( instrumentId == sa.instrumentId,
                    instrumentListId == sa.instrumentListId )
    il : InstrumentList( id == sa.instrumentListId )
then
    // create new assignment
    Assignment a = ...
    retract(sa);
end

// missing mappings must be removed;
// executes late, after the rules above have had a chance to fire
rule "R1112: RemoveAssignment"
salience -100
when
    a : Assignment( )
then
    // remove the assignment
    retract(a);
end
I shortened these rules to emphasize the basic pattern (hopefully without
introducing typos). Normally the rules have more conditions than just the
mapping conditions shown here, and typically there is more than one rule
for the "new" case and the "already exists" case.
Typical fact numbers are
- a few 100-1000 of the "owning" object (InstrumentList above)
- a few 10'000 of the "owned" object (Instrument above)
- a few 100'000 of the "assignment" object (Assignment above)
I'm starting to see problematic runtimes (~20 minutes) and memory
consumption (1-2 GB), of course mainly with the "assignment object"
rules, where 100'000 Sourcing objects are matched against 100'000 or so
existing objects. Most of that time is of course spent during insertion of
the facts; the fireAllRules() itself is fast (1-2 minutes).
Is my approach totally flawed? Is there a fundamentally different
approach? I see that some simple cases could be done without Drools in
SQL or Java, but IMHO the more complex cases result in unmanageable
Java/SQL code (we already have one example of a 200 line SQL statement
that nobody dares to touch...)
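[Editor's note] One commonly suggested mitigation, sketched below under the Drools 5.x API with hypothetical names (SourcingAssignment is the poster's class, BATCH_SIZE is an arbitrary tuning knob): insert the Sourcing facts in batches and fire the rules between batches, so that rules like R1110 retract matched pairs early and working memory never holds the full 100'000 x 100'000 join at once.

```java
import java.util.List;

import org.drools.runtime.StatefulKnowledgeSession;

public class BatchedInsert {
    private static final int BATCH_SIZE = 5_000; // arbitrary; tune to taste

    static void run(StatefulKnowledgeSession ksession,
                    List<SourcingAssignment> sourcing) {
        for (int i = 0; i < sourcing.size(); i += BATCH_SIZE) {
            int end = Math.min(i + BATCH_SIZE, sourcing.size());
            for (SourcingAssignment sa : sourcing.subList(i, end)) {
                ksession.insert(sa);
            }
            // Matched Assignment/SourcingAssignment pairs are retracted here,
            // before the next batch grows the join.
            ksession.fireAllRules();
        }
    }
}
```

One caveat with this sketch: salience only orders activations within a single fireAllRules() call, so a cleanup rule like R1112 would fire after every batch and delete assignments that a later batch would have matched. It would need to be gated, e.g. with an agenda group or a "load complete" marker fact in its LHS, so it runs only once all batches are in.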
--
CU, Joe
12 years, 9 months
Jbpm-standalone-designer with postgres
by Naman Shah
Hi,
I need to use the standalone jbpm-designer with PostgreSQL.
I checked the code and read somewhere that I need to use the Repository for
further modification.
Is there any documentation or example on this that I can use for reference?
Thanks
Naman Shah
12 years, 9 months