[rules-users] architectural question: accessing large dataset from rdbms?

Mark Proctor mproctor at codehaus.org
Tue Aug 7 18:12:44 EDT 2007


I've had ideas on how to do this, but no timelines or concrete plans. 
The basic idea is that as rules are evaluated, the engine pulls in data 
from the database based on the constraints - this is basically about 
building special join nodes. A caching system caches by data partition 
segment, with a time-based eviction policy, to avoid continuous db hits. 
This way people could declare their rules but never need to insert 
data; they would just assign it a db source. Patches welcome :)
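For what it's worth, the caching side of that idea could be sketched as below. This is only a hypothetical illustration, not anything in Drools today: `PartitionCache` and its `loader` function are invented names, and the loader stands in for whatever JDBC query would fetch one partition segment.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: a partition-keyed cache with time-based eviction.
// On a miss (or an expired entry) the loader pulls that partition from
// the database; otherwise the cached copy is returned, avoiding db hits.
class PartitionCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAt;
        Entry(V value, long loadedAt) { this.value = value; this.loadedAt = loadedAt; }
    }

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // stand-in for a per-partition db query
    private final long ttlMillis;          // eviction age for cached segments

    PartitionCache(Function<K, V> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    V get(K partitionKey) {
        long now = System.currentTimeMillis();
        Entry<V> e = cache.get(partitionKey);
        if (e == null || now - e.loadedAt > ttlMillis) {
            // miss or expired: hit the db and refresh the segment
            V v = loader.apply(partitionKey);
            cache.put(partitionKey, new Entry<>(v, now));
            return v;
        }
        return e.value;
    }
}
```

A special join node could then consult such a cache instead of working memory, loading only the partitions its constraints actually touch.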

Mark
Scott Finnie wrote:
> Hi,
>
> As a drools newbie, I'm grappling with the above question; any help 
> much appreciated.  To elaborate:
>
>  - We have a largish dataset (~50GB+) stored in an rdbms (oracle).
>  - we're considering using drools to implement business rules (e.g. 
> for data validation constraints and derivations).
>  - the issue is how to give the rulebase efficient & scalable access 
> to the db.  It would potentially need to access the whole dataset, 
> since business rules can potentially affect all tables.
>
> We've done an artificial pilot with a much smaller dataset, simply by 
> syncing the entire db into the rulebase.  Users like it because they 
> can read the rules directly (using a DSL).  Before we go any further, 
> however, we need to find a scaling strategy.
>
> We were thinking about some kind of caching strategy: conceptually, a 
> cache miss in the rulebase would cause data to be loaded from the 
> db.  However we've no idea if that's a practical option, or if 
> there's something better.
>
> Hope that makes sense; any help much appreciated.  Oh, and btw, thanks 
> for a great piece of software!
>
>  - Scott.
> _______________________________________________
> rules-users mailing list
> rules-users at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/rules-users
>