A few general questions on scaling StatefulKnowledgeSessions

Thursday, 16 August 2012

Greetings All!

I humbly request your guidance and insights!

Overview:
We are currently evaluating how best to use the Drools suite to meet our
current and future business needs with the highest possible system
scalability and performance. We are trying to make the right system design
choices, particularly with respect to which of the two KnowledgeSession
types (stateful or stateless) to use and how to use them to scale the
system.

The Context: We are using Drools for operational decision making,
monitoring, and workforce resource management; this naturally entails some
degree of event processing, temporal reasoning, state management, and
inference. 
Given the nature of this context, it seems that a StatefulKnowledgeSession
is justified and best (but may not be entirely necessary). 

The current approach: 
Currently, our rule model is not very mature or stable... Consequently, the
approach is to use one very large long-running StatefulKnowledgeSession
containing all relevant operational data. This single
StatefulKnowledgeSession will be constructed and disposed of (and
reconstructed with operational state) at a very infrequent interval, say
every 24 hours. In this fashion, a single working memory network manages the
entire operational state and holds all relevant facts; each fact is updated
on a per-event basis.

The problem:
This approach has many drawbacks in my opinion... I'll mention just a few...
A StatefulKnowledgeSession is effectively single-threaded (events must be
processed sequentially) and consequently will not scale; it is also a single
point of failure. Also, as the size of Working Memory grows, processing time
increases and garbage collection pauses (when they occur) become long and
disruptive.

The potential solution:
To enable greater scalability, differentiation, and parallelization, it
seems wise to partition the rules/facts into multiple specialized
concurrently operating StatefulKnowledgeSession (KnowledgeBase) instances.
However, if done improperly, this poses a difficult problem: the greater the
separation, the more myopic our reasoning becomes.
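
To make the partitioning idea a bit more concrete, here is a rough sketch
only, not a tested design. It assumes events carry some partition key (for
example a worker or site identifier, which is my assumption) and that each
partition is served by exactly one thread, so every session still sees
strictly sequential processing. PartitionSession and PartitionedRouter are
hypothetical names invented for illustration; they are not Drools classes. In
a real system each PartitionSession would presumably wrap its own
StatefulKnowledgeSession.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for one rule session (not a Drools type). */
interface PartitionSession {
    void handle(Object event);
}

/** Routes each event to exactly one partition, chosen by a stable hash of its key. */
final class PartitionedRouter {
    private final List<PartitionSession> sessions;
    private final List<ExecutorService> workers = new ArrayList<>();

    PartitionedRouter(List<PartitionSession> sessions) {
        this.sessions = new ArrayList<>(sessions);
        // One single-threaded executor per partition: events for the same
        // partition are handled one at a time, in submission order.
        for (int i = 0; i < this.sessions.size(); i++) {
            workers.add(Executors.newSingleThreadExecutor());
        }
    }

    /** The same key always maps to the same partition index. */
    int partitionFor(Object key) {
        return Math.floorMod(key.hashCode(), sessions.size());
    }

    /** Queues the event on the worker that owns the key's partition. */
    CompletableFuture<Void> submit(Object key, Object event) {
        int p = partitionFor(key);
        return CompletableFuture.runAsync(() -> sessions.get(p).handle(event), workers.get(p));
    }

    void shutdown() {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
    }

    public static void main(String[] args) {
        List<PartitionSession> sessions = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int id = i;
            sessions.add(event -> System.out.println("partition " + id + " got " + event));
        }
        PartitionedRouter router = new PartitionedRouter(sessions);
        // Wait for each event so the demo exits cleanly after shutdown.
        router.submit("worker-17", "shift started").join();
        router.submit("worker-42", "shift started").join();
        router.submit("worker-17", "break requested").join();
        router.shutdown();
    }
}
```

The catch is exactly the myopia concern above: a rule that needs facts from
two different keys only works if both keys land in the same partition, so the
choice of key effectively decides which inferences remain possible.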

Questions (some of which are intentionally dumb):

Given the nature of the above-mentioned business context, does a stateful
approach seem justified (or is it advised to follow KISS and remain
stateless)? 

What are the recommended strategies to best scale StatefulKnowledgeSessions
(or a set of StatefulKnowledgeSessions) as they are inherently
single-threaded? 

If multiple StatefulKnowledgeSessions/KnowledgeBases are used, what are the
recommended strategies to partition/individualize/classify them? Should we
do this according to type, instance value, unique identifier, group of
interrelated objects, etc? I understand this is very domain/use-case
specific, but I'm curious how others approach this matter. 

What are the recommended strategies with respect to frequency (or triggers)
of disposal of a StatefulKnowledgeSession and/or the retraction of the facts
therein?

What would you say a healthy average working-memory size is?

What would you say the average lifespan/duration of a
StatefulKnowledgeSession is? 

If we have an external data store (which holds state) from which we can
query and reconstruct working-memory state for any given object (or set of
objects), would it be best to continue with the single large
StatefulKnowledgeSession approach? 

Also, if we can always reconstruct state, is there any material difference
(capability-wise) between a Stateful and Stateless KnowledgeSession besides
inference/iterative decision making? Is it possible to do temporal
reasoning/CEP with a StatelessKnowledgeSession (I think not)?

I entirely understand that most of this is very context specific, and that
no one else can solve this on my behalf. Some of these questions may be
very obtuse... However, I'd sincerely appreciate any insights from this
righteously authoritative community. 

With Humility and Gratitude, 
Skiddlebop

--
View this message in context:
http://drools.46999.n3.nabble.com/A-few-general-questions-on-scaling-Stat...
Sent from the Drools: User forum mailing list archive at Nabble.com.
