[rules-users] ambition = ThreadPoolExecutor delegating to KBPool(s) & KSPools(s)

Stephen Masters stephen.masters at me.com
Thu Feb 7 14:17:32 EST 2013


We all know what they say about premature optimisation, so maybe it would be worth stating your performance goals? That might help establish which direction the optimisations should go, and whether they are even worth attempting.

In a knowledge base with a reasonable number of rules, and especially if a number of facts need to be derived on start-up, the technique Wolfgang mentions with a long-running session is what I have found tends to give me the best performance, i.e. for each incoming request: insert, fire, retract, fire. Obviously it depends on your rules, but sub-millisecond evaluations are quite achievable (~20 microseconds is the fastest I have achieved so far). Which takes me back to the question: what throughput do you need to achieve, and how bursty are you expecting it to be?
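
As a rough illustration of the insert, fire, retract, fire cycle, here is a minimal sketch. The `Session` interface and `ToySession` class are hypothetical stand-ins written for this example, not the Drools API; with Drools 5 the corresponding calls would be `insert`, `fireAllRules` and `retract` on a `StatefulKnowledgeSession`. The toy "rule" simply fires once per newly inserted fact.

```java
import java.util.ArrayList;
import java.util.List;

public class LongRunningSessionSketch {

    /** Hypothetical stand-in for a stateful rule session; NOT the Drools API. */
    interface Session {
        Object insert(Object fact);   // returns a handle used for later retraction
        int fireAllRules();           // returns the number of rules fired
        void retract(Object handle);
    }

    /** Toy session (assumption for illustration): one "rule" fires per new fact. */
    static class ToySession implements Session {
        private final List<Object> workingMemory = new ArrayList<>();
        private int pending = 0;

        public Object insert(Object fact) {
            workingMemory.add(fact);
            pending++;
            return fact;              // the fact doubles as its own handle here
        }

        public int fireAllRules() {
            int fired = pending;
            pending = 0;
            return fired;
        }

        public void retract(Object handle) {
            workingMemory.remove(handle);
        }

        int size() {
            return workingMemory.size();
        }
    }

    /** One request against the long-running session: insert, fire, retract, fire. */
    static int evaluate(Session session, Object request) {
        Object handle = session.insert(request);
        int fired = session.fireAllRules();   // rules react to the request
        session.retract(handle);              // remove the request again
        session.fireAllRules();               // let rules react to the retraction
        return fired;
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();   // created once, reused for every request
        for (String request : new String[] {"trade-1", "trade-2", "trade-3"}) {
            int fired = evaluate(session, request);
            System.out.println(request + " fired " + fired
                    + " rule(s); facts left in session: " + session.size());
        }
    }
}
```

The point of the pattern is that the session, and any facts derived at start-up, survive between requests; only the request fact itself comes and goes.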

Of course, if there is no dependence between evaluations (i.e. you're not accumulating transaction values or anything like that) it's pretty simple to set up a queue of requests and a pool of stateful sessions, so it's not that much wasted effort if you don't need it.
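
A minimal sketch of that queue-plus-pool arrangement, using only `java.util.concurrent`. The `Session` class is a hypothetical placeholder for a stateful rule session (it is not the Drools API), and the pool size and request names are made up for the example. The executor's work queue plays the role of the queue of requests; a `BlockingQueue` holds the idle sessions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SessionPoolSketch {

    /** Hypothetical placeholder for a rule session; NOT the Drools API. */
    static class Session {
        private final int id;

        Session(int id) {
            this.id = id;
        }

        /** Pretend evaluation; a real one would insert, fire, retract, fire. */
        String evaluate(String request) {
            return request + " -> session " + id;
        }
    }

    private final BlockingQueue<Session> sessions;   // idle sessions
    private final ExecutorService executor;          // worker threads + request queue

    SessionPoolSketch(int size) {
        sessions = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            sessions.add(new Session(i));
        }
        executor = Executors.newFixedThreadPool(size);
    }

    /** Queue a request; a worker borrows a session, evaluates, and returns it. */
    Future<String> submit(String request) {
        return executor.submit(() -> {
            Session session = sessions.take();       // blocks until a session is free
            try {
                return session.evaluate(request);
            } finally {
                sessions.put(session);               // always hand the session back
            }
        });
    }

    void shutdown() {
        executor.shutdown();
    }

    /** Convenience helper for the example: run all requests and collect results. */
    static List<String> runAll(int poolSize, List<String> requests) throws Exception {
        SessionPoolSketch pool = new SessionPoolSketch(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String request : requests) {
                futures.add(pool.submit(request));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());           // waits for each evaluation
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        for (String line : runAll(2, Arrays.asList("fact-1", "fact-2", "fact-3", "fact-4"))) {
            System.out.println(line);
        }
    }
}
```

Because each session is used by only one thread at a time, this only works when evaluations are independent of one another, as noted above. Which session handles which request is not deterministic.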

Steve



On 7 Feb 2013, at 18:05, "Cotton, Ben" <Ben.Cotton at morganstanley.com> wrote:

> 
> > Maybe you just need a StatelessKnowledgeSession (you're using this term in a later mail), that, when used in *sequential mode* is faster.
>  
> But the fear we have here is that the backing queue of simultaneously arriving facts could back up if we process in synchronous sequence.  We are willing to trade off an individual fact being processed more slowly, if we can simultaneously process many facts.
> 
> As you point out, the pattern that wins the “sequential v. parallel” bake-off is TBD.
>  
>  
> -----Original Message-----
> From: rules-users-bounces at lists.jboss.org [mailto:rules-users-bounces at lists.jboss.org] On Behalf Of Wolfgang Laun
> Sent: Thursday, February 07, 2013 12:50 PM
> To: Rules Users List
> Subject: Re: [rules-users] ambition = ThreadPoolExecutor delegating to KBPool(s) & KSPools(s)
>  
> KnowledgeSessions are created from a KnowledgeBase, and they are independent from each other, even when they come from a single KnowledgeBase. So I don't see the need for 24 KBases.
>  
> You've referred to "fireAllRules()", which isn't available with a StatelessKnowledgeSession. Maybe you just need a StatelessKnowledgeSession (you're using this term in a later mail), which, when used in *sequential mode*, is faster. Otherwise, disposing a StatelessKnowledgeSession after processing each fact may not be faster than simply retracting the fact from a StatefulKnowledgeSession, which would prepare it for the next task, and not cause everything to go through GC.
>  
> Just benchmarking will tell you what is best for your scenario.
>  
> -W
>  
>  
> On 07/02/2013, Cotton, Ben <Ben.Cotton at morganstanley.com> wrote:
> > Thanks Jeremy.  Just finished watching your referenced video  "Drools
> > & Large Data Sets Workshop" - no doubt about it, people are explicitly
> > using this pattern (Pool,Queue,Delegate,Callback) w/in Drools to
> > achieve higher concurrent "simultaneous fact arrival" transaction throughput, and scale.
> > Especially appreciate your comment re: managing Runnables' callbacks
> > w/in the proposed framework... we indeed have to be careful here, and
> > will re-factor or design accordingly.
> > 
> > From: rules-users-bounces at lists.jboss.org
> > [mailto:rules-users-bounces at lists.jboss.org] On Behalf Of Jeremy Ary
> > Sent: Thursday, February 07, 2013 12:08 PM
> > To: Rules Users List
> > Subject: Re: [rules-users] ambition = ThreadPoolExecutor delegating to
> > KBPool(s) & KSPools(s)
> > 
> > Yep, that all makes sense for the more elaborated context. Sounds like
> > you're working with a model wherein you needn't concern yourself with
> > relational logic between instances, so I think the value of splitting
> > sessions over threads with a multi-consumer queueing setup could allow
> > you the opportunity to async your process with better throughput. What
> > you and I propose differs only in high-availability and scalability of
> > the input stream (potentially arising from throttling to a single
> > instance responsible for maintaining task scheduling and executor
> > lifecycle), offering an ability to recover should you lose your
> > application containing pooled tasks and allow for smaller pool sizes
> > to maintain (pull only as needed/desired from queueing and adjust that
> > capacity on the fly for high-usage times such as first thing in the morning).
> > 
> > Semi-related I just found a video about a large-scale operation that
> > Alexandre Porcelli created that might be of some interest to you.
> > http://vimeo.com/27209589
> > 
> > My only other thought going into it is to consider some different
> > approaches for the scheduling mechanism given that, as I've had the
> > unpleasure of discovering before, callbacks from runnables can be fun
> > to keep up with if you're dependent on them, so fire-and-forget vs.
> > results synchronicity can make a difference in the mechanism you
> > choose to maintain your tasks and pools.
> > 
> > Regards,
> > Jeremy
> > 
> > On Thu, Feb 7, 2013 at 10:47 AM, Cotton, Ben
> > <Ben.Cotton at morganstanley.com> wrote:
> > Thanks for your response, Ary.
> > 
> > It is much more about accommodating high frequency and throughput.
> > The rules are ZERO sensitive to time and order - they are rendered 1x
> > at start of day.  They are exceedingly complicated, and there are lots of them ...
> > but once they are bound to a KB nothing changes about them for the whole
> > day.  When we put a fact on a KS.fireAllRules() task the rendered decision
> > is idempotent wrt the rules' firing order.
> > 
> > Also, all arriving facts are immutable and all sessions are stateless,
> > so we kind of have ignored CEP (seeing it as more appropriate for a
> > long-living ecosystem of continuously mutating facts).
> > 
> > Effectively, we want a "small, simple, safe, speedy" body of
> > operations on "complex, cumbersome, concurrently-arriving, constant" facts.
> > 
> > 
> > 
> > From: rules-users-bounces at lists.jboss.org
> > [mailto:rules-users-bounces at lists.jboss.org]
> > On Behalf Of Jeremy Ary
> > Sent: Thursday, February 07, 2013 11:32 AM
> > To: Rules Users List
> > Subject: Re: [rules-users] ambition = ThreadPoolExecutor delegating to
> > KBPool(s) & KSPools(s)
> > 
> > Are you in a place where your rules have become sensitive to time and order?
> > If so, have you considered CEP? If it's less about that and more about
> > getting the work done ASAP, you could also investigate a messaging
> > integration pattern to assist with all the pooling/throttling/queueing
> > needs you've mentioned.
> > 
> > On Thu, Feb 7, 2013 at 10:04 AM, Cotton, Ben
> > <Ben.Cotton at morganstanley.com> wrote:
> > Let's say that at start-of-day, every day, we generate a giant 2,000+
> > rule .DRL, which we then use to construct a single run-time
> > KnowledgeBase reference.  We then construct a single run-time
> > KnowledgeSession reference (also at start of day).  Throughout the day, all day, facts "arrive"
> > asynchronously into our expert system.  When a fact "arrives", we
> > synchronously place the fact onto our single KS and call
> > .fireAllRules(), which in turn synchronously outputs answers that
> > satisfy our "what's the next step?" decision requirements.
> > 
> > We have this working very well, but we have the ambition to achieve more.
> > 
> > We want  to attempt to scale this solution to accommodate the
> > high-frequency simultaneous "arrival" of many facts.  We have at our
> > disposal a 24-CPU, 128 GB Linux-based compute resource (nice, right?)
> > ... so, ideally, we have the ambition to potentially accommodate the
> > simultaneous arrival of 24 facts into our expert system.
> > 
> > Assuming that all of our 2,000+ rules are completely isolated (i.e. no
> > rule i ever depends on any rule j, for all i,j) we want to consider
> > building (at start of day) a KSPool (size 24) , KBPool (size 24), and a
> > ThreadPoolExecutor (size 24, backed by BlockingQueue).   As facts arrive
> > throughout the day, those that arrive simultaneously are Queue'd to
> > the TPE, that then delegates the fact's need for service to a task
> > Runnable,  which in turn calls a KSPool[i].fireAllRules() (with
> > isolation to KBPool[i]).  In such a scheme, we would potentially be
> > able to render decisions concurrently when facts arrive simultaneously ( capacity 24).
> > 
> > Is this design ambition common w/in current Drools use cases?  Does
> > the current (or future) Drools offering include any in-place
> > capability to Pool KS or Pool KB?  If not, are there any potential Drools concerns or "gotchas"
> > wrt our pursuing this ambition (in a "let's build this now!" prototype)?
> > 
> > As always, tremendous thanks to all in this community forum.
> > 
> > 
> > Ben D Cotton III
> > Morgan Stanley & Co.
> > OTC Derivatives Clearing Technology
> > 1221 AOTA Rockefeller Ctr - Flr 27
> > New York, NY 10020
> > (212)762.9094
> > ben.cotton at ms.com
> > 
> > 
> > 
> > ________________________________
> > 
> > NOTICE: Morgan Stanley is not acting as a municipal advisor and the
> > opinions or views contained herein are not intended to be, and do not
> > constitute, advice within the meaning of Section 975 of the Dodd-Frank
> > Wall Street Reform and Consumer Protection Act. If you have received
> > this communication in error, please destroy all electronic and paper
> > copies and notify the sender immediately. Mistransmission is not
> > intended to waive confidentiality or privilege. Morgan Stanley
> > reserves the right, to the extent permitted under applicable law, to
> > monitor electronic communications. This message is subject to terms available at the following link:
> > http://www.morganstanley.com/disclaimers If you cannot access these
> > links, please notify us by reply message and we will send the contents
> > to you. By messaging with Morgan Stanley you consent to the foregoing.
> > 
> > _______________________________________________
> > rules-users mailing list
> > rules-users at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/rules-users