<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    As Mark said, stateless vs. stateful is not really relevant for
    speed. In fact, a stateless session is a wrapper around a stateful
    session (except in sequential mode). A stateless session only saves
    you from writing the explicit fact insertion (3 or 4 lines of code)
    and from disposing of the session at the end (1 line of code). For
    your problem, I recommend a stateful session, as you will have to
    tune your fact insertion anyway. The only thing not to forget is to
    dispose of the session at the end.<br>
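To illustrate the wrapper idea, here is a minimal, self-contained Java sketch. It does not use the Drools API: FakeStatefulSession and execute are hypothetical stand-ins, and the "rule engine" just counts facts. It only shows the insert / fire / dispose sequence that a stateless call would otherwise hide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class StatelessWrapperSketch {

    /** Hypothetical stand-in for a stateful session; NOT the Drools API. */
    static class FakeStatefulSession {
        final List<Object> facts = new ArrayList<>();
        boolean disposed = false;
        int fired = 0;

        void insert(Object fact) {
            if (disposed) {
                throw new IllegalStateException("session already disposed");
            }
            facts.add(fact);
        }

        int fireAllRules() {
            // A real engine would evaluate rules here; this stub only counts facts.
            fired = facts.size();
            return fired;
        }

        void dispose() {
            // Release the working memory held by this session.
            facts.clear();
            disposed = true;
        }
    }

    /** What a stateless-style call boils down to: insert, fire, dispose. */
    static FakeStatefulSession execute(Collection<?> facts) {
        FakeStatefulSession session = new FakeStatefulSession();
        try {
            for (Object fact : facts) {
                session.insert(fact); // the explicit insertion you would otherwise write
            }
            session.fireAllRules();
        } finally {
            session.dispose();        // the one line that is easy to forget
        }
        return session;
    }

    public static void main(String[] args) {
        FakeStatefulSession s = execute(Arrays.asList("row-1", "row-2", "row-3"));
        System.out.println("fired=" + s.fired + " disposed=" + s.disposed);
    }
}
```

With a stateful session you write the loop yourself, which is where the insertion can be tuned; putting dispose in a finally block keeps the session from leaking if a rule throws.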

    <br>

    Cutting your data into groups means using multiple ksessions
    (i.e. different working memories), whether stateless or stateful.
    And this is the point: several small ksessions are better than one
    very big ksession, especially if your rules have a large number of
    joins. Creating a new ksession is quick as long as you have a shared
    rulebase: every new ksession is created from that rulebase, which is
    compiled once with all your rules. It is compilation that is
    time-consuming, not working memory creation. And with multiple
    ksessions, you can do the job in parallel on a cluster of machines,
    so meeting a deadline becomes easy: just use more machines!<br>
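The approach can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not Drools code: Row, SharedRules, partitionByKey and runSession are hypothetical names, the "rule" is a trivial filter, and threads on one machine stand in for a cluster. Partitioning by key is one assumed way to keep facts that must be joined together in the same chunk.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkedSessionsSketch {

    /** Hypothetical row type; "key" is whatever the join conditions compare. */
    static class Row {
        final String key;
        final int amount;
        Row(String key, int amount) { this.key = key; this.amount = amount; }
    }

    /** Stand-in for the shared rulebase: built once, read-only, reused by every session. */
    static class SharedRules {
        boolean matches(Row r) { return r.amount > 100; }
    }

    /** Group rows by key so that rows which must be joined land in the same chunk. */
    static Map<String, List<Row>> partitionByKey(List<Row> rows) {
        Map<String, List<Row>> chunks = new LinkedHashMap<>();
        for (Row r : rows) {
            chunks.computeIfAbsent(r.key, k -> new ArrayList<>()).add(r);
        }
        return chunks;
    }

    /** One short-lived "session" per chunk, each with its own small working set. */
    static int runSession(SharedRules rules, List<Row> chunk) {
        int matched = 0;
        for (Row r : chunk) {
            if (rules.matches(r)) {
                matched++;
            }
        }
        return matched;
    }

    /** Runs one session per chunk on a thread pool and sums the match counts. */
    static int processInParallel(List<Row> rows, int threads) {
        SharedRules rules = new SharedRules(); // "compiled" once, shared by all sessions
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<Row> chunk : partitionByKey(rows).values()) {
                Callable<Integer> task = () -> runSession(rules, chunk);
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("chunk processing failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("A", 150));
        rows.add(new Row("A", 50));
        rows.add(new Row("B", 300));
        System.out.println("matched=" + processInParallel(rows, 2));
    }
}
```

The split is only safe when no rule needs to see facts from two different chunks; rules that test the global existence or absence of a fact need a partitioning key that keeps all the relevant facts together.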

    <br>

    <br>

    On 20/12/2011 15:19, Zhuo Li wrote:

    <blockquote cite="mid:009201ccbf22$6ba4af90$42ee0eb0$@com"

      type="cite">


      <style>

<!--

 /* Font Definitions */

 @font-face

        {font-family:SimSun;

        panose-1:2 1 6 0 3 1 1 1 1 1;}

@font-face

        {font-family:"Cambria Math";

        panose-1:2 4 5 3 5 4 6 3 2 4;}

@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}


 /* Style Definitions */

 p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0cm;

        margin-bottom:.0001pt;

        text-align:justify;

        text-justify:inter-ideograph;

        font-size:10.5pt;

        font-family:"Calibri","sans-serif";

        color:black;}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {mso-style-priority:99;

        color:purple;

        text-decoration:underline;}

p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph

        {mso-style-priority:34;

        margin:0cm;

        margin-bottom:.0001pt;

        text-align:justify;

        text-justify:inter-ideograph;

        text-indent:21.0pt;

        font-size:10.5pt;

        font-family:"Calibri","sans-serif";

        color:black;}

span.EmailStyle18

        {mso-style-type:personal;

        font-family:"Calibri","sans-serif";

        color:windowtext;}

span.EmailStyle19

        {mso-style-type:personal-reply;

        font-family:"Calibri","sans-serif";

        color:#1F497D;}

.MsoChpDefault

        {mso-style-type:export-only;

        font-size:10.0pt;}

@page WordSection1

        {size:612.0pt 792.0pt;

        margin:72.0pt 90.0pt 72.0pt 90.0pt;}

div.WordSection1

        {page:WordSection1;}

 /* List Definitions */

 @list l0

        {mso-list-id:547034152;

        mso-list-type:hybrid;

        mso-list-template-ids:1018445210 -909993680 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}

@list l0:level1

        {mso-level-tab-stop:none;

        mso-level-number-position:left;

        margin-left:18.0pt;

        text-indent:-18.0pt;}

@list l0:level2

        {mso-level-tab-stop:72.0pt;

        mso-level-number-position:left;

        text-indent:-18.0pt;}

@list l0:level3

        {mso-level-tab-stop:108.0pt;

        mso-level-number-position:left;

        text-indent:-18.0pt;}

@list l0:level4

        {mso-level-tab-stop:144.0pt;

        mso-level-number-position:left;

        text-indent:-18.0pt;}

@list l0:level5

        {mso-level-tab-stop:180.0pt;

        mso-level-number-position:left;

        text-indent:-18.0pt;}

@list l0:level6

        {mso-level-tab-stop:216.0pt;

        mso-level-number-position:left;

        text-indent:-18.0pt;}

@list l0:level7

        {mso-level-tab-stop:252.0pt;

        mso-level-number-position:left;

        text-indent:-18.0pt;}

@list l0:level8

        {mso-level-tab-stop:288.0pt;

        mso-level-number-position:left;

        text-indent:-18.0pt;}

@list l0:level9

        {mso-level-tab-stop:324.0pt;

        mso-level-number-position:left;

        text-indent:-18.0pt;}

ol

        {margin-bottom:0cm;}

ul

        {margin-bottom:0cm;}

-->

</style>

      <div class="WordSection1">

        <p class="MsoNormal"><span style="color:#1F497D" lang="EN-US">Thanks,
            Vincent. Slicing the data is definitely an option, but I
            guess a stateless session is a better fit for this solution?
            A stateful session means you have to keep your slices in
            different working memories, which intuitively seems
            inefficient. <o:p></o:p></span></p>

        <p class="MsoNormal"><span style="color:#1F497D" lang="EN-US"><o:p> </o:p></span></p>

        <p class="MsoNormal"><span style="color:#1F497D" lang="EN-US">Best<o:p></o:p></span></p>

        <p class="MsoNormal"><span style="color:#1F497D" lang="EN-US">Abe<o:p></o:p></span></p>

        <p class="MsoNormal"><span style="color:#1F497D" lang="EN-US"><o:p> </o:p></span></p>

        <div>

          <div style="border:none;border-top:solid #B5C4DF

            1.0pt;padding:3.0pt 0cm 0cm 0cm">

            <p class="MsoNormal" style="text-align:left" align="left"><b><span
style="font-size:10.0pt;font-family:SimSun;color:windowtext">From<span
                    lang="EN-US">:</span></span></b><span
                style="font-size:10.0pt;
                font-family:SimSun;color:windowtext" lang="EN-US">
                <a class="moz-txt-link-abbreviated" href="mailto:rules-users-bounces@lists.jboss.org">rules-users-bounces@lists.jboss.org</a>
                [<a class="moz-txt-link-freetext" href="mailto:rules-users-bounces@lists.jboss.org">mailto:rules-users-bounces@lists.jboss.org</a>] </span><b><span
                  style="font-size:
                  10.0pt;font-family:SimSun;color:windowtext">On Behalf Of </span></b><span
style="font-size:10.0pt;font-family:SimSun;color:windowtext"
                lang="EN-US">Vincent
                Legendre<br>
              </span><b><span
                  style="font-size:10.0pt;font-family:SimSun;color:windowtext">Sent<span lang="EN-US">:</span></span></b><span
                style="font-size:10.0pt;
                font-family:SimSun;color:windowtext" lang="EN-US"> 2011</span><span
                style="font-size:10.0pt;
                font-family:SimSun;color:windowtext">-<span lang="EN-US">12</span>-<span
                  lang="EN-US">20</span><span lang="EN-US"> 22:01<br>
                </span><b>To<span lang="EN-US">:</span></b><span
                  lang="EN-US"> Rules Users List<br>
                </span><b>Subject<span lang="EN-US">:</span></b><span
                  lang="EN-US"> Re:
                  [rules-users] Working memory batch insert performance<o:p></o:p></span></span></p>

          </div>

        </div>

        <p class="MsoNormal" style="text-align:left" align="left"><span

            lang="EN-US"><o:p> </o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US">There is a recent post
            on "poor performance from a simple join" that highlights
            almost the same question: because "insert" triggers RETE
            propagation, the time to insert depends on the complexity of
            the rules.<br>
            <br>
            Maybe you can start by looking at your rules to optimise
            them (see that post for some tips).<br>
            <br>
            If it is still too slow, maybe you can cut your data into
            smaller groups. The main problem here is cutting the data
            into groups that are pertinent to the rules. Problems can
            happen if you have accumulate, exists, or not conditions. If
            you only have simple filters, you can cut your data wherever
            you want. If your rules reason about the global existence or
            absence of a fact, then you must ensure that, for example, a
            "not MyFact()" is true because the fact does not exist at
            all, and not merely because it is not part of the chunk.<br>
            <br>
            <br>
            On 20/12/2011 14:27, Mark Proctor wrote: <o:p></o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US">On 20/12/2011 13:09,

            Zhuo Li wrote: <o:p></o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US">Hi, folks,<o:p></o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US"> <o:p></o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US">I recently ran a
            benchmark on Drools 5.1.2 and noticed that inserting data
            into a stateful session is very time-consuming. It took
            about 30 minutes to insert 10,000 data rows on a JVM with a
            512 MB heap. Hence I have to insert data rows as I receive
            them and keep them in working memory, rather than loading
            them in a batch at a given time. This does not lend itself
            to disaster recovery, and I have two questions to see if
            anybody has any thoughts:<o:p></o:p></span></p>

        <p class="MsoNormal" style="text-align:left" align="left"><span
            style="font-size:12.0pt;font-family:&quot;Times New
            Roman&quot;,&quot;serif&quot;" lang="EN-US">10K rows? Is
            that 10K bean insertions? 30 minutes sounds bad. We know of
            people inserting far more than that much more quickly.<br>
            <br>
            <o:p></o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US"> <o:p></o:p></span></p>

        <p class="MsoListParagraph"
          style="margin-left:18.0pt;text-indent:-18.0pt"><span
            lang="EN-US">Is there a better way to improve the
            performance of data insertion
            into a stateful session?<o:p></o:p></span></p>

        <p class="MsoNormal" style="text-align:left" align="left"><span

            style="font-size:12.0pt;font-family:&quot;Times New

            Roman&quot;,&quot;serif&quot;" lang="EN-US">There is nothing

            faster than "insert".<br>

            If you don't need inference, you can try turning on

            "sequential"

            mode, but in general the performance gain is &lt; 5%.<br>

            <br>

            <o:p></o:p></span></p>

        <p class="MsoListParagraph"
          style="margin-left:18.0pt;text-indent:-18.0pt"><span
            lang="EN-US">I noticed that there is a method called
            BatchExecution() for a
            stateless session. I have not had a chance to test it yet,
            but is this a better way
            to load data in a batch and then run rules?<o:p></o:p></span></p>

        <p class="MsoNormal" style="text-align:left" align="left"><span
            style="font-size:12.0pt;font-family:&quot;Times New
            Roman&quot;,&quot;serif&quot;" lang="EN-US">That is related
            to scripting the engine: it uses command objects to call the
            insert() method, so it is definitely not faster.<br>
            <br>
            <o:p></o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US"> <o:p></o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US">My requirement is to
            load a batch of data once at the end of the day, and then
            run the rules to separate matched data from unmatched data.
            I have a 3-hour processing window to complete this loading
            and matching process, and the data I need to load is about 1
            to 2 million rows. My JVM heap size can be set up to 1024 MB.<o:p></o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US"> <o:p></o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US">Best regards<o:p></o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US">Abe<o:p></o:p></span></p>

        <p class="MsoNormal"><span lang="EN-US"> <o:p></o:p></span></p>

        <p class="MsoNormal" style="text-align:left" align="left"><span

            style="font-size:12.0pt;font-family:&quot;Times New

            Roman&quot;,&quot;serif&quot;" lang="EN-US"><o:p> </o:p></span></p>

      </div>

    </blockquote>

    <br>

  </body>

</html>