<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#ffffff" text="#000000">
    <br>
    Hi Wolfgang,<br>
    <br>
    I finally decided to test different implementations:<br>
    <ul>
      <li>first based on an accumulation function</li>
      <li>second (your suggestion) relying on drools to 1) build all
        SentenceWindows then to 2) locate ManualAnnotations inside those
        Windows</li>
      <li>third (your suggestion as well) relying on drools to 1) build
        only SentenceWindows that might be interesting (containing one
        of the ManualAnnotations I am looking for) then to 2) locate
        ManualAnnotations inside those Windows</li>
    </ul>
    The first implementation performs quite well, but I am stuck on the
    parametrization (I need to define build2windows, build3windows,
    build4windows... functions): 47 ms on 100 sentences, 94 ms on 1000
    sentences.<br>
    The second implementation is of course suboptimal since it creates
    many useless windows: 125 ms on 100 sentences, 14400 ms on 1000
    sentences.<br>
    The third implementation is very versatile and its performance is
    comparable to the accumulator solution: 93 ms on 100 sentences, 125
    ms on 1000 sentences.<br>
    <br>
    So thanks again for your suggestion; it was definitely useful :-).<br>
    <br>
    Regards,<br>
    <br>
    Bruno.<br>
    <br>
    On 19/08/2011 17:25, Wolfgang Laun wrote:
    <blockquote
cite="mid:CANaj1Le17YGBqnoDWK6pzey7psxaJmRS4mFiEZk65Cqq++ssSA@mail.gmail.com"
      type="cite">2011/8/19 Bruno Freudensprung <span dir="ltr">&lt;<a
          moz-do-not-send="true"
          href="mailto:bruno.freudensprung@temis.com">bruno.freudensprung@temis.com</a>&gt;</span><br>
      <div class="gmail_quote">
        <blockquote class="gmail_quote" style="border-left: 1px solid
          rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left:
          1ex;">
          <div bgcolor="#ffffff" text="#000000"> <br>
            I am not sure I understand what you mean by "random order"
            but I guess it has to do with my ArrayList result type.<br>
            What I had in mind is to put all sentences in a TreeSet
            during the "action" method, and finally issue an ArrayList
            result object by iterating over the TreeSet and grouping
            sentences.<br>
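            <br>
            A minimal Java sketch of that grouping step, assuming each
            sentence is reduced to a {start, end} pair of integers. The
            names WindowSketch and buildWindows are hypothetical, and a
            sorted array stands in for the TreeSet to keep the example
            short:<br>
<pre>
```java
import java.util.Arrays;

/** Hypothetical helper: groups sentences into windows of consecutive sentences. */
class WindowSketch {

    /**
     * sentences: each entry is {start, end}, supplied in any order.
     * Returns one {start, end} window per run of `size` consecutive sentences.
     */
    static int[][] buildWindows(int[][] sentences, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be at least 1");
        }
        // Sort a copy by start offset; this plays the role of the TreeSet.
        int[][] sorted = sentences.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[0], b[0]));

        int count = Math.max(0, sorted.length - size + 1);
        int[][] windows = new int[count][];
        for (int i = 0; i < count; i++) {
            // A window runs from the first sentence's start to the last one's end.
            windows[i] = new int[] { sorted[i][0], sorted[i + size - 1][1] };
        }
        return windows;
    }

    public static void main(String[] args) {
        // Sentences arrive unordered, as they would in the "action" method.
        int[][] sentences = { {20, 29}, {0, 9}, {30, 39}, {10, 19} };
        for (int[] w : buildWindows(sentences, 3)) {
            System.out.println(w[0] + ".." + w[1]);
        }
    }
}
```
</pre>
            Because the sort happens once at the end, the order in which
            the sentences are handed over does not matter.<br>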
          </div>
        </blockquote>
        <div><br>
          Heh :) I clean forgot that I had done this sort of thing not
          too long ago.<br>
          &nbsp;</div>
        <blockquote class="gmail_quote" style="border-left: 1px solid
          rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left:
          1ex;">
          <div bgcolor="#ffffff" text="#000000">My first guess was that
            such an accumulator might be faster than a construction of
            windows using rules.<br>
            However I admit your suggestion is very elegant, and I thank
            you for that! I am probably still too imperative-minded...<br>
          </div>
        </blockquote>
        <div><br>
          Well, a procedural solution would be a reasonable alternative
          for this problem.<br>
          <br>
          -W<br>
          &nbsp;</div>
        <blockquote class="gmail_quote" style="border-left: 1px solid
          rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left:
          1ex;">
          <div bgcolor="#ffffff" text="#000000"> <br>
            Regards,<br>
            <br>
            Bruno.<br>
            <br>
            On 19/08/2011 16:05, Wolfgang Laun wrote:
            <div>
              <div class="h5">
                <blockquote type="cite">How would you write
                  "buildwindows", given that its "action" method would
                  be called once for each Sentence, in random order?<br>
                  <br>
                  It's very simple to write a very small set of rules to
                  construct all SentenceWindow facts of size 1 and then
                  to extend them to any desired size, depending on some
                  parameter.<br>
                  1. Given a Sentence and no Window beginning with it,
                  create a Window of length 1.<br>
                  2. Given a Window of size n &lt; desiredSize and given
                  a Sentence immediately following it, extend the Window
                  to one of size n+1.<br>
                  3a. For any Window of desiredSize, inspect it for
                  "closely situated ManualAnnotations".<br>
                  3b. If ManualAnnotations have been associated with
                  their containing Sentences up-front, you just need to
                  find Windows with more than 1 ManualAnnotation, adding
                  them in the RHS of rule 2 above.<br>
                  <br>
                  -W<br>
                  <br>
                  <br>
                  <div class="gmail_quote"> 2011/8/19 Bruno
                    Freudensprung <span dir="ltr">&lt;<a
                        moz-do-not-send="true"
                        href="mailto:bruno.freudensprung@temis.com"
                        target="_blank">bruno.freudensprung@temis.com</a>&gt;</span><br>
                    <blockquote class="gmail_quote" style="border-left:
                      1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt
                      0.8ex; padding-left: 1ex;">
                      <div bgcolor="#ffffff" text="#000000"> <br>
                        Hi Wolfgang,<br>
                        <br>
                        Thanks for your answer.<br>
                        Sentences are not contiguous (there might be
                        some whitespace characters in between), but
                        manual annotations cannot overlap sentences
                        (interpret "overlap" in the sense of the Drools
                        Fusion terminology).<br>
                        If I had an "inside" operator, do you think the
                        following accumulate option could be better? <br>
                        <br>
                        <tt>when<br>
                          &nbsp;&nbsp;&nbsp; <b>$result : ArrayList() from accumulate
                            ( $s: Sentence(), buildwindows($s))</b><br>
                          &nbsp;&nbsp;&nbsp; <b>$w : SentenceWindows () from $result</b><br>
                          &nbsp;&nbsp;&nbsp; a1 : ManualAnnotation (this <b>inside</b> $w)<br>
                          &nbsp;&nbsp;&nbsp; a2 : ManualAnnotation (this != a1, this <b>inside</b> $w)<br>
                          then<br>
                          &nbsp;&nbsp;&nbsp; ... do something with a1 and a2
                          since they are "close" to each other<br>
                          end</tt><br>
                        <br>
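                        To make the intended semantics of "inside"
                        concrete, here is a rough Java sketch. It
                        assumes annotations and windows are both plain
                        {start, end} integer pairs; InsideSketch,
                        inside and countClosePairs are hypothetical
                        names, not part of Drools:<br>
<pre>
```java
/** Hypothetical illustration of an "inside" test between spans and windows. */
class InsideSketch {

    /** True if span {start, end} lies entirely within window {start, end}. */
    static boolean inside(int[] span, int[] window) {
        return window[0] <= span[0] && span[1] <= window[1];
    }

    /** Counts unordered pairs of distinct annotations that share at least one window. */
    static int countClosePairs(int[][] annotations, int[][] windows) {
        int pairs = 0;
        for (int i = 0; i < annotations.length; i++) {
            for (int j = i + 1; j < annotations.length; j++) {
                for (int[] w : windows) {
                    if (inside(annotations[i], w) && inside(annotations[j], w)) {
                        pairs++;
                        break; // count each pair once, even if several windows contain it
                    }
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        int[][] annotations = { {2, 4}, {12, 15}, {50, 55} };
        int[][] windows = { {0, 29}, {10, 39} };
        System.out.println(countClosePairs(annotations, windows));
    }
}
```
</pre>
                        Note that the rule above, as written, would
                        match both (a1, a2) and (a2, a1), once per
                        window containing them, whereas this sketch
                        counts each unordered pair a single time.<br>
                        <br>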
                        Does anyone know something about accumulator
                        parametrization (looking at the source code it
                        does not seem to be possible, though)?<br>
                        Maybe a syntax inspired by operator
                        parametrization would be nice:<br>
                        <br>
                        <tt> &nbsp;&nbsp;&nbsp; $result : ArrayList() from accumulate (
                          $s: Sentence(), <b> buildwindows[3]($s)</b>)<br>
                        </tt><br>
                        Best regards,<br>
                        <br>
                        Bruno.<br>
                        <br>
                        On 19/08/2011 13:55, Wolfgang Laun wrote:
                        <blockquote type="cite">There are some details
                          that one should consider before deciding on a
                          particular implementation technique.<br>
                          <ul>
                            <li>Are all Sentences contiguous, i.e.,
                              s1.end = pred( s2.start )</li>
                            <li>Can a ManualAnnotation start on one
                              Sentence and end in the next or any
                              further successor?</li>
                          </ul>
                          As in all problems where constraints depend on
                          an order between facts, performance is going
                          to be a problem with increasing numbers of
                          Sentences and ManualAnnotations.<br>
                          <br>
                          Your accumulate plan could be a very
                          inefficient approach. Creating O(N*N) pairs
                          and then looking for an overlapping window is
                          much worse than looking at each window, for
                          instance. But it depends on the expected
                          numbers for both.<br>
                          <br>
                          -W<br>
                          <br>
                          <br>
                          <br>
                          <div class="gmail_quote">2011/8/19 Bruno
                            Freudensprung <span dir="ltr">&lt;<a
                                moz-do-not-send="true"
                                href="mailto:bruno.freudensprung@temis.com"
                                target="_blank">bruno.freudensprung@temis.com</a>&gt;</span><br>
                            <blockquote class="gmail_quote"
                              style="border-left: 1px solid rgb(204,
                              204, 204); margin: 0pt 0pt 0pt 0.8ex;
                              padding-left: 1ex;">
                              <div bgcolor="#ffffff" text="#000000">
                                Hello,<br>
                                <br>
                                I am trying to implement rules handling
                                "Sentence" and "ManualAnnotation" objects
                                (imagine someone highlighting words of
                                the document). Basically, "Sentence"
                                objects have "start" and "end" positions
                                (fields) in the text of a document,
                                and they are Comparable according to
                                their location in the document.<br>
                                <br>
                                I need to write rules using the notion
                                of a "window of consecutive sentences". <br>
                                <br>
                                Basically, I am not very interested in
                                those "SentenceWindow" objects themselves;
                                I just need them to define a kind of
                                proximity between "ManualAnnotation" objects.<br>
                                What I eventually need in the "when" of
                                my rule is something like:<br>
                                <br>
                                <tt>when<br>
                                  &nbsp;&nbsp;&nbsp; ... maybe something creating the
                                  windows<br>
                                  &nbsp;&nbsp;&nbsp; a1 : ManualAnnotation ()<br>
                                  &nbsp;&nbsp;&nbsp; a2 : ManualAnnotation (this != a1)<br>
                                  &nbsp;&nbsp;&nbsp; SentenceWindow (this includes a1,
                                  this includes a2)<br>
                                  then<br>
                                  &nbsp;&nbsp;&nbsp; ... do something with a1 and a2
                                  since they are "close" to each other<br>
                                  end</tt><br>
                                <br>
                                As I don't know the "internals" of
                                Drools, I would like to have your
                                opinion about which is the best "idiom":<br>
                                <ul>
                                  <li>create all SentenceWindow objects
                                    and insert them in the working
                                    memory, then write rules against all
                                    the facts (SentenceWindow and
                                    ManualAnnotation)<br>
                                  </li>
                                  <li>implement an accumulator that will
                                    create a list of SentenceWindow
                                    objects </li>
                                </ul>
                                <br>
                                The first option could look like:<br>
                                <br>
                                <code>rule "Create sentence
                                  windows"<br>
                                  &nbsp; &nbsp;when<br>
                                  &nbsp;&nbsp;&nbsp; &nbsp; # find 3 consecutive sentences<br>
                                  &nbsp; &nbsp;&nbsp; &nbsp;s1 : Sentence()<br>
                                  &nbsp; &nbsp;&nbsp; &nbsp;s2 : Sentence(this &gt; s1)<br>
                                  &nbsp; &nbsp;&nbsp; &nbsp;s3 : Sentence(this &gt; s2)<br>
                                  &nbsp; &nbsp;&nbsp; &nbsp;not Sentence(this != s2
                                  &amp;&amp; &gt; s1 &amp;&amp; &lt; s3)<br>
                                  &nbsp; &nbsp;then<br>
                                  &nbsp; &nbsp;&nbsp; &nbsp;SentenceWindow swindow = new
                                  SentenceWindow();<br>
                                  &nbsp; &nbsp;&nbsp; &nbsp;swindow.setStart(s1.getStart());<br>
                                  &nbsp; &nbsp;&nbsp; &nbsp;swindow.setTheend(s3.getEnd());<br>
                                  &nbsp; &nbsp;&nbsp; &nbsp;insert(swindow);<br>
                                  end</code><br>
                                <br>
                                ... Then use the first rule "as is".<br>
                                <br>
                                The accumulator option could look like
                                (I am not really sure the syntax is
                                correct) :<br>
                                <br>
                                <tt>when<br>
                                  &nbsp;&nbsp;&nbsp; <b>$result : ArrayList() from
                                    accumulate ( $s: Sentence(),
                                    buildwindows($s))</b><br>
                                </tt><tt>&nbsp;&nbsp;&nbsp; a1 : ManualAnnotation ()<br>
                                  &nbsp;&nbsp;&nbsp; a2 : ManualAnnotation (this != a1)<br>
                                  &nbsp;&nbsp;&nbsp;<b> SentenceWindows (this includes
                                    a1, this includes a2) </b></tt><b><tt>from
                                    $result</tt></b><br>
                                <tt> then<br>
                                  &nbsp;&nbsp;&nbsp; ... </tt><tt>do something with a1
                                  and a2 since they are "close" to each
                                  other</tt><br>
                                <tt> end</tt><br>
                                <br>
                                Is it possible to decide whether one way
                                is better than the other?<br>
                                <br>
                                And one last question: is it possible to
                                "parametrize" an accumulator (in order
                                to provide the number of sentences that
                                should be put in the windows)?<br>
                                I mean something like:<br>
                                <br>
                                <tt>when<br>
                                  &nbsp;&nbsp;&nbsp; $result : ArrayList() from
                                  accumulate ( $s: Sentence(), <b>buildwindows(3,</b>
                                  $s))<br>
                                </tt><br>
                                <br>
                                Thanks in advance for your insights,<br>
                                <br>
                                Best regards,<br>
                                <font color="#888888"> <br>
                                  Bruno.<br>
                                </font></div>
                              <br>
_______________________________________________<br>
                              rules-users mailing list<br>
                              <a moz-do-not-send="true"
                                href="mailto:rules-users@lists.jboss.org"
                                target="_blank">rules-users@lists.jboss.org</a><br>
                              <a moz-do-not-send="true"
                                href="https://lists.jboss.org/mailman/listinfo/rules-users"
                                target="_blank">https://lists.jboss.org/mailman/listinfo/rules-users</a><br>
                              <br>
                            </blockquote>
                          </div>
                          <br>
                        </blockquote>
                        <br>
                      </div>
                      <br>
                    </blockquote>
                  </div>
                  <br>
                </blockquote>
                <br>
              </div>
            </div>
          </div>
          <br>
        </blockquote>
      </div>
      <br>
    </blockquote>
    &nbsp;&nbsp;&nbsp; <br>
  </body>
</html>