<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Yup, I also meant 'eventually
      consistent' when I said such inconsistencies should be acceptable.
      Once transactions have been committed, topology changes have been
      handled (state transfer completed) and the cluster has reached a
      steady state, queries should see a consistent index. <br>
      <br>
      On 07/31/2017 11:41 AM, Gustavo Fernandes wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAH8Ud1E_qMNdX1ofPtc7UC3wixyhFxkxQzeJ324KMOWZLy_p6A@mail.gmail.com">
      <div dir="ltr">
        <div>IMO, indexing should be eventually consistent, as this
          offers the best performance.<br>
          <br>
        </div>
        <div>On tx-caches, although Lucene has hooks to be enlisted in a
          transaction [1], some backends (Elasticsearch) don't expose
          this, and Hibernate Search by design doesn't make use of it.
          So currently we must deal with inconsistencies after the fact:
          checking for nulls, mismatched types and so on.<br>
        </div>
        <div><br>
          [1] <a
href="https://lucene.apache.org/core/6_0_1/core/org/apache/lucene/index/TwoPhaseCommit.html"
            moz-do-not-send="true">https://lucene.apache.org/core/6_0_1/core/org/apache/lucene/index/TwoPhaseCommit.html</a><br>
          <br>
          <div class="gmail_extra"><br>
            <div class="gmail_quote">On Fri, Jul 28, 2017 at 1:59 PM,
              Adrian Nistor <span dir="ltr">&lt;<a
                  href="mailto:anistor@redhat.com" target="_blank"
                  moz-do-not-send="true">anistor@redhat.com</a>&gt;</span>
              wrote:<br>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
                0.8ex;border-left:1px solid
                rgb(204,204,204);padding-left:1ex">My feeling was to accept
                such inconsistencies, but maybe<br>
                I'm wrong. I've always regarded indexing as async in
                general, even<br>
                though it behaved as if it were sync in some not-so-rare
                circumstances,<br>
                which probably led people to believe it is expected to be
                sync in general.<br>
                I'm curious what Sanne and Gustavo have in mind.<br>
                <br>
                Please note that updating the index synchronously during
                tx commit was<br>
                always regarded as a performance bottleneck, so it was
                out of the<br>
                question. <br>
              </blockquote>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
                0.8ex;border-left:1px solid
                rgb(204,204,204);padding-left:1ex">And that would not
                always work anyway, it all depends on the<br>
                underlying indexing technology. For example, when using
                Hibernate Search with Elasticsearch<br>
                you have to accept that Elasticsearch indexing is
                always async.<br>
                <br>
                And there might not be an index at all. It's very
                possible that the<br>
                query runs unindexed. In that case it will use
                distributed streams which<br>
                have their own transaction issues.<br>
                <br>
                In the past we had some bugs where a matching entry was
                deleted/evicted<br>
                right before the search results were returned to the
                user, so loading of<br>
                those values failed in a silent way. Those queries
                mistakenly returned<br>
                some unexpected nulls among other valid results. The fix
                was to just<br>
                filter out those nulls. We could enhance that to double
                check that the<br>
                returned entry is indeed of the requested type, to also
                cover the issue<br>
                that you encountered.<br>
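                <br>
                A minimal sketch of such a post-check, for illustration
                only: the post_check helper, the plain dict standing in for
                the cache and the Person/Animal classes are all
                hypothetical, not Infinispan API, and Python is used only
                to keep the example short.<br>
<pre>
```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str


@dataclass
class Animal:
    species: str


def post_check(keys, cache, expected_type):
    """Load the entries for the keys an index query returned and keep
    only those that still exist and still have the requested type."""
    results = []
    for key in keys:
        value = cache.get(key)  # None if deleted/evicted after the index lookup
        if value is None:
            continue  # the existing fix: drop nulls
        if not isinstance(value, expected_type):
            continue  # the proposed extra check: drop mismatched types
        results.append(value)
    return results


# Hypothetical data: the index still lists k1..k3 as Person keys, but k2 was
# removed and k3 was overwritten with an Animal before the values were loaded.
cache = {"k1": Person("Alice"), "k3": Animal("cat")}
stale_index_hits = ["k1", "k2", "k3"]
print(post_check(stale_index_hits, cache, Person))
```
</pre>
                A filter like this only hides stale hits on the read path;
                the index itself remains eventually consistent.<br>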
                <span class="gmail-HOEnZb"><font color="#888888"><br>
                    Adrian<br>
                  </font></span>
                <div class="gmail-HOEnZb">
                  <div class="gmail-h5"><br>
                    On 07/28/2017 01:38 PM, Radim Vansa wrote:<br>
                    &gt; Hi,<br>
                    &gt;<br>
                    &gt; while working on ISPN-7806 I am wondering how
                    queries should work with<br>
                    &gt; transactions. Right now it seems that updates
                    to index are done during<br>
                    &gt; either regular command execution (on originator
                    [A]) or prepare command<br>
                    &gt; on remote nodes [B]. Both of these cause
                    rolled-back transactions to be<br>
                    &gt; seen, so these must be treated as bugs [C].<br>
                    &gt;<br>
                    &gt; If we index the data after committing the
                    transaction, there would be a<br>
                    &gt; time window when we could see the updated
                    entries but the index would<br>
                    &gt; not reflect that. That might be an acceptable
                    limitation if a<br>
                    &gt; query misses some matching entity, but it's
                    also possible that we<br>
                    &gt; retrieve the query result key-set and then
                    (after retrieving full<br>
                    &gt; entities) we return something that does not
                    match the query. One of the<br>
                    &gt; reproducers for ISPN-7806 I've written [1]
                    triggers a situation where<br>
                    &gt; listing all Persons could return Animal
                    (different entity type), so I<br>
                    &gt; think that there's no validity post-check
                    (though these reproducers<br>
                    &gt; don't use transactions).<br>
                    &gt;<br>
                    &gt; Therefore, I wonder if the index should contain
                    only the key; maybe we<br>
                    &gt; should store a unique version and invalidate
                    the query if some of the<br>
                    &gt; entries have changed.<br>
                    &gt;<br>
                    &gt; If we index the data before committing the
                    transaction, a similar<br>
                    &gt; situation could happen: the index will return
                    keys for entities that<br>
                    &gt; will match in the future but the actually
                    returned list will contain<br>
                    &gt; stale entities.<br>
                    &gt;<br>
                    &gt; What's the overall plan? Do we just accept
                    inconsistencies? In that<br>
                    &gt; case, please add a verbose statement in docs
                    and point me to that.<br>
                    &gt;<br>
                    &gt; And if I've misinterpreted something and raised
                    the red flag in error,<br>
                    &gt; please let me know.<br>
                    &gt;<br>
                    &gt; Radim<br>
                    &gt;<br>
                    &gt; [A] This seems to be a regression after moving
                    towards async<br>
                    &gt; interceptors - our impl of<br>
                    &gt; org.hibernate.search.backend.TransactionContext
                    is incorrectly bound to<br>
                    &gt; TransactionManager. Then we seem to be running
                    outside of a transaction and<br>
                    &gt; are happy to index it right away. The thread
                    that executes the<br>
                    &gt; interceptor handler is also dependent on
                    ownership (due to remote<br>
                    &gt; LockCommand execution), so I think that it does
                    not fail the local-mode<br>
                    &gt; tests.<br>
                    &gt;<br>
                    &gt; [B] ... and it does so twice as a regression
                    after ISPN-7840 but that's<br>
                    &gt; easy to fix.<br>
                    &gt;<br>
                    &gt; [C] Indexing in prepare command was OK before
                    ISPN-7840 with pessimistic<br>
                    &gt; locking which does not send the CommitCommand,
                    but now that the QI has<br>
                    &gt; been moved below EWI it means that we're
                    indexing before storing the<br>
                    &gt; actual values. Optimistic locking was not
                    correct, though.<br>
                    &gt;<br>
                    &gt; [1]<br>
                    &gt; <a
href="https://github.com/rvansa/infinispan/commit/1d62c9b84888c7ac21a9811213b5657aa44ff546"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://github.com/rvansa/infinispan/commit/1d62c9b84888c7ac21a9811213b5657aa44ff546</a><br>
                    &gt;<br>
                    &gt;<br>
                    <br>
                  </div>
                </div>
                <div class="gmail-HOEnZb">
                  <div class="gmail-h5">_______________________________________________<br>
                    infinispan-dev mailing list<br>
                    <a href="mailto:infinispan-dev@lists.jboss.org"
                      moz-do-not-send="true">infinispan-dev@lists.jboss.org</a><br>
                    <a
                      href="https://lists.jboss.org/mailman/listinfo/infinispan-dev"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://lists.jboss.org/mailman/listinfo/infinispan-dev</a><br>
                  </div>
                </div>
              </blockquote>
            </div>
            <br>
          </div>
        </div>
      </div>
      <br>
    </blockquote>
    <p><br>
    </p>
  </body>
</html>