<div dir="ltr"><div>IMO, indexing should be eventually consistent, as this offers the best performance.<br><br></div><div>On tx-caches, although Lucene has hooks to be enlisted in a transaction [1], some backends (elasticsearch) don&#39;t<br>expose this, and Hibernate Search by design doesn&#39;t make use of it. So currently we must deal with inconsistencies <br></div><div>after the fact: checking for nulls, mismatched types and so on.<br></div><div><br>[1] <a href="https://lucene.apache.org/core/6_0_1/core/org/apache/lucene/index/TwoPhaseCommit.html">https://lucene.apache.org/core/6_0_1/core/org/apache/lucene/index/TwoPhaseCommit.html</a><br><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jul 28, 2017 at 1:59 PM, Adrian Nistor <span dir="ltr">&lt;<a href="mailto:anistor@redhat.com" target="_blank">anistor@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">My feeling regarding this was to accept such inconsistencies, but maybe<br>

I&#39;m wrong. I&#39;ve always regarded indexing as async in general, even<br>

though it behaved synchronously in some not-so-rare circumstances,<br>

which probably led people to believe it is expected to be sync in general.<br>

I&#39;m curious what Sanne and Gustavo have in mind.<br>

<br>

Please note that updating the index synchronously during tx commit was<br>

always regarded as a performance bottleneck, so it was out of the<br>

question. <br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">And that would not always work anyway, it all depends on the<br>

underlying indexing technology. For example, when using HS with<br>

Elasticsearch you have to accept that Elasticsearch indexing is always async.<br>

<br>

And there might not be an index at all. It&#39;s very possible that the<br>

query runs unindexed. In that case it will use distributed streams which<br>

have their own transaction issues.<br>

<br>

In the past we had some bugs where a matching entry was deleted/evicted<br>

right before the search results were returned to the user, so loading of<br>

those values failed in a silent way. Those queries mistakenly returned<br>

some unexpected nulls among other valid results. The fix was to just<br>

filter out those nulls. We could enhance that to double check that the<br>

returned entry is indeed of the requested type, to also cover the issue<br>

that you encountered.<br>
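<br>
A minimal sketch of that post-check, assuming the values have already been loaded for the keys returned by the index. The names ResultPostFilter and filterByType are hypothetical and not part of Infinispan or Hibernate Search. The check only covers nulls and type mismatches; it does not re-evaluate the query predicates against the loaded value.<br>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not an Infinispan API.
final class ResultPostFilter {

    private ResultPostFilter() {
    }

    // Keeps only the loaded values that are non-null and of the requested type.
    // A null stands for an entry removed or evicted after the index lookup;
    // a value of another type stands for an entry replaced after the lookup.
    static <T> List<T> filterByType(List<?> loaded, Class<T> requested) {
        List<T> result = new ArrayList<>(loaded.size());
        for (Object value : loaded) {
            // Class.isInstance returns false for null, so one check covers both cases.
            if (requested.isInstance(value)) {
                result.add(requested.cast(value));
            }
        }
        return result;
    }
}
```

Calling ResultPostFilter.filterByType(loadedValues, Person.class) would drop both the nulls and any Animal that replaced a Person between the index lookup and the load. Results can still be stale, which fits the eventually consistent model discussed here.<br>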

<span class="gmail-HOEnZb"><font color="#888888"><br>

Adrian<br>

</font></span><div class="gmail-HOEnZb"><div class="gmail-h5"><br>

On 07/28/2017 01:38 PM, Radim Vansa wrote:<br>

&gt; Hi,<br>

&gt;<br>

&gt; while working on ISPN-7806 I am wondering how queries should work with<br>

&gt; transactions. Right now it seems that updates to the index are done during<br>

&gt; either regular command execution (on originator [A]) or prepare command<br>

&gt; on remote nodes [B]. Both of these cause rolled-back transactions to be<br>

&gt; seen, so these must be treated as bugs [C].<br>

&gt;<br>

&gt; If we index the data after committing the transaction, there would be a<br>

&gt; time window when we could see the updated entries but the index would<br>

&gt; not reflect that. That might be an acceptable limitation if a<br>

&gt; query-matching misses some entity, but it&#39;s also possible that we<br>

&gt; retrieve the query result key-set and then (after retrieving full<br>

&gt; entities) we return something that does not match the query. One of the<br>

&gt; reproducers for ISPN-7806 I&#39;ve written [1] triggers a situation where<br>

&gt; listing all Persons could return Animal (different entity type), so I<br>

&gt; think that there&#39;s no validity post-check (though these reproducers<br>

&gt; don&#39;t use transactions).<br>

&gt;<br>

&gt; Therefore, I wonder if the index should contain only the key; maybe we<br>

&gt; should store a unique version and invalidate the query if some of the<br>

&gt; entries have changed.<br>

&gt;<br>

&gt; If we index the data before committing the transaction, a similar<br>

&gt; situation could happen: the index will return keys for entities that<br>

&gt; will match in the future but the actually returned list will contain<br>

&gt; stale entities.<br>

&gt;<br>

&gt; What&#39;s the overall plan? Do we just accept inconsistencies? In that<br>

&gt; case, please add a verbose statement in docs and point me to that.<br>

&gt;<br>

&gt; And if I&#39;ve misinterpreted something and raised the red flag in error,<br>

&gt; please let me know.<br>

&gt;<br>

&gt; Radim<br>

&gt;<br>

&gt; [A] This seems to be a regression after moving towards async<br>

&gt; interceptors - our impl of<br>

&gt; org.hibernate.search.backend.<wbr>TransactionContext is incorrectly bound to<br>

&gt; TransactionManager. Then we seem to be running outside of a transaction and<br>

&gt; are happy to index it right away. The thread that executes the<br>

&gt; interceptor handler is also dependent on ownership (due to remote<br>

&gt; LockCommand execution), so I think that it does not fail the local-mode<br>

&gt; tests.<br>

&gt;<br>

&gt; [B] ... and it does so twice as a regression after ISPN-7840 but that&#39;s<br>

&gt; easy to fix.<br>

&gt;<br>

&gt; [C] Indexing in prepare command was OK before ISPN-7840 with pessimistic<br>

&gt; locking which does not send the CommitCommand, but now that the QI has<br>

&gt; been moved below EWI it means that we&#39;re indexing before storing the<br>

&gt; actual values. Optimistic locking was not correct, though.<br>

&gt;<br>

&gt; [1]<br>

&gt; <a href="https://github.com/rvansa/infinispan/commit/1d62c9b84888c7ac21a9811213b5657aa44ff546" rel="noreferrer" target="_blank">https://github.com/rvansa/<wbr>infinispan/commit/<wbr>1d62c9b84888c7ac21a9811213b565<wbr>7aa44ff546</a><br>

&gt;<br>

&gt;<br>

<br>

</div></div><div class="gmail-HOEnZb"><div class="gmail-h5">

</div></div></blockquote></div><br></div></div></div>