[infinispan-dev] Transactional consistency of query

Radim Vansa rvansa at redhat.com
Mon Jul 31 06:27:56 EDT 2017


On 07/31/2017 11:12 AM, Tristan Tarrant wrote:
> Shouldn't we use an appropriate conflict resolution strategy for this so
> that in case of partitions we repair the index ?

This is not about eventual consistency in case of partitions, just 
eventually publishing the change in the index after the transaction 
completes.
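
To illustrate what "publishing the change after the transaction completes"
could look like, here is a minimal sketch. ToyIndex and ToyTransaction are
hypothetical stand-ins, not Infinispan or Hibernate Search types, and the
commit path is synchronous only to keep the example short; a real
implementation would more likely hand the buffered work to an async backend.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index; not an Infinispan or Hibernate Search type.
final class ToyIndex {
    private final Map<String, String> docs = new HashMap<>();

    void put(String key, String value) { docs.put(key, value); }

    boolean contains(String key) { return docs.containsKey(key); }
}

// Hypothetical transaction that buffers index work and publishes it only on commit.
final class ToyTransaction {
    private final ToyIndex index;
    private final List<Runnable> pendingIndexWork = new ArrayList<>();
    private boolean completed;

    ToyTransaction(ToyIndex index) { this.index = index; }

    // Record the index update but do not apply it yet.
    void write(String key, String value) {
        if (completed) {
            throw new IllegalStateException("transaction already completed");
        }
        pendingIndexWork.add(() -> index.put(key, value));
    }

    // Only a committed transaction ever reaches the index.
    void commit() {
        completed = true;
        pendingIndexWork.forEach(Runnable::run);
        pendingIndexWork.clear();
    }

    // A rolled-back transaction drops its buffered index work.
    void rollback() {
        completed = true;
        pendingIndexWork.clear();
    }
}

final class DeferredIndexingDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();

        ToyTransaction rolledBack = new ToyTransaction(index);
        rolledBack.write("k1", "v1");
        rolledBack.rollback();

        ToyTransaction committed = new ToyTransaction(index);
        committed.write("k2", "v2");
        committed.commit();

        System.out.println("k1 indexed: " + index.contains("k1"));
        System.out.println("k2 indexed: " + index.contains("k2"));
    }
}
```

With this shape a rolled-back transaction is never visible in the index, at
the price of a window in which committed data is not yet indexed.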

Making the index consistent after a split brain (even with the DENY_ALL
policy, some operations may end up in a half-complete state) is a completely
different issue, and I think nobody has ever tried to deal with that.

R.

>
> Tristan
>
> On 7/31/17 10:41 AM, Gustavo Fernandes wrote:
>> IMO, indexing should be eventually consistent, as this offers the best
>> performance.
>>
>> On tx-caches, although Lucene has hooks to be enlisted in a transaction
>> [1], some backends (Elasticsearch) don't
>> expose this, and Hibernate Search by design doesn't make use of it. So
>> currently we must deal with inconsistencies
>> after the fact: checking for nulls, mismatched types and so on.
>>
>> [1]
>> https://lucene.apache.org/core/6_0_1/core/org/apache/lucene/index/TwoPhaseCommit.html
>>
>>
>> On Fri, Jul 28, 2017 at 1:59 PM, Adrian Nistor <anistor at redhat.com> wrote:
>>
>>      My feeling regarding this was to accept such inconsistencies, but maybe
>>      I'm wrong. I've always regarded indexing as being async in general, even
>>      though it did behave as if being sync in some not so rare circumstances,
>>      which probably made people believe it is expected to be sync in general.
>>      I'm curious what Sanne and Gustavo have in mind.
>>
>>      Please note that updating the index synchronously during tx commit was
>>      always regarded as a performance bottleneck, so it was out of the
>>      question.
>>
>>      And that would not always work anyway; it all depends on the
>>      underlying indexing technology. For example, when using HS with
>>      Elasticsearch you have to accept that Elasticsearch indexing is
>>      always async.
>>
>>      And there might not be an index at all. It's very possible that the
>>      query runs unindexed. In that case it will use distributed streams which
>>      have their own transaction issues.
>>
>>      In the past we had some bugs where a matching entry was deleted/evicted
>>      right before the search results were returned to the user, so loading of
>>      those values failed in a silent way. Those queries mistakenly returned
>>      some unexpected nulls among other valid results. The fix was to just
>>      filter out those nulls. We could enhance that to double check that the
>>      returned entry is indeed of the requested type, to also cover the issue
>>      that you encountered.
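
A minimal sketch of the post-check described in the quoted paragraph above:
after the index returns keys, load the entries and drop both nulls (entry
deleted or evicted in the meantime) and values of a different type. The
names ResultPostCheck, loadMatching, Person and Animal are hypothetical and
a plain Map stands in for the cache; this is not how the query module is
actually implemented.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical entity types, named after the reproducer discussed in the thread.
final class Person {
    final String name;
    Person(String name) { this.name = name; }
}

final class Animal {
    final String name;
    Animal(String name) { this.name = name; }
}

final class ResultPostCheck {
    // Loads the values for the keys returned by the index and keeps only
    // those that still exist and are of the requested type.
    static <T> List<T> loadMatching(List<String> keys, Map<String, Object> cache, Class<T> type) {
        return keys.stream()
                .map(cache::get)          // may yield null if the entry is gone
                .filter(type::isInstance) // false for null and for other types
                .map(type::cast)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Object> cache = new HashMap<>();
        cache.put("p1", new Person("Ann"));
        cache.put("a1", new Animal("Rex")); // stale index entry points at the wrong type

        // "gone" simulates a key whose entry was removed after the index lookup.
        List<Person> persons = loadMatching(Arrays.asList("p1", "a1", "gone"), cache, Person.class);
        System.out.println("matching persons: " + persons.size());
    }
}
```

Note that this only checks the type; an entry of the right type whose fields
no longer match the query predicate would still pass.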
>>
>>      Adrian
>>
>>      On 07/28/2017 01:38 PM, Radim Vansa wrote:
>>       > Hi,
>>       >
>>       > while working on ISPN-7806 I am wondering how should queries work
>>       > with transactions. Right now it seems that updates to index are
>>       > done during either regular command execution (on originator [A])
>>       > or prepare command on remote nodes [B]. Both of these cause
>>       > rolled-back transactions to be seen, so these must be treated as
>>       > bugs [C].
>>       >
>>       > If we index the data after committing the transaction, there would
>>       > be a time window when we could see the updated entries but the
>>       > index would not reflect that. That might be an acceptable
>>       > limitation if a query-matching misses some entity, but it's also
>>       > possible that we retrieve the query result key-set and then (after
>>       > retrieving full entities) we return something that does not match
>>       > the query. One of the reproducers for ISPN-7806 I've written [1]
>>       > triggers a situation where listing all Persons could return Animal
>>       > (different entity type), so I think that there's no validity
>>       > post-check (though these reproducers don't use transactions).
>>       >
>>       > Therefore, I wonder if the index should contain only the key;
>>       > maybe we should store a unique version and invalidate the query if
>>       > some of the entries have changed.
>>       >
>>       > If we index the data before committing the transaction, similar
>>       > situation could happen: the index will return keys for entities that
>>       > will match in the future but the actually returned list will contain
>>       > stale entities.
>>       >
>>       > What's the overall plan? Do we just accept inconsistencies? In that
>>       > case, please add a verbose statement in docs and point me to that.
>>       >
>>       > And if I've misinterpreted something and raised the red flag in
>>       > error, please let me know.
>>       >
>>       > Radim
>>       >
>>       > [A] This seems to be a regression after moving towards async
>>       > interceptors - our impl of
>>       > org.hibernate.search.backend.TransactionContext is incorrectly
>>       > bound to TransactionManager. Then we seem to be running outside of
>>       > a transaction and are happy to index it right away. The thread
>>       > that executes the interceptor handler is also dependent on
>>       > ownership (due to remote LockCommand execution), so I think that
>>       > it does not fail the local-mode tests.
>>       >
>>       > [B] ... and it does so twice as a regression after ISPN-7840 but
>>       > that's easy to fix.
>>       >
>>       > [C] Indexing in prepare command was OK before ISPN-7840 with
>>       > pessimistic locking which does not send the CommitCommand, but now
>>       > that the QI has been moved below EWI it means that we're indexing
>>       > before storing the actual values. Optimistic locking was not
>>       > correct, though.
>>       >
>>       > [1]
>>       > https://github.com/rvansa/infinispan/commit/1d62c9b84888c7ac21a9811213b5657aa44ff546
>>       >
>>       >
>>
>>      _______________________________________________
>>      infinispan-dev mailing list
>>      infinispan-dev at lists.jboss.org
>>      https://lists.jboss.org/mailman/listinfo/infinispan-dev


-- 
Radim Vansa <rvansa at redhat.com>
JBoss Performance Team


