TL;DR
I'm afraid that, if we want to do this cleanly, it will require additions or changes to the following SPIs:
- org.hibernate.search.backend.spi.BackendQueueProcessor (in particular BackendQueueProcessor.applyWork(List<LuceneWork>, IndexingMonitor))
- org.hibernate.search.indexes.spi.IndexManager (in particular IndexManager.performOperations(List<LuceneWork>, IndexingMonitor))
The reason is that IndexManager implementations have no access to session-scoped settings, which is to be expected since sessions are an ORM concept. So we ought to pass those settings as method arguments. If we really need to get this done quickly, we could instead store the session-scoped settings in a thread local (held by a Service, for instance), but this solution seems less elegant to me: of course we can pass any argument through a thread local instead of the stack, but then the method signatures become less and less informative.
So my questions are:
- am I completely wrong in my assessment?
- do we need this now or can it be pushed to 6.0?
- if we must do it now, is the threadlocal solution acceptable?
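To make the trade-off between the two options concrete, here is a minimal Java sketch. Every name in it (`CommitSync`, `ExplicitIndexManager`, `ImplicitIndexManager`, `SessionSettingsHolder`) is hypothetical, works are modeled as plain strings rather than `LuceneWork`, and none of this is existing Hibernate Search API.

```java
import java.util.List;

final class SessionSettingsSketch {

    /** Hypothetical session-scoped setting; not an existing Hibernate Search type. */
    enum CommitSync { TRANSMISSION_ONLY, FULL }

    /** Option 1: the setting is an explicit method argument, visible in the signature. */
    interface ExplicitIndexManager {
        String performOperations(List<String> works, CommitSync sync);
    }

    /** Option 2: the signature is unchanged and the setting travels through a thread local. */
    interface ImplicitIndexManager {
        String performOperations(List<String> works);
    }

    /** Hypothetical holder, standing in for what a Service could expose. */
    static final class SessionSettingsHolder {
        private static final ThreadLocal<CommitSync> CURRENT =
                ThreadLocal.withInitial(() -> CommitSync.TRANSMISSION_ONLY);

        static void set(CommitSync sync) { CURRENT.set(sync); }
        static CommitSync get() { return CURRENT.get(); }
        static void clear() { CURRENT.remove(); }
    }

    /** Stand-in for the real work: just reports what it was asked to do. */
    static String describe(List<String> works, CommitSync sync) {
        return works.size() + " work(s), " + sync;
    }

    static String viaArgument(List<String> works, CommitSync sync) {
        ExplicitIndexManager manager = SessionSettingsSketch::describe;
        return manager.performOperations(works, sync);
    }

    static String viaThreadLocal(List<String> works, CommitSync sync) {
        ImplicitIndexManager manager = w -> describe(w, SessionSettingsHolder.get());
        SessionSettingsHolder.set(sync);
        try {
            return manager.performOperations(works);
        } finally {
            // Forgetting this would leak the setting to the next user of the thread.
            SessionSettingsHolder.clear();
        }
    }
}
```

Both variants produce the same result, but only the first one tells the reader of the signature that the behavior depends on a per-session setting; the second also relies on the caller and the callee running on the same thread.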
Full explanation
We have two concepts at play here:
- Whether operations are transmitted to the backend synchronously. If they are, and if a power failure (which does not affect the backend) happens just after returning from the request processing, we know the operations will be executed anyway.
- Whether operations are executed synchronously. If they are, and we query the index just after returning from the request processing, we know the results will be up-to-date (the effects of our operations will be visible).
We implement, here and there, three modes of execution:
- Asynchronous, i.e. transmission-asynchronous and execution-asynchronous
- Transmission-synchronous only, i.e. transmission-synchronous and execution-asynchronous
- Synchronous, i.e. transmission-synchronous and execution-synchronous
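As a rough illustration, the three modes are the meaningful combinations of the two flags described above. The enum below is only a sketch; `ExecutionMode` and its accessors are names I made up for this comment, not existing types.

```java
enum ExecutionMode {
    ASYNCHRONOUS(false, false),
    TRANSMISSION_SYNCHRONOUS_ONLY(true, false),
    SYNCHRONOUS(true, true);
    // The fourth combination (execution-synchronous without being
    // transmission-synchronous) is left out: work cannot have been
    // executed by the backend before it was transmitted to it.

    private final boolean transmissionSynchronous;
    private final boolean executionSynchronous;

    ExecutionMode(boolean transmissionSynchronous, boolean executionSynchronous) {
        this.transmissionSynchronous = transmissionSynchronous;
        this.executionSynchronous = executionSynchronous;
    }

    /** True if the backend has received the operations when the call returns. */
    boolean isTransmissionSynchronous() {
        return transmissionSynchronous;
    }

    /** True if the effects of the operations are visible to queries when the call returns. */
    boolean isExecutionSynchronous() {
        return executionSynchronous;
    }
}
```

With this vocabulary, the feature discussed here amounts to choosing between `TRANSMISSION_SYNCHRONOUS_ONLY` and `SYNCHRONOUS` per session.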
We want to offer the possibility for Search-orm users to switch the post-commit behavior between transmission-synchronous only and synchronous with a session-scoped setting.
When a transaction ends, the message is propagated this way:
- org.hibernate.search.backend.impl.PostTransactionWorkQueueSynchronization.afterCompletion(int) is called
- which calls org.hibernate.search.backend.impl.BatchedQueueingProcessor.performWorks(WorkQueue)
- which calls org.hibernate.search.backend.impl.WorkQueuePerIndexSplitter.commitOperations(IndexingMonitor)
- which calls org.hibernate.search.backend.spi.BackendQueueProcessor.applyWork(List<LuceneWork>, IndexingMonitor)
That last method is where the JMS/local distinction is made. Leaving the JMS processor aside, when works are triggered locally, the processor calls org.hibernate.search.indexes.spi.IndexManager.performOperations(List<LuceneWork>, IndexingMonitor), whose implementation may be synchronous, transmission-synchronous only or asynchronous depending on global configuration and implementation choices. Unless I'm mistaken, those IndexManager implementations do not have access to session-scoped settings, and obviously the method signature does not allow passing additional settings.
For the record, here's what I noted when inspecting various IndexManager implementations:
- DirectoryBasedIndexManager is synchronous or asynchronous depending on global (application-scoped) settings
- NRTIndexManager seems to be synchronous, given that its documentation states "IndexReaders are able to inspect the unflushed changes still pending in the IndexWriter buffers"
- ElasticsearchIndexManager is synchronous
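For illustration only, here is roughly what an index manager honoring a per-call setting could look like. This is a self-contained sketch under my own assumptions: `PerCallSyncSketch`, the `waitForExecution` flag and the in-memory "index" are all hypothetical, works are plain strings, and this is not how any of the implementations listed above actually work.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PerCallSyncSketch {

    // Stand-in for the backend: a single worker thread applying works in order.
    private final ExecutorService backend = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "backend-sketch");
        thread.setDaemon(true);
        return thread;
    });

    // Stand-in for the index contents.
    private final List<String> index = new CopyOnWriteArrayList<>();

    /**
     * Hands the works over to the backend before returning (transmission-synchronous).
     * If waitForExecution is true, also blocks until they have been applied
     * (synchronous); otherwise returns right away (transmission-synchronous only).
     */
    CompletableFuture<Void> performOperations(List<String> works, boolean waitForExecution) {
        CompletableFuture<Void> applied =
                CompletableFuture.runAsync(() -> index.addAll(works), backend);
        if (waitForExecution) {
            applied.join();
        }
        return applied;
    }

    List<String> indexContents() {
        return index;
    }

    void shutdown() {
        backend.shutdown();
    }
}
```

The point is only that the decision to wait is made per call, which is exactly the information the current `performOperations` signature has no way to receive.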