Sanne Grinovero commented on ISPN-5103:
---------------------------------------
As a reminder of our IRC chat:
this seems safe because Infinispan Query handles identifiers slightly differently
than Hibernate Search / ORM: the {{org.infinispan.query.Transformer}} guarantees a
one-to-one mapping between each key and its encoded id, and the encoded id is enough to
reconstruct not only the ID instance but also the {{Transformer}} implementation to be used
for the transformation itself.
A similar optimisation wouldn't always be safe in the ORM world, as the
{{org.hibernate.search.bridge.TwoWayFieldBridge}} needs to be known, or we'd need to
be sure that different {{FieldBridge}} implementations encode values in some unique
way.
For example, we'd have a problem when mapping Integer(5) and Long(5) using keyword
encoding, as they would both map to "5".
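The collision described above can be sketched in plain Java. This is only an illustration: {{keywordEncode}} and {{typedEncode}} are hypothetical helpers, not the actual {{FieldBridge}} or {{Transformer}} implementations, and the type-prefixed format is an assumption chosen to show why an encoding that carries the type stays unique.

```java
public class IdEncodingDemo {

    // Hypothetical keyword-style encoding: only the string form of the value is kept,
    // so the original type cannot be recovered from the encoded id.
    static String keywordEncode(Object id) {
        return String.valueOf(id);
    }

    // Hypothetical type-aware encoding: the class name is prefixed to the value,
    // so ids of different types never produce the same string.
    static String typedEncode(Object id) {
        return id.getClass().getName() + ":" + id;
    }

    public static void main(String[] args) {
        Object asInteger = Integer.valueOf(5);
        Object asLong = Long.valueOf(5L);

        // Both keyword encodings are "5", so the two ids are indistinguishable.
        System.out.println(keywordEncode(asInteger).equals(keywordEncode(asLong)));

        // The type-aware encodings differ, so a delete by term on one cannot hit the other.
        System.out.println(typedEncode(asInteger).equals(typedEncode(asLong)));
    }
}
```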
Inefficient index updates cause high cost merges and increase overall
latency
-----------------------------------------------------------------------------
Key: ISPN-5103
URL: https://issues.jboss.org/browse/ISPN-5103
Project: Infinispan
Issue Type: Enhancement
Components: Embedded Querying
Affects Versions: 7.0.2.Final, 7.1.0.Alpha1
Reporter: Gustavo Fernandes
Assignee: Gustavo Fernandes
Currently every change to the index is performed in Lucene by combining two operations:
* Delete by query, using a boolean query on the id plus the entity class
* Add
Under high load, especially during merges, these numerous deletes cause very long delays,
resulting in high latency.
We should instead use a simple Lucene Update to add/change documents, since internally it
translates to a Delete by term plus an Add operation, and deletes by term are extremely
efficient in Lucene.
Some local tests showed the average latency of updating the index with this strategy
dropping by a factor of 4, for both the SYNC and ASYNC backends.
With regard to sharing the index between entities, which was the original motivation for
the Delete by query plus Add strategy, we have two scenarios:
* Same cache with multiple entity types: that's a non-issue, since obviously
there's no id collision in this case
* Different caches with the same index: this scenario happens when different caches
share the same index, for example:
{code}
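// Both entity types are written to the same index, named "common"
@Indexed(index = "common")
public class Country { ... }
@Indexed(index = "common")
public class Currency { ... }
// Two different caches store different entities under the same key (1)
cm.getCache("currencies").put(1, new Currency(...));
cm.getCache("countries").put(1, new Country(...));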
{code}
This would require a delete by query in order to persist both a Country and a Currency
with id=1.
It would also require setting "default.exclusive_index_use" to "false",
with the associated cost of having to reopen the IndexWriter on every operation.
Given that the performance gain of doing a simple Update is considerable, we should either
support the corner case through extra configuration or, alternatively, generate a unique
@ProvidedId that includes the entity class or the cache name and works for all cases
described above.
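A minimal sketch of why a qualified id would cover the shared-index case, using a plain {{Map}} as a toy stand-in for an index in which an update replaces any document carrying the same id term. The class, the method names and the "cacheName:key" format are all assumptions made for illustration; this is not the Infinispan or Lucene API.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedIndexDemo {

    // Toy index: one document per id term, as with an update by term.
    private final Map<String, String> documents = new HashMap<>();

    // Mimics an update: any existing document with the same id term is replaced.
    void update(String idTerm, String document) {
        documents.put(idTerm, document);
    }

    // Plain id: only the cache key is used, so keys from different caches can collide.
    static String plainId(String cacheName, Object key) {
        return String.valueOf(key);
    }

    // Hypothetical qualified id: the cache name is part of the id term.
    static String qualifiedId(String cacheName, Object key) {
        return cacheName + ":" + key;
    }

    // Stores a Currency and a Country under key 1 in two caches sharing one index,
    // then returns how many documents survive.
    static int documentsAfterPuts(boolean qualified) {
        SharedIndexDemo index = new SharedIndexDemo();
        String currencyId = qualified ? qualifiedId("currencies", 1) : plainId("currencies", 1);
        String countryId = qualified ? qualifiedId("countries", 1) : plainId("countries", 1);
        index.update(currencyId, "Currency");
        index.update(countryId, "Country");
        return index.documents.size();
    }

    public static void main(String[] args) {
        System.out.println("plain ids: " + documentsAfterPuts(false));
        System.out.println("qualified ids: " + documentsAfterPuts(true));
    }
}
```

With plain ids the second update overwrites the first, which is the collision that the delete by query was guarding against; with qualified ids both documents are kept.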