Adrian Nistor updated ISPN-4650:
--------------------------------
Fix Version/s: 7.0.0.CR1
(was: 7.0.0.Beta2)
MassIndexer should not use UpdateDocument when adding to Lucene
---------------------------------------------------------------
Key: ISPN-4650
URL: https://issues.jboss.org/browse/ISPN-4650
Project: Infinispan
Issue Type: Enhancement
Components: Embedded Querying
Affects Versions: 7.0.0.Beta1
Reporter: Gustavo Fernandes
Assignee: Gustavo Fernandes
Fix For: 7.0.0.CR1
The MassIndexer currently causes a Delete plus an Add operation to be sent to the
Hibernate Search backend.
Lucene buffers those delete queries and, during merges, tries to 'apply' them,
wasting a massive amount of time on unnecessary seeks and queries.
Since the mass indexer wipes the index at the beginning, it should simply issue an Add
operation (or at least rely on Lucene's atomic IndexWriter.updateDocument). Performance-wise
this makes a huge difference:
* for 50k documents, the indexing time drops from 195s to 33s
* for 200k documents, the indexing time drops from 600s to 55s
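
To put the two measurements on a common scale, the short Java sketch below derives the speedup factor and the share of time saved from the figures reported above. The class and method names are hypothetical and exist only for this illustration; they are not part of Infinispan, Hibernate Search or Lucene, and the only inputs are the four timings quoted in this issue.

```java
import java.util.Locale;

// Illustrative helper only: names are made up for this example.
final class MassIndexerSpeedup {

    /** Ratio of the old duration to the new one (both in seconds). */
    public static double speedup(double beforeSeconds, double afterSeconds) {
        if (beforeSeconds <= 0 || afterSeconds <= 0) {
            throw new IllegalArgumentException("durations must be positive");
        }
        return beforeSeconds / afterSeconds;
    }

    /** Percentage of the original duration that is no longer spent. */
    public static double percentSaved(double beforeSeconds, double afterSeconds) {
        if (beforeSeconds <= 0 || afterSeconds <= 0) {
            throw new IllegalArgumentException("durations must be positive");
        }
        return 100.0 * (beforeSeconds - afterSeconds) / beforeSeconds;
    }

    public static void main(String[] args) {
        // {document count, seconds before, seconds after}, as reported in this issue.
        int[][] runs = { {50_000, 195, 33}, {200_000, 600, 55} };
        for (int[] run : runs) {
            System.out.println(String.format(Locale.ROOT,
                    "%d documents: %.1fx faster, %.0f%% less time",
                    run[0], speedup(run[1], run[2]), percentSaved(run[1], run[2])));
        }
    }
}
```

On these numbers the gain is roughly 5.9x for 50k documents and roughly 10.9x for 200k documents, so in these two runs the benefit grows with the size of the index.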