|
I debugged the source code, and most of the time is spent in scoring. Inside the search method of org.apache.lucene.search.IndexSearcher, the BulkScorer class computes the document scores, and this takes a long time: on my small dataset (about 1.5 million docs) it takes about 1.5 seconds. The BulkScorer's collector has a facet collector of 20000 documents. Scoring is done by scanning five AtomicReaderContext instances.
maxFacetCount is set to 10. When I work on my small dataset I have a maximum heap space of 2 GB. When I work with the medium dataset (50 million docs) I have a heap space of 24 GB.
|