[JIRA] (HSEARCH-3872) Simplify and improve ordering and parallelism of Elasticsearch indexing
by Yoann Rodière (JIRA)
Yoann Rodière ( https://hibernate.atlassian.net/secure/ViewProfile.jspa?accountId=557058%... ) *created* an issue
Issue Type: Improvement
Assignee: Yoann Rodière ( https://hibernate.atlassian.net/secure/ViewProfile.jspa?accountId=557058%... )
Components: backend-elasticsearch
Created: 26/Mar/2020 01:53 AM
Fix Versions: 6.0.0.Beta-backlog-high-priority
Priority: Major
Reporter: Yoann Rodière ( https://hibernate.atlassian.net/secure/ViewProfile.jspa?accountId=557058%... )
One simplification we could apply in particular is to only ever execute indexing works (Index/Delete) as part of a bulk request, even when the bulk contains a single work. That shouldn't affect performance too much, and it would definitely make the code simpler.
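To illustrate the "always bulk" idea, here is a minimal sketch that builds the body of an Elasticsearch `_bulk` request the same way for one work or for many. `BulkSketch`, `Work` and `toBulkBody` are hypothetical names made up for this example, not Hibernate Search classes, and the sketch assumes index names and ids need no JSON escaping.

```java
import java.util.List;

// Hypothetical sketch, not the Hibernate Search implementation.
final class BulkSketch {

    /** One indexing work: an "index" action carries a document, a "delete" does not. */
    static final class Work {
        final String action;
        final String index;
        final String id;
        final String documentJson; // null for delete works

        private Work(String action, String index, String id, String documentJson) {
            this.action = action;
            this.index = index;
            this.id = id;
            this.documentJson = documentJson;
        }

        static Work index(String index, String id, String documentJson) {
            return new Work("index", index, id, documentJson);
        }

        static Work delete(String index, String id) {
            return new Work("delete", index, id, null);
        }
    }

    /**
     * Builds a newline-delimited bulk body: one action line per work,
     * followed by a document line for index works only.
     * A single work goes through exactly the same code path as many works.
     */
    static String toBulkBody(List<Work> works) {
        StringBuilder body = new StringBuilder();
        for (Work work : works) {
            body.append("{\"").append(work.action)
                .append("\":{\"_index\":\"").append(work.index)
                .append("\",\"_id\":\"").append(work.id)
                .append("\"}}\n");
            if (work.documentJson != null) {
                body.append(work.documentJson).append('\n');
            }
        }
        return body.toString();
    }
}
```

With this shape there is no separate code path for a lone Index or Delete work: the caller simply passes a one-element list.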
When that's done, many of the improvements implemented in the Lucene backend as part of HSEARCH-3822 ( https://hibernate.atlassian.net/browse/HSEARCH-3822 ) (In Progress) could be applied to the Elasticsearch backend as well:
* Queue works instead of worksets (simplifies configuration, e.g. HSEARCH-3575 ( https://hibernate.atlassian.net/browse/HSEARCH-3575 ) (Open) will be easier to implement)
* Use a single thread pool for the whole backend (share resources across indexes)
* Do not batch works that don't benefit from batching, e.g. non-bulkable works such as purge, search queries, ... In the case of Elasticsearch, that would mean submitting them to the REST client immediately when they are submitted to the orchestrator.
* Maybe, use multiple queues per orchestrator in order to execute multiple works for the same index in parallel
* Maybe, move to a common, global orchestrator for indexing
* More?
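A rough sketch of how the first three points could fit together: individual works are queued, bulkable works are drained into bulks on a shared executor, and non-bulkable works go straight to the client. All names here (`OrchestratorSketch`, `submit`, the `Consumer` standing in for the REST client, works represented as plain strings) are assumptions for illustration only. The sketch also sends synchronously, which a real backend would likely not do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch, not the Hibernate Search implementation.
final class OrchestratorSketch {

    // Individual works are queued, not worksets.
    private final BlockingQueue<String> bulkableWorks = new LinkedBlockingQueue<>();
    // The same executor instance can be passed to the orchestrators of every index.
    private final Executor sharedExecutor;
    // Stand-in for the REST client: receives each batch of works to send.
    private final Consumer<List<String>> client;
    private final int maxBulkSize;

    OrchestratorSketch(Executor sharedExecutor, Consumer<List<String>> client, int maxBulkSize) {
        this.sharedExecutor = sharedExecutor;
        this.client = client;
        this.maxBulkSize = maxBulkSize;
    }

    void submit(String work, boolean bulkable) {
        if (!bulkable) {
            // Purge, search queries, etc. gain nothing from batching:
            // hand them to the client immediately, bypassing the queue.
            client.accept(List.of(work));
            return;
        }
        bulkableWorks.add(work);
        sharedExecutor.execute(this::processQueue);
    }

    // synchronized: at most one bulk per orchestrator is built and sent at a time,
    // so bulks leave in the order their works were queued.
    private synchronized void processQueue() {
        List<String> bulk = new ArrayList<>();
        bulkableWorks.drainTo(bulk, maxBulkSize);
        if (!bulk.isEmpty()) {
            // Always a bulk, even if it holds a single work.
            client.accept(bulk);
        }
    }
}
```

Running several such queues per orchestrator, or a single global orchestrator, would be variations on the same structure. This sketch deliberately leaves open how works would be routed between queues.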