These sizes are currently hardcoded:
# The maximum bulk size in the Elasticsearch backend
# The maximum number of worksets per batch in the serial orchestrator of the Elasticsearch backend
# The maximum number of worksets per batch in the parallel orchestrator of the Elasticsearch backend
# The maximum capacity of the workset queue in the serial orchestrator of the Elasticsearch backend
# The maximum capacity of the workset queue in the parallel orchestrator of the Elasticsearch backend
# The maximum number of worksets per batch in the write orchestrator of the Lucene indexes (a similar setting was "hibernate.search.batch_size" in Search 5, though it wasn't documented)
# The maximum capacity of the workset queue in the write orchestrator of the Lucene indexes (was configured through "hibernate.search.[default|<indexname>].max_queue_length" in Search 5, see https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#lucene-indexing-performance)
We should address three problems:
# The hardcoded sizes may not be very good. For example, we allow 5000 worksets to queue up for execution in the parallel orchestrator of the Elasticsearch backend, and each workset might contain several works. A tad too much, maybe?
# Even if we pick better default values, they'll never fit every use case. Users should be able to change them through configuration properties (see the sketch below).
# -We should document how to pick a sensible size for queues.- => NO, let's wait until we have some experience. For now let's simply recommend that users test the performance of their application rather than picking arbitrary values.
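A minimal sketch of what exposing these sizes through configuration properties could look like. The property keys and default values below are assumptions made up for illustration; none of them exist in Hibernate Search today:

{code:java}
import java.util.Map;

/**
 * Sketch only: hypothetical property keys and defaults for sizes that are
 * currently hardcoded. Keys and values are illustrative, not actual settings.
 */
public final class QueueSettingsSketch {

	// Hypothetical keys, for illustration only
	public static final String MAX_BULK_SIZE = "hibernate.search.backends.<name>.max_bulk_size";
	public static final String QUEUE_CAPACITY = "hibernate.search.backends.<name>.queue_capacity";

	// Hypothetical defaults: reasonable starting points that users could override
	public static final int DEFAULT_MAX_BULK_SIZE = 100;
	public static final int DEFAULT_QUEUE_CAPACITY = 1000;

	// Reads an integer property from a generic configuration map, falling back to the default
	public static int intProperty(Map<String, ?> config, String key, int defaultValue) {
		Object value = config.get( key );
		return value == null ? defaultValue : Integer.parseInt( value.toString() );
	}

	private QueueSettingsSketch() {
	}
}
{code}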
For workset queues, keep in mind that the capacity should be at least {{(estimated number of user threads)\*(estimated number of worksets created by each transaction)}}.
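For example (numbers made up for illustration): with 50 user threads each creating about 20 worksets per transaction, the queue capacity should be at least {{50 \* 20 = 1000}}.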
For workset queues, we might also want to allow setting the capacity to "unlimited", for people who'd rather get an OOM error than block because the queue is full. For infinite capacity, a linked list such as the one implemented in {{org.hibernate.search.backend.impl.lucene.MultiWriteDrainableLinkedList}} might help.
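As a rough illustration of the trade-off, using plain JDK queues rather than the actual orchestrator code: a bounded queue blocks producers when full (back-pressure), while an unbounded one never blocks but may eventually cause an OOM error:

{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Sketch only: illustrates the bounded vs. "unlimited" capacity trade-off
 * with JDK queues; the real orchestrators use their own queue implementations.
 */
public final class WorksetQueueSketch {

	static BlockingQueue<Object> createQueue(int capacity) {
		if ( capacity <= 0 ) {
			// "Unlimited": never blocks producers, but may lead to an OOM error
			// if worksets are produced faster than they are consumed.
			return new LinkedBlockingQueue<>();
		}
		// Bounded: put() blocks the producer thread once the queue is full,
		// applying back-pressure instead of consuming unbounded memory.
		return new ArrayBlockingQueue<>( capacity );
	}

	private WorksetQueueSketch() {
	}
}
{code}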
Note there was a configuration option for the maximum work queue length of the async executor in Search 5: see {{org.hibernate.search.indexes.impl.PropertiesParseHelper#extractMaxQueueSize}}. The sync executor, however, had a queue of unlimited capacity (a linked list).