When a shard is added dynamically, the ShardIdentifierProvider (or a custom ShardingStrategy, but that's deprecated) updates its internal state so that the next time we ask it which index managers should be queried for a given entity, it can answer precisely "these ones". The fact that the list of shards for a given index is maintained separately for each entity is a bit surprising, but I think it enables one optimization in particular when multiple entities use the same, dynamically sharded index: only shards containing the given entity are queried. I'm not sure this is really useful, but the optimization seems to be there on purpose.

This state is used later, when a query targets an entity: the relevant index managers are retrieved, and the query is run against each of them.

Problem: when you have, say, three nodes, and an index manager is created on a slave, then this slave will be aware of the addition, and probably the master too, but the other slave will have no way to know there is a new index. The index may be replicated eventually (I don't know how that works), but Hibernate Search won't have an index manager for this index, meaning that queries run on that second slave will never, ever take the new index into consideration. So we would need:
- some kind of message broadcast across the JMS/JGroups channel whenever a new index manager is created, so that the index manager is also added on already running nodes
- some way for newly added nodes to discover the state of existing index managers and initialize them locally
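To make the two requirements concrete, here is a minimal in-process sketch in Java. It does not use any Hibernate Search, JMS or JGroups API: InMemoryChannel, Node and all method names are hypothetical, and the "channel" is just a list of nodes in the same JVM. It only illustrates the intended behavior, assuming a shard can be identified by its name: a creation on one node is broadcast to the others, and a node joining later copies the existing state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for the JMS/JGroups channel: delivers messages in-process. */
class InMemoryChannel {
    private final List<Node> members = new ArrayList<>();

    /** A joining node first copies the shard names known to an existing member. */
    void join(Node node) {
        if (!members.isEmpty()) {
            node.initializeFrom(members.get(0).knownShards());
        }
        members.add(node);
    }

    /** Notifies every member except the sender that a shard was created. */
    void broadcastShardCreated(Node sender, String shardName) {
        for (Node member : members) {
            if (member != sender) {
                member.onShardCreated(shardName);
            }
        }
    }
}

/** Hypothetical node holding the names of the shards it has an index manager for. */
class Node {
    private final InMemoryChannel channel;
    private final Set<String> shards = new TreeSet<>();

    Node(InMemoryChannel channel) {
        this.channel = channel;
    }

    /** Local creation: register the shard, then tell the other nodes. */
    void createShard(String shardName) {
        if (shards.add(shardName)) {
            channel.broadcastShardCreated(this, shardName);
        }
    }

    /** Remote creation: only register locally, do not re-broadcast. */
    void onShardCreated(String shardName) {
        shards.add(shardName);
    }

    /** Initial state transfer for a node that joins a running cluster. */
    void initializeFrom(Set<String> existing) {
        shards.addAll(existing);
    }

    /** Returns a defensive copy of the shard names known to this node. */
    Set<String> knownShards() {
        return new TreeSet<>(shards);
    }
}
```

In this sketch a joining node copies the state of the first member; in a real setup it would more likely ask the master, and the message would have to carry enough information to actually create the index manager, not just a name.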
Note that because of the optimization mentioned above, we would also need a specific message when a new entity is added to an existing index manager (unless we remove that optimization, which I think would be a very good idea).

Also note that the state of the currently initialized index managers is managed by a user-provided implementation of ShardIdentifierProvider. We have no easy way to update this state, so we may want to solve HSEARCH-2674 first.
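To illustrate why the per-entity optimization requires that extra message, here is a small Java sketch of the kind of state involved. PerEntityShardRegistry and its methods are hypothetical names, not part of Hibernate Search; the sketch only assumes that shards and entities can be identified by name.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical per-entity view of which shards of an index contain which entity. */
class PerEntityShardRegistry {
    private final Map<String, Set<String>> shardsByEntity = new HashMap<>();

    /** Called when an entity of the given type is first indexed into a shard. */
    void recordEntityInShard(String entityName, String shardName) {
        shardsByEntity.computeIfAbsent(entityName, k -> new TreeSet<>()).add(shardName);
    }

    /** Only shards known to contain the entity are returned, which is the optimization. */
    Set<String> shardsToQuery(String entityName) {
        return Collections.unmodifiableSet(
                shardsByEntity.getOrDefault(entityName, Collections.emptySet()));
    }
}
```

With such a structure, a node that knows a shard exists but was never told that, say, an Author was indexed into it would still skip that shard for Author queries. Hence the need for a second kind of message, or for dropping the per-entity distinction.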