One more issue supports my previous statement that the "multi-tenancy" feature should not be worked around using the "custom sharding" feature (I tried sharding the data by tenantId):
-
The ShardIdentifierProvider must implement the method loadInitialShardNames, which requires us to query all the tenant records and cache them during initialization. This approach is not efficient, especially for applications with a large user/tenant base (imagine more than 1000 tenants), and its cost tends to grow linearly with the user base.
Dynamic sharding is appropriate for relatively small sets of shardIdentifiers whose count grows slowly. The documentation (http://docs.jboss.org/hibernate/search/4.4/reference/en-US/html_single/#advanced-features-sharding) says the feature is intended for cases where the index of a single entity becomes too big and slows down the application, and I agree with that.
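To make the startup cost concrete, here is a minimal sketch in plain Java. It does not use the Hibernate Search API: `TenantRepository` and `EagerTenantShardNames` are hypothetical names that only imitate the role of loadInitialShardNames as I understand it, with one shard name per tenant, all loaded up front.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the tenant table; not part of Hibernate Search.
final class TenantRepository {
    private final List<String> tenantIds;
    private int rowsRead = 0; // how many tenant records have been read so far

    TenantRepository(List<String> tenantIds) {
        this.tenantIds = new ArrayList<>(tenantIds);
    }

    // Simulates "select all tenants": every record is read.
    List<String> findAllTenantIds() {
        rowsRead += tenantIds.size();
        return new ArrayList<>(tenantIds);
    }

    int rowsRead() {
        return rowsRead;
    }
}

// Hypothetical provider that loads every shard name during initialization.
final class EagerTenantShardNames {
    private final Set<String> shardNames = new LinkedHashSet<>();

    // One shard name per tenant, so the work done here grows with the tenant count.
    void initialize(TenantRepository repository) {
        shardNames.addAll(repository.findAllTenantIds());
    }

    Set<String> allShardNames() {
        return shardNames;
    }
}
```

With 1000 tenants, `initialize` reads 1000 records before the first search can run, and that number grows with every new tenant.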
I'm a new user of Hibernate Search, so there is a good chance I have the wrong picture of how Hibernate Search works. CMIIW
I have some ideas (very rough drafts, I have not tried to implement them yet) for bringing multi-tenancy support to Hibernate Search, as follows:
-
IndexManager.getDirectoryProvider() should return a different DirectoryProvider according to the tenantId of the Hibernate session that was passed in when creating the FullTextSession.
With this change, the index files will be saved to different paths (relative to the base index path) according to the tenantId.
-
The ShardIdentifierProvider should be able to retrieve the related tenantId (from the Hibernate session that was passed in when creating the FullTextSession), so it can create the appropriate dynamic shardIdentifier for that tenant's data.
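The two ideas above can be sketched roughly in plain Java. This is only an illustration under my own assumptions and does not touch the real Hibernate Search API: `TenantIndexPaths` and `TenantAwareShardIdentifiers` are hypothetical names, the tenantId is assumed to be a simple token (letters, digits, `_`, `-`), and it is passed in explicitly instead of being read from the session.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: picks a per-tenant index directory under the base index path.
final class TenantIndexPaths {
    private TenantIndexPaths() {}

    static Path resolve(Path indexBase, String tenantId, String indexName) {
        // Reject anything that could escape the base directory (e.g. "../other").
        if (tenantId == null || !tenantId.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("Unsupported tenantId: " + tenantId);
        }
        return indexBase.resolve(tenantId).resolve(indexName);
    }
}

// Hypothetical registry: shard identifiers are created lazily, the first time
// a tenant is seen, instead of loading all tenants at startup.
final class TenantAwareShardIdentifiers {
    private final Set<String> known = ConcurrentHashMap.newKeySet();

    // Called when indexing: the tenantId itself is used as the shard identifier.
    String shardFor(String tenantId) {
        if (tenantId == null || tenantId.isEmpty()) {
            throw new IllegalArgumentException("tenantId is required");
        }
        known.add(tenantId);
        return tenantId;
    }

    // Called when querying: only the current tenant's shard is searched.
    Set<String> shardsForQuery(String tenantId) {
        return known.contains(tenantId)
                ? Collections.singleton(tenantId)
                : Collections.<String>emptySet();
    }

    Set<String> allShardIdentifiers() {
        return Collections.unmodifiableSet(known);
    }
}
```

For example, `TenantIndexPaths.resolve(Paths.get("indexes"), "tenantA", "com.example.Book")` gives a directory under `indexes/tenantA`, and `shardFor("tenantA")` registers the shard only when that tenant first indexes something, so nothing has to be queried for all tenants during initialization.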
Anyway, thanks a lot, Davide, for working on multi-tenancy support in the MassIndexer.