[hibernate-dev] HSearch: Using sharding and avoiding query on multiple shards

Emmanuel Bernard emmanuel at hibernate.org
Wed Jun 10 15:07:34 EDT 2009


On Jun 3, 2009, at 13:27, Sanne Grinovero wrote:

> I have a similar need these days; this would be a very useful
> feature, but I'd rather have something I could use with the existing
> API, like
>
> enableFullTextFilter
> ( "MyShardsSelectionStrategy" ).setParameter( ... )
>
> a practical example:
> enableFullTextFilter( "LanguageFilter" ).setParameter( "IT-it" )
>
> The existing IndexShardingStrategy should be able to be smarter and
> have something like
>
> DirectoryProvider<?>[] getDirectoryProvidersForQuery( filters &&  
> options )

That sounds like an elegant approach, but we need a way to make it easy
to declare a filter as dumb/shard-sensitive only (i.e. not force the
user to write some Filter implementation). With this knowledge,
HSearch could ignore the dumb filter for the actual Lucene filtering
operation.
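
To make the idea concrete, here is a minimal sketch of the split between
shard-only filters and real Lucene filters. Everything in it is
hypothetical (EnabledFilter, the shardOnly flag, filtersForLucene); none
of these names exist in HSearch today, and plain Java types stand in for
the real filter definitions.

```java
import java.util.ArrayList;
import java.util.List;

public class ShardOnlyFilterSplit {

    /** Hypothetical description of an enabled filter; not an HSearch type. */
    public static final class EnabledFilter {
        public final String name;
        public final boolean shardOnly; // true: used only to select shards

        public EnabledFilter(String name, boolean shardOnly) {
            this.name = name;
            this.shardOnly = shardOnly;
        }
    }

    /**
     * Returns the filters that should still be applied as Lucene filters.
     * Shard-only ("dumb") filters are dropped here; in this sketch the
     * sharding strategy would still receive the full list of enabled filters.
     */
    public static List<EnabledFilter> filtersForLucene(List<EnabledFilter> enabled) {
        List<EnabledFilter> result = new ArrayList<EnabledFilter>();
        for (EnabledFilter filter : enabled) {
            if (!filter.shardOnly) {
                result.add(filter);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<EnabledFilter> enabled = new ArrayList<EnabledFilter>();
        enabled.add(new EnabledFilter("LanguageFilter", true));   // shard selection only
        enabled.add(new EnabledFilter("SecurityFilter", false));  // real Lucene filter
        for (EnabledFilter filter : filtersForLucene(enabled)) {
            System.out.println(filter.name);
        }
    }
}
```

The point of the flag is that the user only declares the filter and never
has to write a Filter implementation for it.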

>
> So a smart ShardingStrategy could do some selections considering this.
>
> I'm currently using sharding to shard my index on 25 different
> languages (using per-language stemmers), so this would
> be useful but I'd especially need to be able to "label" my different
> DirectoryProviders using String identifiers,
> I'd suggest to add a getName() to the DirectoryProvider interface: I
> would use that to store countrycodes and
> keep a map<String,DirectoryProvider> in my ShardingStrategy, so I can
> easily select the right DP when
> the LanguageFilter is enabled.

I am not too enthusiastic about that. I guess you can solve that if we
make sure to override toString in DirectoryProviders in a meaningful
way. But really you want a Map<FilterRepresentation, DP> with a smart
equals impl for FilterRepresentation taking care of parameters.
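
A rough sketch of such a map key follows. FilterRepresentation is a
hypothetical class written for this example, and a plain String stands in
for the DirectoryProvider so the snippet compiles on its own; treat it as
an illustration of the equals/hashCode contract, not as proposed API.

```java
import java.util.HashMap;
import java.util.Map;

public class FilterKeyLookup {

    /** Hypothetical map key: a filter name plus its parameters. */
    public static final class FilterRepresentation {
        private final String name;
        private final Map<String, Object> parameters;

        public FilterRepresentation(String name, Map<String, Object> parameters) {
            this.name = name;
            // Defensive copy so later changes by the caller cannot alter the key.
            this.parameters = new HashMap<String, Object>(parameters);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof FilterRepresentation)) return false;
            FilterRepresentation other = (FilterRepresentation) o;
            // Two representations match only if name AND parameters match.
            return name.equals(other.name) && parameters.equals(other.parameters);
        }

        @Override
        public int hashCode() {
            return 31 * name.hashCode() + parameters.hashCode();
        }
    }

    /** Convenience factory for a filter with a single parameter. */
    public static FilterRepresentation key(String filter, String param, Object value) {
        Map<String, Object> parameters = new HashMap<String, Object>();
        parameters.put(param, value);
        return new FilterRepresentation(filter, parameters);
    }

    public static void main(String[] args) {
        // The String values stand in for DirectoryProvider instances.
        Map<FilterRepresentation, String> shards = new HashMap<FilterRepresentation, String>();
        shards.put(key("LanguageFilter", "language", "IT-it"), "italian-shard");
        shards.put(key("LanguageFilter", "language", "FR-fr"), "french-shard");

        // A key rebuilt at query time finds the provider registered earlier.
        System.out.println(shards.get(key("LanguageFilter", "language", "IT-it")));
    }
}
```

With this, the sharding strategy does not need names on the
DirectoryProviders; it looks the provider up from the enabled filter and
its parameters.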

>
> Another usage would be to shard an entity on an Enumerated property:
> in this case an appropriate ShardingStrategy
> could be provided by Search and auto-configured by reading the
> possible enum values: that would be a very easy way
> to enable sharding on an entity.

You mean for insertion? You can already do that by passing the enum
value in your document and using
getDirectoryProviderForAddition(Class<?> entity, Serializable id,
String idInString, Document document)
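
As a sketch of the selection logic such an implementation could use: read
the enum name from a field of the document and map it to a shard index.
The helper names (shardIndex, shardCount) and the Language enum are made
up for the example, and the field value is passed in as a plain String
instead of being read from a Lucene Document, so the snippet stands alone.

```java
public class EnumShardSelection {

    /** Example enum; in practice this would be the entity's enumerated property. */
    public enum Language { EN, FR, IT }

    /** One shard per enum constant. */
    public static <E extends Enum<E>> int shardCount(Class<E> enumType) {
        return enumType.getEnumConstants().length;
    }

    /**
     * Maps the enum name stored in the document to a shard index.
     * Throws IllegalArgumentException if the value is not a constant of the enum.
     */
    public static <E extends Enum<E>> int shardIndex(Class<E> enumType, String fieldValue) {
        return Enum.valueOf(enumType, fieldValue).ordinal();
    }

    public static void main(String[] args) {
        // Inside getDirectoryProviderForAddition the value would come from
        // the document, e.g. a "language" field (field name is an assumption).
        String fieldValue = "IT";
        int index = shardIndex(Language.class, fieldValue);
        System.out.println("shard " + index + " of " + shardCount(Language.class));
    }
}
```

Note that using ordinal() ties the shard layout to the declaration order
of the enum, so reordering the constants would require reindexing.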


>
> Sanne
>
> 2009/6/3 Emmanuel Bernard <emmanuel at hibernate.org>:
>>
>>
>> Begin forwarded message:
>>
>> From: chase.seibert+opensubscriber at gmail.com
>> Date:  June 3, 2009 09:21:21  PDT
>> To: emmanuel at hibernate.org
>> Subject: Re: Re: [hibernate-dev] HSearch: Using sharding and  
>> avoiding query
>> on multiple shards
>> Reply-To: chase.seibert+opensubscriber at gmail.com
>> Emmanuel,
>>
>> Regarding HSEARCH-251, and
>> http://www.opensubscriber.com/message/hibernate-dev@lists.jboss.org/9770383.html
>>
>> Being able to query just a single shard or subset of shards would be
>> awesome. I was thinking of a similar API:
>>
>> IndexShardingStrategy:
>> public DirectoryProvider<?>[]
>> getDirectoryProviderForShard(int shardNum);
>>
>> FullTextQuery:
>> public void enableShardFilter(int shardNum);
>> public void enableShardFilters(int[] shardNums);
>>
>> FullTextQuery.buildSearcher() would need to be modified to call
>> getDirectoryProviderForShard() for each shardNum if shardNums are set,
>> otherwise it should continue to use getDirectoryProvidersForAllShards();
>>
>> Calling this API from a consumer's stand-point would look like:
>> FullTextQuery fullTextQuery =
>> fullTextSession.createFullTextQuery(luceneQuery, entityClass);
>> fullTextQuery.enableShardFilter(5);
>> fullTextQuery.list();
>>
>> This could be changed to pass named shards easily. I could prototype this
>> and submit a .patch if you are interested.
>>
>>  -Chase
>>
>> --
>> This message was sent on behalf of chase.seibert+opensubscriber at gmail.com 
>>  at
>> openSubscriber.com
>> http://www.opensubscriber.com/message/hibernate-dev@lists.jboss.org/9800518.html
>>
>>



