[hibernate-dev] HSearch: Using sharding and avoiding query on multiple shards

Emmanuel Bernard emmanuel at hibernate.org
Wed Jul 30 20:36:13 EDT 2008


Today, in Hibernate Search, a query is applied to all shards. We use a
MultiReader to wrap them together.
In some sharding scenarios, it makes sense to apply the query to a
single shard or a subset of the shards.

We could add the following API to IndexShardingStrategy:

public DirectoryProvider<?>[] getDirectoryProvidersForQuery(o.a.l.search.Query query);

The sharding strategy could analyze the query to detect boolean
clauses on its sharding criteria:

//query building
BooleanQuery bQuery = new BooleanQuery();
bQuery.add( regularQuery, Occur.MUST );
bQuery.add( new TermQuery( new Term("distributor.id", "2") ),
            Occur.MUST ); //only occurs in shard 1

public DirectoryProvider<?>[] getDirectoryProvidersForQuery(o.a.l.search.Query query) {
   // only a top-level BooleanQuery is analyzed; anything else hits all shards
   if ( !(query instanceof BooleanQuery) ) return providers;
   List<BooleanClause> clauses = BooleanQuery.class.cast(query).clauses();
   Integer restrictedShard = null; // null means no restriction was found
   boolean isAllMust = true;
   for (BooleanClause clause : clauses) {
     if (clause.getOccur() != Occur.MUST) { isAllMust = false; break; }
     if ( clause.getQuery() instanceof TermQuery ) {
       Term term = TermQuery.class.cast( clause.getQuery() ).getTerm();
       if (term.field().equals("distributor.id")) {
         restrictedShard = Integer.parseInt( term.text() );
       }
     }
   }
   if (isAllMust && restrictedShard != null) {
     // distributor ids start at 1, the providers array at 0
     return new DirectoryProvider<?>[] { providers[restrictedShard-1] };
   }
   return providers;
}


That's very flexible but quite hard to implement correctly, especially
since the query tree structure might not be trivial.
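
To illustrate why a non-trivial tree complicates things, here is a minimal
sketch of a recursive walk over nested boolean queries. It does not use
Lucene: Q, TermQ, BoolQ, Clause, Occ and ShardQueryAnalyzer are hypothetical
stand-ins invented for this example so that it compiles on its own. It is
also a variation on the code above rather than a copy of it: instead of
requiring every clause to be MUST, it assumes that one MUST clause on the
sharding field, at any nesting depth reached only through MUST clauses, is
enough to restrict the query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for Lucene's Query, TermQuery, BooleanClause and
// BooleanQuery, so the sketch compiles without Lucene on the classpath.
interface Q {}

final class TermQ implements Q {
    final String field;
    final String text;
    TermQ(String field, String text) { this.field = field; this.text = text; }
}

enum Occ { MUST, SHOULD, MUST_NOT }

final class Clause {
    final Q query;
    final Occ occur;
    Clause(Q query, Occ occur) { this.query = query; this.occur = occur; }
}

final class BoolQ implements Q {
    final List<Clause> clauses = new ArrayList<Clause>();
    BoolQ add(Q query, Occ occur) { clauses.add(new Clause(query, occur)); return this; }
}

final class ShardQueryAnalyzer {
    static final String SHARD_FIELD = "distributor.id";

    // Returns the sharding-field value that every hit must carry, or null
    // when the query is not provably restricted to a single value.
    static String requiredShardValue(Q query) {
        if (query instanceof TermQ) {
            TermQ term = (TermQ) query;
            return SHARD_FIELD.equals(term.field) ? term.text : null;
        }
        if (query instanceof BoolQ) {
            for (Clause clause : ((BoolQ) query).clauses) {
                // SHOULD and MUST_NOT clauses never force a hit into one shard
                if (clause.occur != Occ.MUST) continue;
                String value = requiredShardValue(clause.query);
                if (value != null) return value;
            }
        }
        return null; // unknown query type or no restriction: use all shards
    }
}
```

A caller would map the returned value to a provider and fall back to all
providers on null. Other query types (range, prefix, filtered queries and so
on) are treated as unrestricted here, which is safe but gives up some
opportunities to narrow the search.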

The alternative strategy is to have the following API on  
IndexShardingStrategy

public DirectoryProvider<?>[] getDirectoryProvidersForQuery(Object  
hint);

and a corresponding fullTextQuery.setShardHint(Object);

A query could "know" it targets shard 2 and pass that information to the
strategy through a hint both sides agree on:

fullTextQuery.setShardHint("Sony");

public DirectoryProvider<?>[] getDirectoryProvidersForQuery(Object hint) {
   // equals() is false for null and for non-String hints
   if ( "Sony".equals(hint) ) {
     return new DirectoryProvider<?>[] { providers[2] };
   }
   else {
     return providers;
   }
}
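
The hard-coded "Sony" check generalizes to a lookup table. The following is
a minimal sketch under stated assumptions: HintShardSelector, register and
providersForQuery are hypothetical names, and plain strings stand in for
DirectoryProvider instances so the example runs without Hibernate Search.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hibernate Search.
final class HintShardSelector {
    private final String[] providers; // stand-in for DirectoryProvider<?>[]
    private final Map<Object, Integer> shardByHint = new HashMap<Object, Integer>();

    HintShardSelector(String[] providers) {
        this.providers = providers.clone();
    }

    // Declares that queries carrying this hint only need the given shard.
    HintShardSelector register(Object hint, int shardIndex) {
        if (shardIndex < 0 || shardIndex >= providers.length) {
            throw new IllegalArgumentException("no such shard: " + shardIndex);
        }
        shardByHint.put(hint, shardIndex);
        return this;
    }

    // Unknown or missing hints fall back to every shard, as in the code above.
    String[] providersForQuery(Object hint) {
        Integer index = shardByHint.get(hint);
        if (index == null) {
            return providers.clone();
        }
        return new String[] { providers[index] };
    }
}
```

With new HintShardSelector(new String[] { "shard0", "shard1", "shard2" })
and register("Sony", 2), a "Sony" hint selects only the third provider, and
any other hint (or none) selects all three.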

WDYT? How useful would that be?
--
Emmanuel Bernard
http://in.relation.to/Bloggers/Emmanuel | http://blog.emmanuelbernard.com 
  | http://twitter.com/emmanuelbernard
Hibernate Search in Action (http://is.gd/Dl1)
