[hibernate-dev] [Search] Sharding and access to (subsets) of index readers and Lucene directories in HS 4.0

Sanne Grinovero sanne at hibernate.org
Tue Sep 6 07:05:01 EDT 2011


We need to make a decision on the API changes that are blocking the beta
releases, so let's settle this subject.

Summarizing the last proposals:

1) Use a builder API, which defines
  - targeted entities
  - enabled shard-aware filters
    -- parameters on these filters
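
To make #1 more concrete, here is a minimal sketch of the state such a
builder would have to collect. ReaderRequestBuilder and its methods are
hypothetical names used only for illustration, not existing Hibernate
Search types; the result is just the nested map that a sharding strategy
could inspect to pick shards.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical builder; not an existing Hibernate Search type. */
final class ReaderRequestBuilder {
    // entity -> (filter name -> filter parameters), kept in declaration order
    private final Map<Class<?>, Map<String, Map<String, Object>>> targets = new LinkedHashMap<>();
    private Map<String, Map<String, Object>> currentFilters; // filters of the last entity
    private Map<String, Object> currentParams;               // parameters of the last filter

    ReaderRequestBuilder forEntity(Class<?> entity) {
        currentFilters = targets.computeIfAbsent(entity, e -> new LinkedHashMap<>());
        currentParams = null; // parameters always belong to a filter, not to an entity
        return this;
    }

    ReaderRequestBuilder enableFilter(String filterName) {
        if (currentFilters == null) {
            throw new IllegalStateException("call forEntity() before enableFilter()");
        }
        currentParams = currentFilters.computeIfAbsent(filterName, f -> new LinkedHashMap<>());
        return this;
    }

    ReaderRequestBuilder setParameter(String name, Object value) {
        if (currentParams == null) {
            throw new IllegalStateException("call enableFilter() before setParameter()");
        }
        currentParams.put(name, value);
        return this;
    }

    /** Returns the collected targets; a real API would open a reader from them. */
    Map<Class<?>, Map<String, Map<String, Object>>> build() {
        return targets;
    }
}
```

The builder only records what was requested; resolving the entities and
filter parameters to actual shards is left to whatever consumes the map.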

2) searchFactory.openReader(String... indexNames)
pro: simple
con: hard to use: the user has to duplicate the logic of the sharding
strategies and be aware of the index names
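
To illustrate the con of #2, this is roughly the kind of helper a caller
would have to write and keep in sync with the configured sharding
strategy. It is a sketch under assumptions: ShardNames is a hypothetical
class, the "hash of the id modulo the shard count" rule stands in for
whatever the real strategy does, and the "<base>.<n>" naming scheme is
assumed for illustration.

```java
/** Hypothetical helper a caller of option #2 would have to maintain. */
final class ShardNames {

    private ShardNames() {
    }

    /**
     * Mirrors an assumed "hash of id modulo shard count" strategy.
     * The "<base>.<n>" index naming scheme is an assumption.
     */
    static String shardFor(String baseIndexName, Object id, int shardCount) {
        // floorMod keeps the result non-negative even for negative hash codes
        int shard = Math.floorMod(id.hashCode(), shardCount);
        return baseIndexName + "." + shard;
    }

    /** All shard names, needed whenever the id is not known up front. */
    static String[] allShards(String baseIndexName, int shardCount) {
        String[] names = new String[shardCount];
        for (int i = 0; i < shardCount; i++) {
            names[i] = baseIndexName + "." + i;
        }
        return names;
    }
}
```

The returned names would then be passed to openReader(...); if the
sharding strategy changes, this helper silently goes out of date.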

3) searchFactory.openReader(Class[] entities,
ShardSensitiveOnlyFilter[] filterInstances)
pro: relatively simple
cons:
   - user has to initialize arrays
   - where does the user get the filterInstances from? I guess by
calling their constructors directly.

4) provide a ".openReader()" on the Query, basically reusing the query
object as the builder root and inheriting selected entities and
enabled filters.
The open question I had on this is where it makes the most sense to
close it. My proposal was to allow invoking "close()" on the returned
reader itself.
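
A minimal sketch of that closing behaviour, assuming hypothetical
stand-in types: ReaderPool plays the role of whatever component owns the
underlying readers, and ManagedReader is the handle returned to the
user. Neither is an existing Hibernate Search or Lucene class; the point
is only that close() on the handle is enough to release it, and that a
second close() does no harm.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the component that owns the underlying readers. */
final class ReaderPool {
    final AtomicInteger openReaders = new AtomicInteger();

    ManagedReader open() {
        openReaders.incrementAndGet();
        return new ManagedReader(this);
    }

    void release(ManagedReader reader) {
        openReaders.decrementAndGet();
    }
}

/** Hypothetical reader handle that returns itself to its pool on close(). */
final class ManagedReader implements AutoCloseable {
    private final ReaderPool owner;
    private final AtomicBoolean closed = new AtomicBoolean();

    ManagedReader(ReaderPool owner) {
        this.owner = owner;
    }

    @Override
    public void close() {
        // compareAndSet makes a repeated close() a no-op
        if (closed.compareAndSet(false, true)) {
            owner.release(this);
        }
    }
}
```

With this shape no closeIndexReader method is needed on the Query: the
caller only ever deals with the returned reader.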

My opinion:
  In the long term (for Hibernate Search 4.1?) provide both #1 and #2,
or both #4 and #2, as #2 provides more flexibility and (#1 || #4) are
more practical.
  In the short term I'd provide only #2 for 4.0: IMHO the API makes
enough sense to be exposed to the power user, and it makes sure nobody
is prevented from using it while we take more time to think about the
usability details of #1 or #4.

(I don't like the signature of #3, but I'm not against its simplicity
if somebody can rethink the method parameters.)

What's your opinion?

Sanne



On 25 August 2011 17:56, Sanne Grinovero <sanne at hibernate.org> wrote:
>> As Emmanuel mentioned, can we think of use cases where we would like to
>> have access to Lucene Directories (/IndexManagers), which is currently
>> mentioned in the docs:
>> http://docs.jboss.org/hibernate/search/4.0/reference/en-US/html_single/#d0e6658
>> ?
>>
>> Elmer
>
> Yes, that's an important question that I would like us to answer
> before going to betas.
> Ideally I would like to remove "Directory" as a concept exposed to
> Hibernate Search users, for various reasons; the whole concept of
> "identifying an index" and then applying options / operations to it
> should go through an IndexManager, so that an application can replace
> the IndexManager implementation with one supporting writes to a remote
> cluster without having to change its code.
>
> Are there really use cases left in which we need direct access to a
> DirectoryProvider ?
> We won't of course prevent people from casting their IndexManager to a
> DirectoryBasedIndexManager, which exposes a getDirectoryProvider()
> mainly for testing reasons, but then I'd expect you know what you're
> doing.
>
>
> Getting back on topic; I had started another thread about Sharding, in
> which I suggested that nowadays sharding is coupled to an index name -
> at least as far as how it's defined in the configuration - while it
> should be coupled to an entity type.
> So I wouldn't say that we apply sharding to the index named
> "com.mystuff.indexName", but that I want to shard the entity
> "com.mystuff.Person".
>
> If we do, the IndexShardingStrategy interface would receive entity
> instances instead of o.a.l.Document instances, and it could be
> typesafe.
>
> I had no time to make a proof of concept, but I'd try to go in the
> following direction:
>
> IndexReader ir = searchFactory.indexReaders()
>    .forEntity(Car.class)
>        .enableFullTextFilter("colorsFilter") // must be a ShardSensitiveOnlyFilter
>            .setParameter("color","blue")
>    .forEntity(Animals.class)
>    .openIndexReader();
>
> searchFactory.indexReaders().closeIndexReader( ir );
>
> I'm not totally happy with this idea, as
> 1) we could enforce the fact it's a ShardSensitiveOnlyFilter with some
> typesafety
> 2) The IndexShardingStrategy implementor has quite some work to do to
> collect all parameters back from the filters - I might like it more to
> return the IndexShardingStrategy and allow people to set options to it
> directly, but that's not an option as the strategy is used
> concurrently by other components.
>
> Proposal D - reusing the FullTextQuery as a proper context to define
> the target indexes
>
> FullTextQuery ftQuery = fullTextSession.createFullTextQuery( query,
> Driver.class, Animals.class );
> ftQuery.enableFullTextFilter( "bestDriver" );
> ftQuery.openIndexReader();
>
> Where to close it? I don't feel like adding a closeIndexReader to
> FullTextQuery. It would be better to return a new kind of MultiReader
> which is able to close itself properly when the close() method is
> invoked on the IndexReader itself.
>
> Sanne
>



