Hi Chase,
the problem I see is that if you have 3 customers with ids 34, 35 and 202,
you'll have to define 202 indexes;
but you're right, it should be a separate change.
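
To make the concern concrete, here is a minimal, self-contained sketch in plain Java (no Hibernate Search types). The class SparseIdShardMap and its method names are hypothetical and are not part of the patch or the library. It only contrasts using the id directly as an array index with keeping a map from id to a dense shard number.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not part of Hibernate Search or of the patch.
final class SparseIdShardMap {

    // customer id -> dense shard number, assigned in registration order
    private final Map<Integer, Integer> shardByCustomerId = new LinkedHashMap<>();

    public SparseIdShardMap(int... customerIds) {
        for (int id : customerIds) {
            // the next free shard number is the current map size
            shardByCustomerId.putIfAbsent(id, shardByCustomerId.size());
        }
    }

    /** Number of shards actually needed: one per known customer. */
    public int shardCount() {
        return shardByCustomerId.size();
    }

    /** Dense shard number for a known customer id. */
    public int shardFor(int customerId) {
        Integer shard = shardByCustomerId.get(customerId);
        if (shard == null) {
            throw new IllegalArgumentException("unknown customer id " + customerId);
        }
        return shard;
    }

    /** Slots needed when the id itself is a zero-based array index: highest id + 1. */
    public static int slotsWhenIndexingById(int... customerIds) {
        int max = -1;
        for (int id : customerIds) {
            max = Math.max(max, id);
        }
        return max + 1;
    }
}
```

With ids 34, 35 and 202 the map needs three shards, while indexing by id needs a slot for every value up to the highest id.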
The Hibernate Search documentation is built from XML files
contained in the sources; you'll find them in
/src/main/docbook/en-US/modules
The Maven build also builds all the documentation, so
you can see how what you write will look in both PDF and HTML
form.
Here are the detailed instructions:
Sanne,
See attached example of an IndexShardingStrategy and unit test. I was able to
satisfy my sharding needs without named shards, though I would be open to
implementing that separately.
In CustomerShardingStrategy, the index is broken into one shard per
customer. Each Document contains a customerID field, which is used as the
index for a local DirectoryProvider array. The number of shards should be at
least the max customerID. The implementation of
getDirectoryProvidersForQuery() looks for a Filter called "customer" that
also contains a customerID parameter, and returns the single
DirectoryProvider for that customer.
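
For readers without the attachment, the selection logic described above can be sketched roughly as follows. This is not the attached CustomerShardingStrategy: it is a self-contained approximation in which Shard stands in for a DirectoryProvider and EnabledFilter stands in for an enabled full-text filter. Both types, the class name and the method names are hypothetical. Only the filter name "customer" and the parameter name customerID come from the description.

```java
import java.util.Map;

// Hypothetical stand-alone model of the described strategy; no Hibernate Search types are used.
final class CustomerShardSelection {

    /** Stand-in for a DirectoryProvider: one instance per shard. */
    public static final class Shard {
        public final int number;
        public Shard(int number) { this.number = number; }
    }

    /** Stand-in for an enabled filter: its name plus its parameters. */
    public static final class EnabledFilter {
        public final String name;
        public final Map<String, Object> parameters;
        public EnabledFilter(String name, Map<String, Object> parameters) {
            this.name = name;
            this.parameters = parameters;
        }
    }

    private final Shard[] shards;

    public CustomerShardSelection(int shardCount) {
        shards = new Shard[shardCount];
        for (int i = 0; i < shardCount; i++) {
            shards[i] = new Shard(i);
        }
    }

    /** Indexing: the document's customerID is used directly as the array index. */
    public Shard shardForAddition(int customerId) {
        return shards[customerId];
    }

    /** Querying: a "customer" filter narrows the search to that customer's shard. */
    public Shard[] shardsForQuery(EnabledFilter... filters) {
        for (EnabledFilter filter : filters) {
            if ("customer".equals(filter.name)) {
                Object id = filter.parameters.get("customerID");
                if (id instanceof Integer) {
                    return new Shard[] { shards[(Integer) id] };
                }
            }
        }
        // no usable customer filter: fall back to searching every shard
        return shards.clone();
    }
}
```

Without a "customer" filter the sketch falls back to all shards, which matches the behaviour queries had before this change.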
What can I do to aid the documentation effort? Thanks,
-Chase
On Tue, Jun 9, 2009 at 2:54 PM, Sanne Grinovero <sanne.grinovero(a)gmail.com>
wrote:
>
> Hi Chase,
> sorry for the late answer; I've just looked at your code, it looks
> very good and I'd like to apply this patch if Emmanuel and Hardy
> agree?
>
> There are no tests in your patch to verify this is actually useful, do
> you have a good example of a ShardingProvider using it?
> (tests are not only used to test your code but also serve as examples
> and concept demos).
> Don't you think the DirectoryProvider names should be exposed, or are
> you able to create a nice sharding implementation without needing
> that?
>
> If you could add a testcase and documentation updates it would be even
> better and speed up the work ;-)
>
> Sanne
>
> 2009/6/8 Chase Seibert <chase.seibert(a)gmail.com>:
> > Sanne,
> >
> > Did you get a chance to look at this? Thanks,
> >
> > -Chase
> >
> >
> > On Wed, Jun 3, 2009 at 4:08 PM, Chase Seibert <chase.seibert(a)gmail.com>
> > wrote:
> >>
> >> Sanne,
> >>
> >> I have implemented your suggestion for IndexShardingStrategy to optionally
> >> provide a set of DirectoryProviders BEFORE the search based on one or more
> >> FullTextFilters. Using this change, I was able to optimize my specific case
> >> so that searches hit only the relevant shards.
> >>
> >> I have not yet implemented your labeled shard idea, nor your shard on enum
> >> idea. If we can agree on this change first, I think I can implement those on
> >> top of this.
> >>
> >> Please see attached svn .patch (diff) file. I have tested the patch on
> >> 3.1.1 and 3.2.0. Any feedback is welcome.
> >>
> >> -Chase
> >>
> >>
> >> On Wed, Jun 3, 2009 at 1:27 PM, Sanne Grinovero
> >> <sanne.grinovero(a)gmail.com> wrote:
> >>>
> >>> I have a similar need these days; this should be a very useful
> >>> feature, but I'd prefer something I could use with the existing API,
> >>> like
> >>>
> >>> enableFullTextFilter( "MyShardsSelectionStrategy" ).setParameter( ... )
> >>>
> >>> a practical example:
> >>> enableFullTextFilter( "LanguageFilter" ).setParameter( "IT-it" )
> >>>
> >>> The existing IndexShardingStrategy should be able to be smarter and
> >>> have something like
> >>>
> >>> DirectoryProvider<?>[] getDirectoryProvidersForQuery( filters && options )
> >>>
> >>> So a smart ShardingStrategy could do some selections considering this.
> >>>
> >>> I'm currently using sharding to shard my index on 25 different
> >>> languages (using per-language stemmers), so this would be useful,
> >>> but I'd especially need to be able to "label" my different
> >>> DirectoryProviders using String identifiers.
> >>> I'd suggest adding a getName() to the DirectoryProvider interface: I
> >>> would use that to store countrycodes and keep a
> >>> map<String,DirectoryProvider> in my ShardingStrategy, so I can
> >>> easily select the right DP when the LanguageFilter is enabled.
> >>>
> >>> Another usage would be to shard an entity on an Enumerated property:
> >>> in this case an appropriate ShardingStrategy
> >>> could be provided by Search and auto-configured by reading the
> >>> possible enum values: that would be a very easy way
> >>> to enable sharding on an entity.
> >>>
> >>> Sanne
> >>>
> >>> 2009/6/3 Emmanuel Bernard <emmanuel(a)hibernate.org>:
> >>> >
> >>> >
> >>> > Begin forwarded message:
> >>> >
> >>> > From: chase.seibert+opensubscriber(a)gmail.com
> >>> > Date: June 3, 2009 09:21:21 PDT
> >>> > To: emmanuel(a)hibernate.org
> >>> > Subject: Re: Re: [hibernate-dev] HSearch: Using sharding and avoiding query on multiple shards
> >>> > Reply-To: chase.seibert+opensubscriber(a)gmail.com
> >>> > Emmanuel,
> >>> >
> >>> > Regarding HSEARCH-251, and
> >>> > http://www.opensubscriber.com/message/hibernate-dev@lists.jboss.org/97703...
> >>> >
> >>> > Being able to query just a single shard or subset of shards would be
> >>> > awesome. I was thinking of a similar API:
> >>> >
> >>> > IndexShardingStrategy:
> >>> > public DirectoryProvider<?>[] getDirectoryProviderForShard(int shardNum);
> >>> >
> >>> > FullTextQuery:
> >>> > public void enableShardFilter(int shardNum);
> >>> > public void enableShardFilters(int[] shardNums);
> >>> >
> >>> > FullTextQuery.buildSearcher() would need to be modified to call
> >>> > getDirectoryProviderForShard() for each shardNum if shardNums are
> >>> > set,
> >>> > otherwise it should continue to use
> >>> > getDirectoryProvidersForAllShards();
> >>> >
> >>> > Calling this API from a consumer's stand-point would look like:
> >>> > FullTextQuery fullTextQuery =
> >>> >     fullTextSession.createFullTextQuery(luceneQuery, entityClass);
> >>> > fullTextQuery.enableShardFilter(5);
> >>> > fullTextQuery.list();
> >>> >
> >>> > This could be changed to pass named shards easily. I could prototype
> >>> > this and submit a .patch if you are interested.
> >>> >
> >>> > -Chase