2011/8/11 Hardy Ferentschik <hardy(a)hibernate.org>:
On Thu, 11 Aug 2011 13:51:33 +0200, Sanne Grinovero
> This also means we could change the sharding strategy interface to:
> a - deal with Entity instances instead of org.apache.lucene.Document
> and string-encoded ids
Are there not valid use-cases where I want to make the decision using the
Document? What if I have a custom class bridge which I apply on all entities
in order to add a field to the document ((partly) independent of the entity)
which is then used for sharding? I guess I could always write the same logic
in the ShardingStrategy!? Not sure.
I've thought about the same, and concluded that since the Document is
the output of some stateless transformation from the entity, I'm sure
that everything you could do in a custom bridge & sharding strategy
could be implemented in the sharding strategy alone, provided it gets
access to the entity. By definition the entity has more information,
not less, so it's definitely an added value.
Also, if you previously had to change both places in the code to achieve
your desired sharding strategy, now you only need to do it once, in
something properly named "ShardingStrategy", instead of hacking around
and possibly inserting extra unneeded tokens in the Document
for consumption by the ShardingStrategy.
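
To make the point above concrete, here is a rough sketch of what an
entity-based strategy could look like. Every name in it
(EntityShardingStrategy, shardForAddition, Customer) is hypothetical and
made up for illustration; it is not the existing Hibernate Search API,
only a guess at the shape such an interface might take.

```java
import java.util.Objects;

public class EntityShardingSketch {

    // Hypothetical contract: the strategy sees the entity itself,
    // not an org.apache.lucene.document.Document or a string-encoded id.
    interface EntityShardingStrategy<T> {
        int shardForAddition(T entity, int numberOfShards);
    }

    // Hypothetical indexed entity, used only for this example.
    static final class Customer {
        final long id;
        final String countryCode;

        Customer(long id, String countryCode) {
            this.id = id;
            this.countryCode = countryCode;
        }
    }

    // The routing decision reads the entity property directly, so no class
    // bridge has to copy it into an extra Document field first.
    static final class CountryShardingStrategy implements EntityShardingStrategy<Customer> {
        @Override
        public int shardForAddition(Customer customer, int numberOfShards) {
            // floorMod keeps the result in [0, numberOfShards) even for negative hashes
            return Math.floorMod(Objects.hashCode(customer.countryCode), numberOfShards);
        }
    }

    public static void main(String[] args) {
        EntityShardingStrategy<Customer> strategy = new CountryShardingStrategy();
        Customer first = new Customer(1L, "IT");
        Customer second = new Customer(2L, "IT");
        // Both customers share a country, so they are routed to the same shard.
        System.out.println(strategy.shardForAddition(first, 4)
                == strategy.shardForAddition(second, 4));
    }
}
```

The sharding rule lives in one place here; in the Document-based design the
same rule would be split between a bridge (writing the field) and the
strategy (reading it back).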
> This would affect configuration: instead of configuring them on
> index name, an annotation should be placed on the type (or an optional
> parameter for existing @Indexed).
Are changing the sharding configuration and the ShardingStrategy really
coupled? We could change the way shards are configured and still keep
passing the Document to the sharding strategy, maybe in combination with
the actual entity. Wouldn't that give the most flexibility?
As above, I don't think that will give you more functional options,
but maybe you're right that some operations might be easier: I'm
thinking of those cases in which the decision is related to the
string-encoded form of some complex type, but I don't see a
practical use for it, and nothing prevents me from re-encoding it as
needed (other than repeating the encoding).
Still, not passing the Document gives you more advantages:
1 - we often felt the need to use a different representation, but
can't experiment with it as this is part of the public API
2 - we would not be tied to building the Document before applying the
sharding logic: building the full Document often entails a lot of
work, not least loading extra entities for the sake of indexing
relations, and the strategy could decide to skip it. (I know I'm
trespassing in the area of another popular feature request which I
wouldn't implement this way, but take it as an example of the kind of
things we can do with this added flexibility!)
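
As a sketch of point 2 only: if the strategy is asked first, the expensive
Document construction can run afterwards, or not at all. All names below
(EntityShardingStrategy, selectShard, Indexer, Article) are hypothetical,
and a plain Map stands in for the Lucene Document so the example stays
self-contained; it only illustrates the ordering, not a proposed
implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

public class ShardBeforeDocumentSketch {

    // Hypothetical contract: an empty result means "do not index this entity".
    interface EntityShardingStrategy<T> {
        OptionalInt selectShard(T entity, int numberOfShards);
    }

    // Hypothetical entity, used only for this example.
    static final class Article {
        final long id;
        final boolean archived;

        Article(long id, boolean archived) {
            this.id = id;
            this.archived = archived;
        }
    }

    // Hypothetical indexer that consults the strategy BEFORE building anything.
    static final class Indexer {
        int documentsBuilt = 0;

        // Stand-in for the costly step (bridges, loading related entities, ...).
        private Map<String, String> buildDocument(Article article) {
            documentsBuilt++;
            Map<String, String> document = new HashMap<>();
            document.put("id", Long.toString(article.id));
            return document;
        }

        // Returns the chosen shard, or empty if the entity was skipped.
        OptionalInt index(Article article, EntityShardingStrategy<Article> strategy, int shards) {
            OptionalInt shard = strategy.selectShard(article, shards);
            if (shard.isPresent()) {
                // Only now pay for the Document; a real backend would receive it here.
                buildDocument(article);
            }
            return shard;
        }
    }

    public static void main(String[] args) {
        // Archived articles are skipped; the others are spread by id.
        EntityShardingStrategy<Article> strategy = (article, shards) ->
                article.archived
                        ? OptionalInt.empty()
                        : OptionalInt.of((int) Math.floorMod(article.id, (long) shards));

        Indexer indexer = new Indexer();
        indexer.index(new Article(5L, false), strategy, 4);
        indexer.index(new Article(6L, true), strategy, 4);
        System.out.println("documents built: " + indexer.documentsBuilt);
    }
}
```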
> On top of greater flexibility in sharding, this will also avoid
> exposing the o.a.l.Document in yet another API.
Is that such a bad thing?
Not extremely evil, but I hope the above answers this.