[hibernate-dev] difficulties in upgrading to Lucene 3

Sanne Grinovero sanne.grinovero at gmail.com
Mon Jan 11 12:16:22 EST 2010


Nice, done: depending on Lucene 2.9.1 right now.

Sanne

2010/1/11 Emmanuel Bernard <emmanuel at hibernate.org>:
> Right, go forward and temporarily break COMPRESS.
> Option B (i.e. use CompressionTools) seems like a better option than getting rid of it altogether.
> I think we should offer a pluggable compression method, but I would not mind it being global and thus not requiring a change to the enum.
>
> On 11 janv. 2010, at 00:39, Sanne Grinovero wrote:
>
>> Hello again,
>> it got better: I have a simple patch ready to migrate to 2.9.1, all tests green;
>> it was easier than I initially thought to fix the new DocIdSet.
>> Shall I apply it?
>> We can then think about the compression issue and after that move to 3.0.x.
>>
>> I've moved HSEARCH-424 "Update to Lucene 3.0" from beta2 to beta3,
>> as beta3 already contains issues like HSEARCH-415 "Consider moving
>> to Lucene 2.9"
>> (it didn't make sense the other way around).
>>
>> Cheers,
>> Sanne
>>
>> 2010/1/10 Sanne Grinovero <sanne.grinovero at gmail.com>:
>>> Hello all,
>>> I've been thinking about the strategy to upgrade to Lucene 3;
>>> Ignoring new features at the moment, the main issues in migration:
>>>
>>> 1- Store.COMPRESS not supported anymore
>>> 2- Some Analyzers and the QueryParser require an additional
>>> constructor parameter
>>> 3- DocIdSet interface (used in filters) changed - changed even in
>>> Lucene 2.9.x, making step-by-step migration harder
>>>
>>> While point 2 is not a great problem (I have a patch ready),
>>> points 1 and 3 are connected: DocIdSet must be solved as soon as we
>>> move to 2.9, while 2.9 is a requirement to implement COMPRESS in a
>>> different way if we choose to:
>>> It appears we can't maintain binary index compatibility, but
>>> supporting the feature is an option.
>>> Lucene 3 will transparently decompress an old-style compressed field
>>> when reading it and it will even decompress all fields during
>>> optimization, effectively transforming the index to the new format.
>>>
>>> If we still want to support the contract of
>>> org.hibernate.search.annotations.Store.COMPRESS we will have to
>>> compress the field ourselves, possibly using a pluggable strategy;
>>> assuming the use of org.apache.lucene.document.CompressionTools as
>>> default implementation we can provide a backwards-compatible API, but
>>> the resulting index is going to have a different format.
>>>
>>> A future improvement could be to use any external
>>> compression/decompression function (user provided implementation), any
>>> idea where? Maybe replace the Store enum with an interface?
>>>
>>> What should I do to solve HSEARCH-425?
>>> The options I've considered so far:
>>>
>>> A) Deprecate the Store.COMPRESS, without providing an alternative
>>>
>>> B) Change implementation to make use of Lucene's CompressionTools
>>>
>>> CompressionTools has only existed since Lucene 2.9, so an upgrade is
>>> mandatory, but other features are going to break, like filters
>>> (org.hibernate.search.filter.AndDocIdSet needs to implement an updated
>>> interface).
>>> So basically I'll need a branch, break some tests temporarily, or
>>> provide a single huge patch refactoring some features and tests at
>>> the same time :-/
>>> An alternative to branching would be to solve the Compress issue
>>> later, and focus on the build breaking changes first; in practice this
>>> would break the compression feature until it's fixed, but this
>>> shouldn't be a great problem as it is going to change anyway...
>>>
>>> WDYT?
>>>
>>> I'm working on the new DocIdSet, even that will be a considerable change.
>>>
>>> Cheers,
>>> Sanne
>>>
>> _______________________________________________
>> hibernate-dev mailing list
>> hibernate-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>
>
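
Option B from the quoted thread (compress the stored field ourselves, behind a pluggable and possibly global strategy) might be sketched roughly as below. This is only an illustration under assumptions: FieldCompressor and DeflateCompressor are hypothetical names, not existing Hibernate Search or Lucene types, and java.util.zip is used as a stand-in so the sketch runs without Lucene on the classpath. A default implementation as discussed in the thread would presumably delegate to org.apache.lucene.document.CompressionTools instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class CompressionSketch {

    /** Hypothetical plug-in point: one global strategy, no change to the Store enum. */
    public interface FieldCompressor {
        byte[] compress(byte[] raw);
        byte[] decompress(byte[] stored);
    }

    /** Stand-in default based on java.util.zip (assumption, not the project's code). */
    public static final class DeflateCompressor implements FieldCompressor {

        @Override
        public byte[] compress(byte[] raw) {
            Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
            try {
                deflater.setInput(raw);
                deflater.finish();
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[1024];
                while (!deflater.finished()) {
                    int written = deflater.deflate(buffer);
                    out.write(buffer, 0, written);
                }
                return out.toByteArray();
            } finally {
                deflater.end(); // release native zlib resources
            }
        }

        @Override
        public byte[] decompress(byte[] stored) {
            Inflater inflater = new Inflater();
            try {
                inflater.setInput(stored);
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[1024];
                while (!inflater.finished()) {
                    int read = inflater.inflate(buffer);
                    // No progress and nothing left to read: the input is incomplete.
                    if (read == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                        throw new IllegalArgumentException("truncated or unsupported compressed data");
                    }
                    out.write(buffer, 0, read);
                }
                return out.toByteArray();
            } catch (DataFormatException e) {
                throw new IllegalArgumentException("not valid compressed data", e);
            } finally {
                inflater.end();
            }
        }
    }

    public static void main(String[] args) {
        FieldCompressor compressor = new DeflateCompressor();
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 50; i++) {
            text.append("some stored field value ");
        }
        byte[] raw = text.toString().getBytes(StandardCharsets.UTF_8);
        byte[] stored = compressor.compress(raw);       // what would go into the index
        byte[] restored = compressor.decompress(stored); // what a reader would get back
        System.out.println(raw.length + " bytes -> " + stored.length + " bytes, restored "
                + restored.length + " bytes");
    }
}
```

With a strategy like this the compressed bytes would be stored as a plain binary field, which is why the resulting index format differs from the old Store.COMPRESS one even though the annotation contract stays the same.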
