[hibernate-dev] Search: replacing lucene backend

Sanne Grinovero sanne.grinovero at gmail.com
Fri Oct 3 07:45:47 EDT 2008


Hi, I'm ready to commit the attached patch;
(Emmanuel and Hardy Please ignore the I sent last week)

This changes the Workspace pretty much
and overwrites the lucene backend (package
org.hibernate.search.backend.impl.lucene)
with a new implementation to use the new Workspace.
so I don't think this can be split more, as it all depends on the Workspace.

Unrelated changes are already on trunk:
http://fisheye.jboss.com/changelog/Hibernate/search?cs=15224

Specifically, attached patch solves:
HSEARCH-272   	 Improve contention on DirectoryProviders in lucene backend
HSEARCH-263   	 Wrong analyzers used in IndexWriter
HSEARCH-271   	 Wrong Similarity used when sharing index among entities
and has some more benefits we can chat about if you like to.

There's one last change I'd like to do: changing the name from Workspace
to DirectoryWorkspace; I didn't because it would affect the OptimizerStrategy
interface but actually I just discovered this interface is not really public,
as it appears people can't provide their own strategy.

Should I do that?

Does someone have pending changes on org.hibernate.search.backend.impl.lucene ?

regards,
Sanne

2008/9/27 Sanne Grinovero <sanne.grinovero at gmail.com>:
> Hello,
> I managed to remove all dependencies from org.hibernate.search.backend
> to any org.hibernate.search.backend.impl in the refactoring.
>
> please let me know if you agree about committing this;
> The pattern I used is very extensible and really decouples the code,
> what I don't like from it is that it is "too clever": it's clean
> and enables very nice compile-time checks but is a bit unusual.
> Let me know if you prefer me to avoid this code style.
>
> Hope this is ok, as I would like to move on to the nice improvements
> which depend on this.
>
> Sanne
>
> 2008/9/23 Sanne Grinovero <sanne.grinovero at gmail.com>:
>> thanks, answers inline:
>>
>> 2008/9/23 Emmanuel Bernard <emmanuel at hibernate.org>:
>>> It seems fine overall.
>>> Unfortunately, your patch does not apply on trunk, so I read it through the
>>> diff file.
>> I'm sorry I shouldn't trust the eclipse plugin, we've had problems
>> before with it.
>>
>>>
>>> A couple of questions / remark:
>>>  - a Workspace is now limited to a single DirectoryProvider, correct?
>> exactly
>>
>>>  - why are getIndexReader / getIndexWriter and co synchronized? Is the
>>> workspace supposed to be used by different threads? It seems it should not.
>> Current backend doesn't need it, but this is the core for many optimizations.
>> This is needed for the improvements to the backend I'm working on, in
>> particular mass indexing.
>> I think it isn't hurting currently,
>> I just hope that when I'll show you the alternative backend you'll be pleased;
>> if not it's easy to just remove the synchronization and volatile.
>> Please remember that sharing IndexWriter between committing transactions
>> can have a huge scalability impact..
>>
>>>  - LuceneWork (in o.h.s.backend) have a dependency on IndexInteractionType
>>> (in o.h.s.backend.impl.lucene): this seems wrong.
>>> In a way, a LuceneWork is kind of abstract from an actual Lucene work. It is
>>> propagated to all backend implementations. For example an Optimize is not
>>> propagated to JMS. So having implementation details like
>>> IndexInteractionType in LuceneWork feels bad to me.
>> Ok, I'll separate it using "visitor" pattern; expect a new patch soon.
>>
>>>
>>> The rest looks fine.
>>>
>>> --
>>> Emmanuel Bernard
>>> http://in.relation.to/Bloggers/Emmanuel | http://blog.emmanuelbernard.com |
>>> http://twitter.com/emmanuelbernard
>>> Hibernate Search in Action (http://is.gd/Dl1)
>>>
>>> On  Sep 22, 2008, at 04:38, Sanne Grinovero wrote:
>>>
>>>> Hello,
>>>> I'm hoping you have some time to take a look on this; I think I should
>>>> commit it
>>>> but would like to hear your opinion before doing some disaster;-)
>>>> I have more features built on this "base mod".
>>>> Should I split it more? I think it's possible but it's getting hard
>>>> (and a wast of time
>>>> if you don't feel it's needed).
>>>>
>>>> Basically this does fix only HSEARCH-263 but is the base for many more
>>>> improvements.
>>>> Another change this does: a single thread never owns more than one D.P
>>>> Lock.
>>>> There's also the possibility to purgeAll() using an indexwriter, I
>>>> left out other improvements
>>>> for a later step but kept this one to "showcase" the design, meant to
>>>> enable a choice
>>>> for each work to be applied on an IndexWriter or an IndexReader
>>>> instead (when possible).
>>>> Also this affects optimization: it is only performed as last operation
>>>> (before it could be triggered
>>>> in any point) and multiple optimizations will trigger just once.
>>>>
>>>> You'll notice the Workspace being converted to a single-directory
>>>> workspace, (much simplification)
>>>> and becoming threadsafe for later improvements.
>>>>
>>>> if you like it I'll commit to trunk, so I can begin fixing actual open
>>>> issues using this as a starting point.
>>>>
>>>> regards,
>>>> Sanne
>>>>
>>>> P.S. thanks hardy for the patch you attached to 225, this is exactly
>>>> what I needed to know, I'll integrate it
>>>> in a next patch.
>>>> <Search-refactoring-step1.patch>
>>>
>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Search-refactoring.patch
Type: text/x-patch
Size: 62460 bytes
Desc: not available
Url : http://lists.jboss.org/pipermail/hibernate-dev/attachments/20081003/d84be6bd/attachment.bin 


More information about the hibernate-dev mailing list