Hi, I'm ready to commit the attached patch;
(Emmanuel and Hardy Please ignore the I sent last week)
This changes the Workspace pretty much
and overwrites the lucene backend (package
org.hibernate.search.backend.impl.lucene)
with a new implementation to use the new Workspace.
so I don't think this can be split more, as it all depends on the Workspace.
Unrelated changes are already on trunk:
Specifically, attached patch solves:
HSEARCH-272 Improve contention on DirectoryProviders in lucene backend
HSEARCH-263 Wrong analyzers used in IndexWriter
HSEARCH-271 Wrong Similarity used when sharing index among entities
and has some more benefits we can chat about if you like to.
There's one last change I'd like to do: changing the name from Workspace
to DirectoryWorkspace; I didn't because it would affect the OptimizerStrategy
interface but actually I just discovered this interface is not really public,
as it appears people can't provide their own strategy.
Should I do that?
Does someone have pending changes on org.hibernate.search.backend.impl.lucene ?
regards,
Sanne
2008/9/27 Sanne Grinovero <sanne.grinovero(a)gmail.com>:
Hello,
I managed to remove all dependencies from org.hibernate.search.backend
to any org.hibernate.search.backend.impl in the refactoring.
please let me know if you agree about committing this;
The pattern I used is very extensible and really decouples the code,
what I don't like from it is that it is "too clever": it's clean
and enables very nice compile-time checks but is a bit unusual.
Let me know if you prefer me to avoid this code style.
Hope this is ok, as I would like to move on to the nice improvements
which depend on this.
Sanne
2008/9/23 Sanne Grinovero <sanne.grinovero(a)gmail.com>:
> thanks, answers inline:
>
> 2008/9/23 Emmanuel Bernard <emmanuel(a)hibernate.org>:
>> It seems fine overall.
>> Unfortunately, your patch does not apply on trunk, so I read it through the
>> diff file.
> I'm sorry I shouldn't trust the eclipse plugin, we've had problems
> before with it.
>
>>
>> A couple of questions / remark:
>> - a Workspace is now limited to a single DirectoryProvider, correct?
> exactly
>
>> - why are getIndexReader / getIndexWriter and co synchronized? Is the
>> workspace supposed to be used by different threads? It seems it should not.
> Current backend doesn't need it, but this is the core for many optimizations.
> This is needed for the improvements to the backend I'm working on, in
> particular mass indexing.
> I think it isn't hurting currently,
> I just hope that when I'll show you the alternative backend you'll be
pleased;
> if not it's easy to just remove the synchronization and volatile.
> Please remember that sharing IndexWriter between committing transactions
> can have a huge scalability impact..
>
>> - LuceneWork (in o.h.s.backend) have a dependency on IndexInteractionType
>> (in o.h.s.backend.impl.lucene): this seems wrong.
>> In a way, a LuceneWork is kind of abstract from an actual Lucene work. It is
>> propagated to all backend implementations. For example an Optimize is not
>> propagated to JMS. So having implementation details like
>> IndexInteractionType in LuceneWork feels bad to me.
> Ok, I'll separate it using "visitor" pattern; expect a new patch soon.
>
>>
>> The rest looks fine.
>>
>> --
>> Emmanuel Bernard
>>
http://in.relation.to/Bloggers/Emmanuel |
http://blog.emmanuelbernard.com |
>>
http://twitter.com/emmanuelbernard
>> Hibernate Search in Action (
http://is.gd/Dl1)
>>
>> On Sep 22, 2008, at 04:38, Sanne Grinovero wrote:
>>
>>> Hello,
>>> I'm hoping you have some time to take a look on this; I think I should
>>> commit it
>>> but would like to hear your opinion before doing some disaster;-)
>>> I have more features built on this "base mod".
>>> Should I split it more? I think it's possible but it's getting hard
>>> (and a wast of time
>>> if you don't feel it's needed).
>>>
>>> Basically this does fix only HSEARCH-263 but is the base for many more
>>> improvements.
>>> Another change this does: a single thread never owns more than one D.P
>>> Lock.
>>> There's also the possibility to purgeAll() using an indexwriter, I
>>> left out other improvements
>>> for a later step but kept this one to "showcase" the design, meant
to
>>> enable a choice
>>> for each work to be applied on an IndexWriter or an IndexReader
>>> instead (when possible).
>>> Also this affects optimization: it is only performed as last operation
>>> (before it could be triggered
>>> in any point) and multiple optimizations will trigger just once.
>>>
>>> You'll notice the Workspace being converted to a single-directory
>>> workspace, (much simplification)
>>> and becoming threadsafe for later improvements.
>>>
>>> if you like it I'll commit to trunk, so I can begin fixing actual open
>>> issues using this as a starting point.
>>>
>>> regards,
>>> Sanne
>>>
>>> P.S. thanks hardy for the patch you attached to 225, this is exactly
>>> what I needed to know, I'll integrate it
>>> in a next patch.
>>> <Search-refactoring-step1.patch>
>>
>>
>