[hibernate-dev] Re: Hibernate Search: Master/Slave DirectoryProviders improvements proposal

Emmanuel Bernard emmanuel at hibernate.org
Mon Apr 28 14:16:17 EDT 2008


Hey

On  Apr 26, 2008, at 09:03, Sanne Grinovero wrote:

> Hello most esteemed developers,
> I have been scrutinizing the FSMasterDirectoryProvider and
> FSSlaveDirectoryProvider
> classes in Hibernate Search and have found some minor glitches and  
> space
> for some improvement (IMHO); I would like to hear your opinion about  
> it,
> and if you agree I'll open a JIRA and provide a patch myself ASAP.
>
> A) Variable visibility
> both FSMasterD.P. and FSSlaveD.P. have some little concurrency  
> problems:
> there is a "volatile boolean inProgress" that fixes the  
> communication beetween
> the "CopyDirectory" Runnable and the "TriggerTask" Timer, but all  
> comunication
> beetween the client thread calling "getDirectory()" and the  
> CopyDirectory
> which defines the current directory is unprotected; also volatile  
> isn't enough
> to fix it, IMHO an AtomicReference would be more appropriate.
>  The locking schema looks great! I've had a hard time understanding  
> the flow
> and did some tests about the behaviour of Lucene in case of concurrent
> index-deletion and forcing the most dangerous situations I could  
> think of,
> I couldn't find anything failing.
>  Actually in the current situation the lock works ok but the wrong  
> Directory
> could be returned because of the visibility problems; the lock  
> avoids most
> problems but could hurt scalability (as we could return locked  
> resources even
> whenever an unlocked dir was available).

It's primarily because I did not see that as a huge problem.
You're right, we should use volatile for current but I don't see the  
point of Atomic. Nobody but CopyDirectory update current. There is no  
need for the atomic semantic I think.
In case of Master, the copy is protected by the DP lock
In case of Slave, we have a read-only Directory, the following  
scenario can happen harmlessly

copy task run
grab a directory (#1)
copy task complete
switch to active #2
still using #1 for the reading task but it's ok

and an atomic property would not solve the problem.


>
>
> B) The finalize / stop issue
> Also I would like to address the use of finalize() to stop the Timer,
> there is a TODO about it already; IMHO the best solution is to remove
> the finalize without changing other code: the Timer is declared as
> daemon thread already, so it's going to stop automatically on JVM  
> exit.

My point was to run cancel() but I guess one cannot have schedule non- 
running tasks in our scenario.
So +1

>
> The current finalize idea could harm the GC, as in the finalize we are
> referencing
> the timer and instruct the timer to do something: the timer has a  
> reference
> to <this> so this would revive the reference to <this> and cancel the
> garbage-collection;
> actually I don't think this finalizer is ever being called, as the  
> timer
> also has a reference to this (being an inner class) and is running  
> in another
> thread which is certainly alive (as our purpose was to
> stop it and we we still didn't get to that point).
>
> So we could fix this issues just removing code, and making some  
> instance
> variables (such as the Timer itself) unnecessary.
> Also this means you won't need a stop() to be added to
> DirectoryProvider interface,
> at least not for FSMasterD.P. and FSSlaveD.P. ( which are the only
> known implementations
> needing it, so we could also remove the TODO in the DirectoryProvider.
>
> C) FileHelper not listening to Thread.interrupt() requests
> As FileHelper.synchronize is used in the mentioned Timers to copy
> potentially big files,
> it would be nice to check for Thread.isInterrupted() and also catch
> ClosedByInterruptException in file-copy operations so to abort all I/O
> operations and
> not just the current one to abort faster on JVM shutdown.
> (It doesn't look like the copy safety was guaranteed anyway).

+1

>
>
> D) Master and Slave share configuration properties.
> Wouldn't it be natural to have two different property keys to manage
> the shared index path? I imagine the typical scenario would be to  
> have the two
> configured on different servers and copy through some NFS, but it  
> looks like
> we are forcing people to use exactly the same path on both machines  
> as the two
> classes read the same props.

If you are running both on the same EAR ie the same vm, we need a much  
simpler architecture anyway.
If you are not, then you need a particular deployment for your MDB so  
the properties will change.




More information about the hibernate-dev mailing list