[hibernate-dev] Re: Hibernate Search: Master/Slave DirectoryProviders improvements proposal
Emmanuel Bernard
emmanuel at hibernate.org
Mon Apr 28 14:16:17 EDT 2008
Hey
On Apr 26, 2008, at 09:03, Sanne Grinovero wrote:
> Hello most esteemed developers,
> I have been scrutinizing the FSMasterDirectoryProvider and
> FSSlaveDirectoryProvider
> classes in Hibernate Search and have found some minor glitches and
> space
> for some improvement (IMHO); I would like to hear your opinion about
> it,
> and if you agree I'll open a JIRA and provide a patch myself ASAP.
>
> A) Variable visibility
> both FSMasterD.P. and FSSlaveD.P. have some little concurrency
> problems:
> there is a "volatile boolean inProgress" that fixes the
> communication beetween
> the "CopyDirectory" Runnable and the "TriggerTask" Timer, but all
> comunication
> beetween the client thread calling "getDirectory()" and the
> CopyDirectory
> which defines the current directory is unprotected; also volatile
> isn't enough
> to fix it, IMHO an AtomicReference would be more appropriate.
> The locking schema looks great! I've had a hard time understanding
> the flow
> and did some tests about the behaviour of Lucene in case of concurrent
> index-deletion and forcing the most dangerous situations I could
> think of,
> I couldn't find anything failing.
> Actually in the current situation the lock works ok but the wrong
> Directory
> could be returned because of the visibility problems; the lock
> avoids most
> problems but could hurt scalability (as we could return locked
> resources even
> whenever an unlocked dir was available).
It's primarily because I did not see that as a huge problem.
You're right, we should use volatile for current but I don't see the
point of Atomic. Nobody but CopyDirectory update current. There is no
need for the atomic semantic I think.
In case of Master, the copy is protected by the DP lock
In case of Slave, we have a read-only Directory, the following
scenario can happen harmlessly
copy task run
grab a directory (#1)
copy task complete
switch to active #2
still using #1 for the reading task but it's ok
and an atomic property would not solve the problem.
>
>
> B) The finalize / stop issue
> Also I would like to address the use of finalize() to stop the Timer,
> there is a TODO about it already; IMHO the best solution is to remove
> the finalize without changing other code: the Timer is declared as
> daemon thread already, so it's going to stop automatically on JVM
> exit.
My point was to run cancel() but I guess one cannot have schedule non-
running tasks in our scenario.
So +1
>
> The current finalize idea could harm the GC, as in the finalize we are
> referencing
> the timer and instruct the timer to do something: the timer has a
> reference
> to <this> so this would revive the reference to <this> and cancel the
> garbage-collection;
> actually I don't think this finalizer is ever being called, as the
> timer
> also has a reference to this (being an inner class) and is running
> in another
> thread which is certainly alive (as our purpose was to
> stop it and we we still didn't get to that point).
>
> So we could fix this issues just removing code, and making some
> instance
> variables (such as the Timer itself) unnecessary.
> Also this means you won't need a stop() to be added to
> DirectoryProvider interface,
> at least not for FSMasterD.P. and FSSlaveD.P. ( which are the only
> known implementations
> needing it, so we could also remove the TODO in the DirectoryProvider.
>
> C) FileHelper not listening to Thread.interrupt() requests
> As FileHelper.synchronize is used in the mentioned Timers to copy
> potentially big files,
> it would be nice to check for Thread.isInterrupted() and also catch
> ClosedByInterruptException in file-copy operations so to abort all I/O
> operations and
> not just the current one to abort faster on JVM shutdown.
> (It doesn't look like the copy safety was guaranteed anyway).
+1
>
>
> D) Master and Slave share configuration properties.
> Wouldn't it be natural to have two different property keys to manage
> the shared index path? I imagine the typical scenario would be to
> have the two
> configured on different servers and copy through some NFS, but it
> looks like
> we are forcing people to use exactly the same path on both machines
> as the two
> classes read the same props.
If you are running both on the same EAR ie the same vm, we need a much
simpler architecture anyway.
If you are not, then you need a particular deployment for your MDB so
the properties will change.
More information about the hibernate-dev
mailing list