Hey
On Apr 26, 2008, at 09:03, Sanne Grinovero wrote:
Hello most esteemed developers,
I have been scrutinizing the FSMasterDirectoryProvider and
FSSlaveDirectoryProvider
classes in Hibernate Search and have found some minor glitches and
space
for some improvement (IMHO); I would like to hear your opinion about
it,
and if you agree I'll open a JIRA and provide a patch myself ASAP.
A) Variable visibility
both FSMasterD.P. and FSSlaveD.P. have some little concurrency
problems:
there is a "volatile boolean inProgress" that fixes the
communication beetween
the "CopyDirectory" Runnable and the "TriggerTask" Timer, but all
comunication
beetween the client thread calling "getDirectory()" and the
CopyDirectory
which defines the current directory is unprotected; also volatile
isn't enough
to fix it, IMHO an AtomicReference would be more appropriate.
The locking schema looks great! I've had a hard time understanding
the flow
and did some tests about the behaviour of Lucene in case of concurrent
index-deletion and forcing the most dangerous situations I could
think of,
I couldn't find anything failing.
Actually in the current situation the lock works ok but the wrong
Directory
could be returned because of the visibility problems; the lock
avoids most
problems but could hurt scalability (as we could return locked
resources even
whenever an unlocked dir was available).
It's primarily because I did not see that as a huge problem.
You're right, we should use volatile for current but I don't see the
point of Atomic. Nobody but CopyDirectory update current. There is no
need for the atomic semantic I think.
In case of Master, the copy is protected by the DP lock
In case of Slave, we have a read-only Directory, the following
scenario can happen harmlessly
copy task run
grab a directory (#1)
copy task complete
switch to active #2
still using #1 for the reading task but it's ok
and an atomic property would not solve the problem.
B) The finalize / stop issue
Also I would like to address the use of finalize() to stop the Timer,
there is a TODO about it already; IMHO the best solution is to remove
the finalize without changing other code: the Timer is declared as
daemon thread already, so it's going to stop automatically on JVM
exit.
My point was to run cancel() but I guess one cannot have schedule non-
running tasks in our scenario.
So +1
The current finalize idea could harm the GC, as in the finalize we are
referencing
the timer and instruct the timer to do something: the timer has a
reference
to <this> so this would revive the reference to <this> and cancel the
garbage-collection;
actually I don't think this finalizer is ever being called, as the
timer
also has a reference to this (being an inner class) and is running
in another
thread which is certainly alive (as our purpose was to
stop it and we we still didn't get to that point).
So we could fix this issues just removing code, and making some
instance
variables (such as the Timer itself) unnecessary.
Also this means you won't need a stop() to be added to
DirectoryProvider interface,
at least not for FSMasterD.P. and FSSlaveD.P. ( which are the only
known implementations
needing it, so we could also remove the TODO in the DirectoryProvider.
C) FileHelper not listening to Thread.interrupt() requests
As FileHelper.synchronize is used in the mentioned Timers to copy
potentially big files,
it would be nice to check for Thread.isInterrupted() and also catch
ClosedByInterruptException in file-copy operations so to abort all I/O
operations and
not just the current one to abort faster on JVM shutdown.
(It doesn't look like the copy safety was guaranteed anyway).
+1
D) Master and Slave share configuration properties.
Wouldn't it be natural to have two different property keys to manage
the shared index path? I imagine the typical scenario would be to
have the two
configured on different servers and copy through some NFS, but it
looks like
we are forcing people to use exactly the same path on both machines
as the two
classes read the same props.
If you are running both on the same EAR ie the same vm, we need a much
simpler architecture anyway.
If you are not, then you need a particular deployment for your MDB so
the properties will change.