[hibernate-dev] Re: Hibernate Search: Master/Slave DirectoryProviders improvements proposal

Monday, 28 April 2008

Hey

On  Apr 26, 2008, at 09:03, Sanne Grinovero wrote:

...
 Hello most esteemed developers,
 I have been scrutinizing the FSMasterDirectoryProvider and
 FSSlaveDirectoryProvider
 classes in Hibernate Search and have found some minor glitches and  
 space
 for some improvement (IMHO); I would like to hear your opinion about  
 it,
 and if you agree I'll open a JIRA and provide a patch myself ASAP.

 A) Variable visibility
 both FSMasterD.P. and FSSlaveD.P. have some little concurrency  
 problems:
 there is a "volatile boolean inProgress" that fixes the  
 communication beetween
 the "CopyDirectory" Runnable and the "TriggerTask" Timer, but all  
 comunication
 beetween the client thread calling "getDirectory()" and the  
 CopyDirectory
 which defines the current directory is unprotected; also volatile  
 isn't enough
 to fix it, IMHO an AtomicReference would be more appropriate.
  The locking schema looks great! I've had a hard time understanding  
 the flow
 and did some tests about the behaviour of Lucene in case of concurrent
 index-deletion and forcing the most dangerous situations I could  
 think of,
 I couldn't find anything failing.
  Actually in the current situation the lock works ok but the wrong  
 Directory
 could be returned because of the visibility problems; the lock  
 avoids most
 problems but could hurt scalability (as we could return locked  
 resources even
 whenever an unlocked dir was available). 
It's primarily because I did not see that as a huge problem.
You're right, we should use volatile for current but I don't see the  
point of Atomic. Nobody but CopyDirectory update current. There is no  
need for the atomic semantic I think.
In case of Master, the copy is protected by the DP lock
In case of Slave, we have a read-only Directory, the following  
scenario can happen harmlessly

copy task run
grab a directory (#1)
copy task complete
switch to active #2
still using #1 for the reading task but it's ok

and an atomic property would not solve the problem.

...

 B) The finalize / stop issue
 Also I would like to address the use of finalize() to stop the Timer,
 there is a TODO about it already; IMHO the best solution is to remove
 the finalize without changing other code: the Timer is declared as
 daemon thread already, so it's going to stop automatically on JVM  
 exit. 
My point was to run cancel() but I guess one cannot have schedule non- 
running tasks in our scenario.
So +1

...

 The current finalize idea could harm the GC, as in the finalize we are
 referencing
 the timer and instruct the timer to do something: the timer has a  
 reference
 to <this> so this would revive the reference to <this> and cancel the
 garbage-collection;
 actually I don't think this finalizer is ever being called, as the  
 timer
 also has a reference to this (being an inner class) and is running  
 in another
 thread which is certainly alive (as our purpose was to
 stop it and we we still didn't get to that point).

 So we could fix this issues just removing code, and making some  
 instance
 variables (such as the Timer itself) unnecessary.
 Also this means you won't need a stop() to be added to
 DirectoryProvider interface,
 at least not for FSMasterD.P. and FSSlaveD.P. ( which are the only
 known implementations
 needing it, so we could also remove the TODO in the DirectoryProvider.

 C) FileHelper not listening to Thread.interrupt() requests
 As FileHelper.synchronize is used in the mentioned Timers to copy
 potentially big files,
 it would be nice to check for Thread.isInterrupted() and also catch
 ClosedByInterruptException in file-copy operations so to abort all I/O
 operations and
 not just the current one to abort faster on JVM shutdown.
 (It doesn't look like the copy safety was guaranteed anyway). 
+1

...

 D) Master and Slave share configuration properties.
 Wouldn't it be natural to have two different property keys to manage
 the shared index path? I imagine the typical scenario would be to  
 have the two
 configured on different servers and copy through some NFS, but it  
 looks like
 we are forcing people to use exactly the same path on both machines  
 as the two
 classes read the same props. 
If you are running both on the same EAR ie the same vm, we need a much  
simpler architecture anyway.
If you are not, then you need a particular deployment for your MDB so  
the properties will change.

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

[hibernate-dev] Re: Hibernate Search: Master/Slave DirectoryProviders improvements proposal