Hi Sanne
On May 28, 2008, at 14:01, Sanne Grinovero wrote:
Hi Emmanuel,
(looking around the sources again) I am making several changes,
but I am feeling the urgency to ask your opinion about
"commit conventions"
1)is it ok in case of a "code revision" to have a commit of >100 files
for very trivial changes ( wrong formatting, spelling typos in
comments,
unused import removals);
do I have to warn you before commits of this type?
should I leave them all as-is?
If you do this massive work, you need to warn people in advance and do
it in a day. It essentially means people will have hard time to sync
up their current work while you do that.
If you cna't then, do it by small batches.
2)There are some "//TODO serialization" on classes that could
implement
Serializable trivially, is it ok if I just add them ?
(I'll don't do it for cases I have doubts)
I guess if I left a todo it was not so trivial at first sight and
would probably involve some transient readObject / writeObject work or
does not bring immediate usefulness. So be careful.
3)//TODO set a serial number
May I auto-generate a serialVersionUID for classes were missing,
or do you have some special policy about them I should be aware of?
I never understood that part of Java, you are free to go :)
4)final fields:
some constructor-initialized fields may need the final keyword,
especially
wen producing thread-safe transfer objects; I've added several
already,
I'm wondering now if it's ok to have this changes committed together
with 1) or if you would have preferred separated commits.
Also using "final" really helps me in better understanding the
overall code.
Separate is better, especially down the road for product maintenance.
5)static logger:
some classes (e.g. org.hibernate.search.engine.QueryLoader) "forget"
to declare the Logger as static, if this is intentional I would like
to know
more about it. May I fix them all if I find more?
Is it because of hot-redeploy? In this case I would need to remove
some statics;-)
http://www.slf4j.org/faq.html#declared_static
But for classes used a lot, I tend to keep the static modifier.
6)java.util.collections
there's a nice
List EMPTY_LIST = new ArrayList( 0 );
in QueryLoader; I would like to replace it with
java.util.Collections.EMPTY_LIST
but this has another effect: UnsupportedOperationException if someone
tries to add
elements. I think this is a good thing, but I could be wrong.
Also currently someone could reread data in other queries if adding
elements to
the ArrayList.
This exists only since java5, so it's not a good trick for core.
+1
7) the org.hibernate.search.Version class:
what's the idea for the new Date() ? to append the
execution time to version number?
May I move the getLogger in the static {} code block and release the
instance?
It's just to differentiate two snapshots and make sure they are
different from the final version. You can change the logger code if
you want
==
All above questions are about code I've already changed, just need
your
opinion before commit.
Now for suggested improvements:
A)defensive copy on public API (FindBugs):
FindBugs is complaining about
"May expose internal representation by incorporating reference to
mutable object"
in serveral places.
I suppose we can trust our own code, but for example in
FullTextQuery setProjection(String... fields)
it would be wise to make an own private copy of the data.
Too much paranoia? Also it would have some performance cost.
What do you want to copy? Projection is essentially a copy already
(unless you reference THIS but in this case you want the mutable
object). Otherwise, the method parameters are immutable, so this is
not a problem.
B) there's a STATIC field in SharedReaderProvider:
"static Field subReadersField"
it is looking quite suspect, is that really meant to be static?
Yes it's on purpose, it's a pointer to a Field object, no need to pay
the lookup more than once.
Also the ReaderData class contained in SharedReaderProvider
isn't looking as threadsafe; I could be wrong I only read
this class once.
I volunteer to examinate it better, also because of HSEARCH-194.
I 've started providing info on HSEARCH-194 but there is a lot of work
to do
AFAIK ReaderData is protected by locks it does not need to be thread-
safe
C)Workspace contains many maps to needed objects,
all having the same key "DirectoryProvider"; also locks are
mantained in an array; Woudn't it be easier to have a single map,
returning a container object with all needed stuff;
This object could be passed to all other methods instead of the
dir.Provider and avoid many more gets on the maps.
also some locking code could go "local" in this container
class, I've some difficulty in tracking the lock sequences.
Go ahead (open a JIRA issue), make sure it does no slow things down -
it shouldn't from what I've seen)
D)There's a TODO somewere about making a "readonly
IndexReader"; I guess to return to users for safety
(there is a comment in the docs:
"This indexReader must not be used for modification operations
(especially delete)")
I have implemented such an IndexReader, but for visibility problems
it needs to have package "org.apache.lucene.index" ;
what do you think? Should I kindly ask to lucene developers
to add it to their package?
It's a very simple wrapper around another reader, throwing
"unsupported"
on methods to be avoided.
Unfortunately the Lucene design lack Interfaces :)
I decided not to do the wrapper because I wanted to maximize
compatibility between different Lucene versions. But we can always
reconsider, this is not a top prioprity
Sanne
Thanks!