[infinispan-dev] Feedback on Infinispan patch

Jason T. Greene jason.greene at redhat.com
Thu Sep 24 17:17:09 EDT 2009


Jeff is correct; the FSF has even gone out of its way to address the 
ASF's stretched concern:

http://www.fsf.org/licensing/licenses/lgpl-java.html

While the ASF insists the legal issue is still legitimate, it certainly 
smells more like a political decision to push adoption of the ASL.

Jeff Ramsdale wrote:
> Untrue. Apache-licensed projects are certainly able to include LGPL
> components. However, the Apache FOUNDATION does not allow LGPL
> components in its own projects. This is not a limitation of the Apache
> license, this is a matter of Foundation policy. As Compass is not a
> product of the Apache Foundation it shouldn't be an issue (unless, of
> course, they have their own policy against including LGPL components).
> 
> -jeff
> 
> On Thu, Sep 24, 2009 at 1:24 PM, Sanne Grinovero
> <sanne.grinovero at gmail.com> wrote:
>> Compass is Apache-licensed, as Lucene is.
>> Some Apache members would love to have this inside Lucene, but it
>> appears to be a no-go because of the dependency on Infinispan
>> (incompatible license); so either Compass can't have it either or it
>> could be included in Lucene.
>>
>> There wouldn't be any problems if the code could depend on an
>> Apache-licensed API... if it's not out of the question here to split
>> into API/Impl.
>>
>> Sanne
>>
>> 2009/9/24 Manik Surtani <manik at jboss.org>:
>>> On 24 Sep 2009, at 17:36, Jeff Ramsdale wrote:
>>>
>>>> Another alternative would be to see if the Compass Project would be
>>>> interested in hosting it: http://www.compass-project.org/
>>> Good idea.  Anybody have traction with the compass folk to propose this?
>>>
>>>> Even if Infinispan ends up hosting it there might be value in doing
>>>> some cross-pollination with the Compass folks since this aligns
>>>> directly with what they are working on.
>>>>
>>>> -jeff
>>>>
>>>> 2009/9/24 Manik Surtani <manik at jboss.org>:
>>>>> Slightly off topic, but rather than working with patches, do we want
>>>>> this Directory impl in source control somewhere?
>>>>> Being dependent on LGPL, it won't be accepted into Lucene's contribs.
>>>>> If it doesn't depend on any Hibernate Search code, I could host it in
>>>>> Infinispan's SVN repo...
>>>>>
>>>>> On 23 Sep 2009, at 13:58, Łukasz Moreń wrote:
>>>>>
>>>>> I agree that the Infinispan case is not much different from
>>>>> RAMDirectory. The major difference is that in RD (and also
>>>>> FileDirectory) changes are not batched like in ID. If I do not wrap
>>>>> changes in InfinispanDirectory (simply remove tx.begin() from the
>>>>> obtain() method and tx.commit() from release() in InfinispanLock),
>>>>> and immediately commit every change made by IW, it works well.
>>>>> However, it makes indexing much slower because of frequent
>>>>> replication to other nodes.
>>>>> Sanne, it's a good remark that IW commit is a kind of flush.
>>>>>
>>>>> I've attached a patch with InfinispanDirectory; the failing test is
>>>>> testDirectoryWithMultipleThreads in the InfinispanDirectoryTest class.
>>>>> It fails randomly. I think the problem is that the Infinispan commit
>>>>> on lockRelease() in org.apache.lucene.index.IndexWriter (line 1658)
>>>>> comes after the IW commit() (line 1654).
>>>>>
>>>>>> Is it because the IndexWriter only cleans files if no IndexReaders
>>>>>> are reading them (how would that be detected)?
>>>>> It can happen if the IndexWriter cleans a file and an IndexReader
>>>>> then tries to access that cleaned file.
>>>>>
>>>>> 2009/9/23 Sanne Grinovero <sanne.grinovero at gmail.com>
>>>>>> I agree it should work the same way. The IndexWriter cleans files
>>>>>> whenever it likes; it doesn't try to detect readers, and this
>>>>>> shouldn't have any effect on how readers work.
>>>>>> The IndexReader opens the "SegmentsInfo" first, and immediately
>>>>>> after** gets a reference to the segments listed in this
>>>>>> SegmentsInfo.
>>>>>> No IndexWriter will ever change an existing segment, only add new
>>>>>> files or eventually delete old ones (segment merge, optimize).
>>>>>> The deletion of segments is the interesting subject: when using
>>>>>> files it relies on "delete at last close", which works because the
>>>>>> IRs needing them have them opened already**; when using the
>>>>>> RAMDirectory they hold a reference that prevents garbage collection.
>>>>>>
>>>>>> ( the two "**" are assuming the same event occurred correctly,
>>>>>> otherwise an exception is thrown at opening)
>>>>>>
>>>>>> When using Infinispan it shouldn't be much different from the
>>>>>> RAMDirectory, should it? Even if the needed segment is deleted, the
>>>>>> IR holds a reference to the Java object locally since it was opened.
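
A minimal sketch of the reference-holding behaviour described above, assuming a plain ConcurrentHashMap as a stand-in for the directory. MapDirectorySketch and SnapshotReaderSketch are hypothetical names for illustration only; they are not Lucene or Infinispan classes, and whether a real Infinispan-backed reader keeps such references depends on the Directory implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical map-backed directory; a stand-in, not a real Lucene Directory. */
class MapDirectorySketch {
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    void write(String name, String content) {
        files.put(name, content.getBytes(StandardCharsets.UTF_8));
    }

    void delete(String name) {
        files.remove(name);
    }

    boolean exists(String name) {
        return files.containsKey(name);
    }

    /** Returns the stored array itself, or null if the file is gone. */
    byte[] lookup(String name) {
        return files.get(name);
    }
}

/** Reader that resolves every segment at open time and keeps the references. */
class SnapshotReaderSketch {
    private final Map<String, byte[]> held = new HashMap<>();

    SnapshotReaderSketch(MapDirectorySketch dir, List<String> segmentNames) {
        for (String name : segmentNames) {
            byte[] data = dir.lookup(name);
            if (data == null) {
                // Mirrors the "exception is thrown at opening" case.
                throw new IllegalStateException("segment missing at open: " + name);
            }
            held.put(name, data);
        }
    }

    /** Reads from the locally held reference, never from the directory. */
    String read(String name) {
        return new String(held.get(name), StandardCharsets.UTF_8);
    }
}

class ReaderSnapshotDemo {
    public static void main(String[] args) {
        MapDirectorySketch dir = new MapDirectorySketch();
        dir.write("A", "old segment");
        SnapshotReaderSketch reader = new SnapshotReaderSketch(dir, Arrays.asList("A"));
        dir.delete("A"); // the writer removes the segment, e.g. after a merge
        System.out.println("still in directory: " + dir.exists("A"));
        System.out.println("reader still sees: " + reader.read("A"));
    }
}
```

The reader keeps working after the delete only because it resolved its segments once, at open time; a reader that looked files up lazily by name would not have this protection.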
>>>>>>
>>>>>>  Łukasz, do you have a failing test?
>>>>>>
>>>>>> Sanne
>>>>>>
>>>>>> 2009/9/23 Emmanuel Bernard <emmanuel at hibernate.org>:
>>>>>>> Conceptually I don't understand why it works in a pure file system
>>>>>>> directory (i.e. the IndexReader can go and process queries while
>>>>>>> the IndexWriter goes about its business) and not when using
>>>>>>> Infinispan.
>>>>>>> Is it because the IndexWriter only cleans files if no IndexReaders
>>>>>>> are reading them (how would that be detected)?
>>>>>>> On 22 sept. 09, at 20:46, Łukasz Moreń wrote:
>>>>>>>
>>>>>>> I need to provide the same lifecycle for the IndexWriter as for the
>>>>>>> Infinispan tx: when the IW is created, the tx is started; when the
>>>>>>> IW is committed, the tx is committed. This ensures that an
>>>>>>> IndexReader doesn't read old data from the directory.
>>>>>>> The Infinispan transaction can be started when the IW acquires the
>>>>>>> lock, but committing it on IW lock release, as is done so far,
>>>>>>> causes a problem:
>>>>>>>
>>>>>>> index writer close {
>>>>>>>   index writer commit(); // changes are visible to IndexReaders
>>>>>>>
>>>>>>>   // an IndexReader starts reading here, i.e. tries to access file "A"
>>>>>>>
>>>>>>>   index writer lockRelease(); // changes in the Infinispan directory are
>>>>>>>                               // committed, file "A" was removed; the
>>>>>>>                               // IndexReader cannot find it and crashes
>>>>>>> }
>>>>>>>
>>>>>>> I think the Infinispan tx has to be committed just before the IW
>>>>>>> commit; the problem is where to put that in the code.
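
The interleaving described above can be replayed deterministically with a toy model. This is only a sketch under stated assumptions: BatchedDirectorySketch is a hypothetical class that buffers writes until commitTx() (standing in for the Infinispan transaction), and the reader here looks files up by name after reading the segment list. Real Lucene and Infinispan internals are not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy directory whose writes are buffered until commitTx(); hypothetical. */
class BatchedDirectorySketch {
    private final Map<String, String> files = new HashMap<>();
    private List<String> segments = new ArrayList<>();
    private final List<Runnable> pending = new ArrayList<>();

    // Buffered operations: nothing is visible to readers until commitTx().
    void put(String name, String content) {
        pending.add(() -> files.put(name, content));
    }

    void delete(String name) {
        pending.add(() -> files.remove(name));
    }

    void setSegments(String... names) {
        List<String> copy = new ArrayList<>(Arrays.asList(names));
        pending.add(() -> segments = copy);
    }

    /** Applies all buffered operations in order, like a tx commit. */
    void commitTx() {
        for (Runnable op : pending) {
            op.run();
        }
        pending.clear();
    }

    List<String> listSegments() {
        return new ArrayList<>(segments);
    }

    String open(String name) {
        String content = files.get(name);
        if (content == null) {
            throw new IllegalStateException("file not found: " + name);
        }
        return content;
    }
}

class CommitOrderingDemo {
    /** Returns true if the reader can open every segment it listed. */
    static boolean readerSucceeds(boolean txCommitBeforeReaderOpens) {
        BatchedDirectorySketch dir = new BatchedDirectorySketch();
        dir.put("A", "old");
        dir.setSegments("A");
        dir.commitTx(); // initial index state: one segment, "A"

        // Writer work inside one transaction: add "B", drop "A".
        dir.put("B", "new");
        dir.setSegments("B");
        dir.delete("A");

        if (txCommitBeforeReaderOpens) {
            dir.commitTx(); // tx committed together with the IW commit
        }
        List<String> seen = dir.listSegments(); // the reader starts here
        if (!txCommitBeforeReaderOpens) {
            dir.commitTx(); // tx committed later, on lock release
        }

        try {
            for (String name : seen) {
                dir.open(name);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("commit on lock release: " + readerSucceeds(false));
        System.out.println("commit before reader opens: " + readerSucceeds(true));
    }
}
```

In this model the reader fails only when the transaction commit lands between its reading of the segment list and its opening of the files, which is the window the pseudocode above points at.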
>>>>>>>
>>>>>>> On 22 September 2009 18:24, Emmanuel Bernard
>>>>>>> <emmanuel at hibernate.org> wrote:
>>>>>>>> Can you explain in more detail what is going on?
>>>>>>>> Aside from that, Workspace has been Sanne's baby lately, so he will
>>>>>>>> be the best person to see what design will work in HSearch. That
>>>>>>>> being said, I don't like the idea of subclassing / overriding very
>>>>>>>> much. In my experience, it has led to more bad and unmaintainable
>>>>>>>> code than anything else.
>>>>>>>> On 22 sept. 09, at 02:16, Łukasz Moreń wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Thanks for the explanation.
>>>>>>>> Maybe I had better concentrate on the first release and postpone
>>>>>>>> distributed writing.
>>>>>>>>
>>>>>>>> There is already a LockStrategy that uses Infinispan. Using it, I
>>>>>>>> was wrapping changes made by the IndexWriter in an Infinispan
>>>>>>>> transaction for performance reasons: on obtaining the lock the
>>>>>>>> transaction was started, and on lock release the transaction was
>>>>>>>> committed. However, committing the Ispn transaction on lock release
>>>>>>>> is not a good idea, since the IndexWriter calls index commit before
>>>>>>>> the lock is released (and the Ispn transaction is committed).
>>>>>>>> I was thinking of overriding the Workspace class and its
>>>>>>>> getIndexWriter (start Infinispan tx) and commitIndexWriter (commit
>>>>>>>> tx) methods to wrap the IndexWriter lifecycle, but this needs a few
>>>>>>>> other changes. Any other ideas?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Lukasz
>>>>>>>>
>>>>>>>> 2009/9/21 Sanne Grinovero <sanne.grinovero at gmail.com>
>>>>>>>>> Hi Łukasz,
>>>>>>>>> you have rightful concerns, because the way the IndexWriter tries
>>>>>>>>> to acquire the lock will bring some trouble. As far as I remember
>>>>>>>>> we decided to avoid multiple writer nodes in this first release
>>>>>>>>> for this reason (is that written in your docs?)
>>>>>>>>>
>>>>>>>>> Actually it shouldn't be very hard to do, as the LockStrategy is
>>>>>>>>> pluggable (see changes from HSEARCH-345)
>>>>>>>>> and you could implement one delegating to an Infinispan eager
>>>>>>>>> lock on some key, just as the default LockStrategy takes a file
>>>>>>>>> lock in the index directory.
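
A rough sketch of a lock held under a single well-known key, assuming a plain ConcurrentMap in place of the cache. KeyLockSketch, LOCK_KEY and the owner names are hypothetical; this uses putIfAbsent rather than Infinispan's eager locking, swapped in here only to keep the example dependency-free, and it does not implement the Hibernate Search LockStrategy or Lucene Lock contracts.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Write lock represented as an entry under one well-known key; hypothetical. */
class KeyLockSketch {
    private static final String LOCK_KEY = "write.lock";

    private final ConcurrentMap<String, String> cache;
    private final String owner;

    KeyLockSketch(ConcurrentMap<String, String> cache, String owner) {
        this.cache = cache;
        this.owner = owner;
    }

    /** Tries once; returns false if another owner already holds the key. */
    boolean obtain() {
        return cache.putIfAbsent(LOCK_KEY, owner) == null;
    }

    /** Removes the entry only if this owner holds it. */
    void release() {
        cache.remove(LOCK_KEY, owner);
    }

    boolean isLocked() {
        return cache.containsKey(LOCK_KEY);
    }
}

class KeyLockDemo {
    public static void main(String[] args) {
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        KeyLockSketch first = new KeyLockSketch(cache, "node-1");
        KeyLockSketch second = new KeyLockSketch(cache, "node-2");

        System.out.println("node-1 obtains: " + first.obtain());
        System.out.println("node-2 obtains: " + second.obtain());
        first.release();
        System.out.println("node-2 retries: " + second.obtain());
    }
}
```

A failed obtain() here corresponds to the second writer being refused the lock; whether that node should retry, block, or forward its work elsewhere is a separate design choice.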
>>>>>>>>>
>>>>>>>>> Maybe it's simpler to support this distributed writing instead of
>>>>>>>>> sending the queue to some single (elected) node? It would be cool,
>>>>>>>>> as the Document Analysis effort would be distributed, but I have no
>>>>>>>>> idea whether this would be more or less efficient than a single
>>>>>>>>> node writing; it could bring some huge data transfers along the
>>>>>>>>> wire during segment merging (basically fetching the whole index
>>>>>>>>> data at each node performing a segment merge). Maybe you'll need
>>>>>>>>> to play with the IndexWriter settings
>>>>>>>>> (http://docs.jboss.org/hibernate/stable/search/reference/en/html_single/#lucene-indexing-performance);
>>>>>>>>> you will probably need to find the sweet spot for "merge_factor".
>>>>>>>>> I just saw that MergePolicy is now re-implementable, but I hope
>>>>>>>>> that won't be needed.
>>>>>>>>>
>>>>>>>>> Sanne
>>>>>>>>>
>>>>>>>>> 2009/9/21 Łukasz Moreń <lukasz.moren at gmail.com>:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm wondering whether it is reasonable to have multiple
>>>>>>>>>> threads/nodes that modify indexes in a Lucene Directory based on
>>>>>>>>>> Infinispan. Let's assume that two nodes try to update the index
>>>>>>>>>> at the same time. The first one creates an IndexWriter and
>>>>>>>>>> obtains the write lock. There is a high probability that the
>>>>>>>>>> second node throws LockObtainFailedException (as only one
>>>>>>>>>> IndexWriter is allowed on a single index) and the index is not
>>>>>>>>>> modified. How should this be handled? Should there always be only
>>>>>>>>>> one node that makes changes in the index?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Lukasz
>>>>>>>>>>
>>>>>>>>>> On 15 September 2009 01:39, Łukasz Moreń
>>>>>>>>>> <lukasz.moren at gmail.com> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Using JMeter, I wanted to check that the Infinispan dir does not
>>>>>>>>>>> crash under heavy load in "real" use, and to compare its
>>>>>>>>>>> performance with none/other directories.
>>>>>>>>>>> However, a problem appeared when multiple IndexWriters try to
>>>>>>>>>>> modify the index (test InfinispanDirectoryTest): random
>>>>>>>>>>> deadlocks and Lucene exceptions. An IndexWriter tries to access
>>>>>>>>>>> files in the index that were removed before. I'm looking into
>>>>>>>>>>> it, but don't have a good idea yet.
>>>>>>>>>>>
>>>>>>>>>>> Concerning the last part, I think a similar thing is done in
>>>>>>>>>>> InfinispanDirectoryProviderTest. Many threads are making changes
>>>>>>>>>>> and searching (not checking whether the db is in sync with the
>>>>>>>>>>> index). When the threads finish their work, I check with a
>>>>>>>>>>> Lucene query whether the index contains as many results as
>>>>>>>>>>> expected. Maybe you meant something else?
>>>>>>>>>>> It would be good to run each node in a different VM.
>>>>>>>>>>>
>>>>>>>>>>>> Great ! Looking forward to it. What state are things in at the
>>>>>>>>>>>> moment
>>>>>>>>>>>> if I want to play around with it ?
>>>>>>>>>>> It should work with one master (updates the index) and one or
>>>>>>>>>>> many slave nodes (send changes to the master). I tried with one
>>>>>>>>>>> master and one slave (with both the JMS and JGroups backends)
>>>>>>>>>>> and it worked OK. It still fails if multiple nodes want to
>>>>>>>>>>> modify the index.
>>>>>>>>>>>
>>>>>>>>>>> I've attached a patch with the current version.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Łukasz
>>>>>>>>>>>
>>>>>>>>>>> 2009/9/13 Michael Neale <michael.neale at gmail.com>
>>>>>>>>>>>> Great ! Looking forward to it. What state are things in at the
>>>>>>>>>>>> moment
>>>>>>>>>>>> if I want to play around with it ?
>>>>>>>>>>>>
>>>>>>>>>>>> Sent from my phone.
>>>>>>>>>>>>
>>>>>>>>>>>> On 13/09/2009, at 7:26 PM, Sanne Grinovero
>>>>>>>>>>>> <sanne.grinovero at gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> 2009/9/12 Michael Neale <michael.neale at gmail.com>:
>>>>>>>>>>>>>> That does sound pretty cool. It would be nice if the Lucene
>>>>>>>>>>>>>> indexes could scale along with how people will want to use
>>>>>>>>>>>>>> Infinispan.
>>>>>>>>>>>>>> Probably worth playing with.
>>>>>>>>>>>>> Sure, this is the goal of Łukasz's work; We know compass has
>>>>>>>>>>>>> some good Directories, but we're building our own as one
>>>>>>>>>>>>> based
>>>>>>>>>>>>> on Infinispan is not yet available.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sent from my phone.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 13/09/2009, at 8:37 AM, Jeff Ramsdale
>>>>>>>>>>>>>> <jeff.ramsdale at gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm afraid I haven't followed the Infinispan-Lucene
>>>>>>>>>>>>>>> implementation
>>>>>>>>>>>>>>> closely, but have you looked at the Compass Project?
>>>>>>>>>>>>>>> (http://www.compass-project.org/overview.html) It
>>>>>>>>>>>>>>> provides a
>>>>>>>>>>>>>>> simplified interface to Lucene (optional) as well as
>>>>>>>>>>>>>>> Directory
>>>>>>>>>>>>>>> implementations built on Terracotta, Gigaspaces and
>>>>>>>>>>>>>>> Coherence.
>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>> latter, in particular, might be a useful guide for the
>>>>>>>>>>>>>>> Infinispan
>>>>>>>>>>>>>>> implementation. I believe it's mature enough to have solved
>>>>>>>>>>>>>>> many
>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>> the most difficult problems of implementing Directory on a
>>>>>>>>>>>>>>> distributed
>>>>>>>>>>>>>>> Map.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If someone has any experience with Compass (particularly its
>>>>>>>>>>>>>>> Directory implementations) I'd be interested in hearing about
>>>>>>>>>>>>>>> it...
>>>>>>>>>>>>>>> It's Apache 2.0 licensed, btw.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -jeff
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> infinispan-dev mailing list
>>>>>>>>>>>>>>> infinispan-dev at lists.jboss.org
>>>>>>>>>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>> <InfinispanDirectoryProvider_22_09_2009.patch>
>>>>>
>>>>> --
>>>>> Manik Surtani
>>>>> manik at jboss.org
>>>>> Lead, Infinispan
>>>>> Lead, JBoss Cache
>>>>> http://www.infinispan.org
>>>>> http://www.jbosscache.org
>>>>>
>>>>>
>>>>>
>>>>>


-- 
Jason T. Greene
JBoss, a division of Red Hat


