[infinispan-dev] Feedback on Infinispan patch

Jeff Ramsdale jeff.ramsdale at gmail.com
Thu Sep 24 16:29:39 EDT 2009


Untrue. Apache-licensed projects are certainly able to include LGPL
components. However, the Apache FOUNDATION does not allow LGPL
components in its own projects. This is not a limitation of the Apache
license; it is a matter of Foundation policy. As Compass is not a
product of the Apache Foundation, it shouldn't be an issue (unless, of
course, they have their own policy against including LGPL components).

-jeff

On Thu, Sep 24, 2009 at 1:24 PM, Sanne Grinovero
<sanne.grinovero at gmail.com> wrote:
> Compass is Apache-licensed, as Lucene is.
> Some Apache members would love to have this inside Lucene, but it
> appears to be a no-go because of the dependency on Infinispan
> (incompatible license); so either Compass can't have it either, or it
> could be included in Lucene after all.
>
> There wouldn't be any problem if the code could depend on an
> Apache-licensed API... if it's not out of the question here to split into
> API/Impl.
>
> Sanne
>
> 2009/9/24 Manik Surtani <manik at jboss.org>:
>>
>> On 24 Sep 2009, at 17:36, Jeff Ramsdale wrote:
>>
>>> Another alternative would be to see if the Compass Project would be
>>> interested in hosting it: http://www.compass-project.org/
>>
>> Good idea. Does anybody have traction with the Compass folks to propose this?
>>
>>> Even if Infinispan ends up hosting it there might be value in doing
>>> some cross-pollination with the Compass folks since this aligns
>>> directly with what they are working on.
>>>
>>> -jeff
>>>
>>> 2009/9/24 Manik Surtani <manik at jboss.org>:
>>>> Slightly off topic, but rather than working with patches, do we want this
>>>> Directory impl in source control somewhere?
>>>> Being dependent on LGPL code, it won't be accepted into Lucene's contribs.
>>>> If it doesn't depend on any Hibernate Search code, I could host it in
>>>> Infinispan's SVN repo...
>>>>
>>>> On 23 Sep 2009, at 13:58, Łukasz Moreń wrote:
>>>>
>>>> I agree that the Infinispan case is not much different from RAMDirectory.
>>>> The major difference is that in RAMDirectory (and FileDirectory) changes are
>>>> not batched as they are in InfinispanDirectory. If I do not wrap changes in
>>>> InfinispanDirectory in a transaction (simply remove tx.begin() from the
>>>> obtain() method and tx.commit() from release() in InfinispanLock) and
>>>> immediately commit every change made by the IndexWriter, it works well.
>>>> However, it makes indexing much slower because of frequent replication to
>>>> other nodes.
>>>> Sanne, it's a good remark that the IndexWriter commit is a kind of flush.
>>>>
>>>> I've attached a patch with InfinispanDirectory; the failing test is
>>>> testDirectoryWithMultipleThreads in the InfinispanDirectoryTest class. It
>>>> fails randomly. I think the problem is that the Infinispan commit on
>>>> lockRelease() in org.apache.lucene.index.IndexWriter (line 1658) comes after
>>>> the IndexWriter commit() (line 1654).
>>>>
>>>>> Is it because the IndexWriter only cleans files if no IndexReaders are
>>>>> reading them (how would that be detected)?
>>>>
>>>> It can happen if the IndexWriter cleans a file and an IndexReader then tries
>>>> to access that cleaned file.
>>>>
>>>> 2009/9/23 Sanne Grinovero <sanne.grinovero at gmail.com>
>>>>>
>>>>> I agree it should work the same way. The IndexWriter cleans files
>>>>> whenever it likes to; it doesn't try to detect readers, and this
>>>>> shouldn't have any effect on the working of readers.
>>>>> The IndexReader opens the "SegmentsInfo" first, and immediately
>>>>> after** gets a reference to the segments listed in this SegmentsInfo.
>>>>> No IndexWriter will ever change an existing segment, only add new
>>>>> files or eventually delete old ones (segment merge, optimize).
>>>>> The deletion of segments is the interesting subject: when using files
>>>>> it uses "delete at last close", which works because the IndexReaders
>>>>> needing them have them opened already**; when using the RAMDirectory
>>>>> they have a reference preventing garbage collection.
>>>>>
>>>>> (The two "**" assume that step completed correctly; otherwise an
>>>>> exception is thrown at opening.)
>>>>>
>>>>> When using Infinispan, shouldn't it be much the same as with the
>>>>> RAMDirectory? Even if the needed segment is deleted, the IndexReader
>>>>> holds a reference to the Java object locally since it was opened.
>>>>>
>>>>> Łukasz, do you have a failing test?
>>>>>
>>>>> Sanne
>>>>>
>>>>> 2009/9/23 Emmanuel Bernard <emmanuel at hibernate.org>:
>>>>>> Conceptually I don't understand why it works in a pure file system
>>>>>> directory (i.e. IndexReader can go and process queries while the
>>>>>> IndexWriter goes about its business) and not when using Infinispan.
>>>>>> Is it because the IndexWriter only cleans files if no IndexReaders are
>>>>>> reading them (how would that be detected)?
>>>>>> On 22 sept. 09, at 20:46, Łukasz Moreń wrote:
>>>>>>
>>>>>> I need to provide the same lifecycle for the IndexWriter as for the
>>>>>> Infinispan tx: when the IW is created the tx is started, and when the IW
>>>>>> is committed the tx is committed. That ensures that an IndexReader doesn't
>>>>>> read old data from the directory.
>>>>>> The Infinispan transaction can be started when the IW acquires the lock,
>>>>>> but committing it on IW lock release, as is done so far, causes a problem:
>>>>>>
>>>>>> index writer close {
>>>>>>   index writer commit(); // changes are visible to IndexReaders
>>>>>>
>>>>>>   // an IndexReader starts reading here, i.e. tries to access file "A"
>>>>>>
>>>>>>   index writer lockRelease(); // changes in the Infinispan directory are
>>>>>>                               // committed, file "A" was removed, the
>>>>>>                               // IndexReader cannot find it and crashes
>>>>>> }
>>>>>>
>>>>>> I think the Infinispan tx has to be committed just before the IW commit,
>>>>>> and the problem is where to put that in the code.
>>>>>>
>>>>>> On 22 September 2009 at 18:24, Emmanuel Bernard
>>>>>> <emmanuel at hibernate.org> wrote:
>>>>>>>
>>>>>>> Can you explain in more detail what is going on?
>>>>>>> Aside from that, Workspace has been Sanne's baby lately, so he will be
>>>>>>> the best person to see what design will work in HSearch. That being
>>>>>>> said, I don't like the idea of subclassing / overriding very much. In my
>>>>>>> experience, it has led to more bad and unmaintainable code than anything
>>>>>>> else.
>>>>>>> On 22 sept. 09, at 02:16, Łukasz Moreń wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Thanks for the explanation.
>>>>>>> Maybe it is better that I concentrate on the first release and postpone
>>>>>>> distributed writing.
>>>>>>>
>>>>>>> There is already a LockStrategy that uses Infinispan. Using it, I was
>>>>>>> wrapping the changes made by the IndexWriter in an Infinispan
>>>>>>> transaction for performance reasons: on obtaining the lock the
>>>>>>> transaction was started, and on lock release the transaction was
>>>>>>> committed. However, committing the Ispn transaction on lock release is
>>>>>>> not a good idea, since the IndexWriter calls index commit before the
>>>>>>> lock is released (and before the Ispn transaction is committed).
>>>>>>> I was thinking of overriding the Workspace class and its getIndexWriter
>>>>>>> (start Infinispan tx) and commitIndexWriter (commit tx) methods to wrap
>>>>>>> the IndexWriter lifecycle, but this needs a few other changes. Any other
>>>>>>> ideas?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Lukasz
>>>>>>>
>>>>>>> 2009/9/21 Sanne Grinovero <sanne.grinovero at gmail.com>
>>>>>>>>
>>>>>>>> Hi Łukasz,
>>>>>>>> your concerns are justified: the way the IndexWriter tries to acquire
>>>>>>>> the lock will bring some trouble. As far as I remember, we decided to
>>>>>>>> avoid multiple writer nodes in this first release for these reasons
>>>>>>>> (is that written in your docs?)
>>>>>>>>
>>>>>>>> Actually it shouldn't be very hard to do, as the LockStrategy is
>>>>>>>> pluggable (see changes from HSEARCH-345)
>>>>>>>> and you could implement one delegating to an Infinispan eager lock on
>>>>>>>> some key, like the default LockStrategy takes a file lock in the index
>>>>>>>> directory.
>>>>>>>>
>>>>>>>> Maybe it's simpler to support this distributed writing instead of
>>>>>>>> sending the queue to some single (elected) node? It would be cool, as
>>>>>>>> the Document analysis effort would be distributed, but I have no idea
>>>>>>>> whether this would be more or less efficient than a single node
>>>>>>>> writing; it could bring some huge data transfers along the wire during
>>>>>>>> segment merging (basically fetching the whole index data at each node
>>>>>>>> performing a segment merge); maybe you'll need to play with the
>>>>>>>> IndexWriter settings
>>>>>>>> (http://docs.jboss.org/hibernate/stable/search/reference/en/html_single/#lucene-indexing-performance);
>>>>>>>> you'll probably need to find the sweet spot for "merge_factor".
>>>>>>>> I just saw that MergePolicy is now re-implementable, but I hope
>>>>>>>> that won't be needed.
>>>>>>>>
>>>>>>>> Sanne
>>>>>>>>
>>>>>>>> 2009/9/21 Łukasz Moreń <lukasz.moren at gmail.com>:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm wondering if it is reasonable to have multiple threads/nodes that
>>>>>>>>> modify indexes in a Lucene Directory based on Infinispan. Let's assume
>>>>>>>>> that two nodes try to update the index at the same time. The first one
>>>>>>>>> creates an IndexWriter and obtains the write lock. There is a high
>>>>>>>>> probability that the second node throws LockObtainFailedException (as
>>>>>>>>> only one IndexWriter is allowed on a single index) and the index is not
>>>>>>>>> modified. How should that be handled? Should there always be only one
>>>>>>>>> node that makes changes in the index?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Lukasz
>>>>>>>>>
>>>>>>>>> On 15 September 2009 at 01:39, Łukasz Moreń
>>>>>>>>> <lukasz.moren at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Using JMeter, I wanted to check that the Infinispan dir does not crash
>>>>>>>>>> under heavy load in "real" use and to compare its performance with
>>>>>>>>>> none/other directories.
>>>>>>>>>> However, a problem appeared when multiple IndexWriters try to modify
>>>>>>>>>> the index (test InfinispanDirectoryTest): random deadlocks and Lucene
>>>>>>>>>> exceptions. An IndexWriter tries to access files in the index that
>>>>>>>>>> were removed before. I'm looking into it, but don't have a good idea
>>>>>>>>>> yet.
>>>>>>>>>>
>>>>>>>>>> Concerning the last part, I think a similar thing is done in
>>>>>>>>>> InfinispanDirectoryProviderTest. Many threads are making changes and
>>>>>>>>>> searching (not checking whether the db is in sync with the index).
>>>>>>>>>> When the threads finish their work, I check with a Lucene query whether
>>>>>>>>>> the index contains as many results as expected. Maybe you meant
>>>>>>>>>> something else?
>>>>>>>>>> It would be good to run each node in a different VM.
>>>>>>>>>>
>>>>>>>>>>> Great! Looking forward to it. What state are things in at the moment
>>>>>>>>>>> if I want to play around with it?
>>>>>>>>>>
>>>>>>>>>> It should work with one master (updates the index) and one or many
>>>>>>>>>> slave nodes (send changes to the master). I tried with one master and
>>>>>>>>>> one slave (with both the JMS and JGroups backends) and it worked OK. It
>>>>>>>>>> still fails if multiple nodes want to modify the index.
>>>>>>>>>>
>>>>>>>>>> I've attached a patch with the current version.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Łukasz
>>>>>>>>>>
>>>>>>>>>> 2009/9/13 Michael Neale <michael.neale at gmail.com>
>>>>>>>>>>>
>>>>>>>>>>> Great! Looking forward to it. What state are things in at the moment
>>>>>>>>>>> if I want to play around with it?
>>>>>>>>>>>
>>>>>>>>>>> Sent from my phone.
>>>>>>>>>>>
>>>>>>>>>>> On 13/09/2009, at 7:26 PM, Sanne Grinovero
>>>>>>>>>>> <sanne.grinovero at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> 2009/9/12 Michael Neale <michael.neale at gmail.com>:
>>>>>>>>>>>>> That does sound pretty cool. It would be nice if the Lucene indexes
>>>>>>>>>>>>> could scale along with how people will want to use Infinispan.
>>>>>>>>>>>>> Probably worth playing with.
>>>>>>>>>>>>
>>>>>>>>>>>> Sure, this is the goal of Łukasz's work. We know Compass has
>>>>>>>>>>>> some good Directories, but we're building our own as one based
>>>>>>>>>>>> on Infinispan is not yet available.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sent from my phone.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 13/09/2009, at 8:37 AM, Jeff Ramsdale
>>>>>>>>>>>>> <jeff.ramsdale at gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm afraid I haven't followed the Infinispan-Lucene implementation
>>>>>>>>>>>>>> closely, but have you looked at the Compass Project
>>>>>>>>>>>>>> (http://www.compass-project.org/overview.html)? It provides a
>>>>>>>>>>>>>> simplified interface to Lucene (optional) as well as Directory
>>>>>>>>>>>>>> implementations built on Terracotta, Gigaspaces and Coherence. The
>>>>>>>>>>>>>> latter, in particular, might be a useful guide for the Infinispan
>>>>>>>>>>>>>> implementation. I believe it's mature enough to have solved many of
>>>>>>>>>>>>>> the most difficult problems of implementing Directory on a
>>>>>>>>>>>>>> distributed Map.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If someone has any experience with Compass (particularly its
>>>>>>>>>>>>>> Directory implementations) I'd be interested in hearing about it...
>>>>>>>>>>>>>> It's Apache 2.0 licensed, btw.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -jeff
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> infinispan-dev mailing list
>>>>>>>>>>>>>> infinispan-dev at lists.jboss.org
>>>>>>>>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>> <InfinispanDirectoryProvider_22_09_2009.patch>
>>>>
>>>> --
>>>> Manik Surtani
>>>> manik at jboss.org
>>>> Lead, Infinispan
>>>> Lead, JBoss Cache
>>>> http://www.infinispan.org
>>>> http://www.jbosscache.org
>>>>
>>>>
>>>
>>
>>
>>
>