Sanne,
That error looks suspiciously like an old Lucene error they had.
Could they have regressed?
John Griffin
On Sep 27, 2009 2:00pm, Łukasz Moreń <lukasz.moren(a)gmail.com> wrote:
You can try to increase TURNS_NUM (I've tried with 1000) and THREADS_NUM
(200) fields in InfinispanDirectoryTest to make it more probable. The same
problem also appears in InfinispanDirectoryProviderTest.
An example stacktrace is:
21:22:44,441 ERROR InfinispanDirectoryTest:142 - Error
java.io.IOException: File [ segments_nl ] for index [ indexName ] was not found
at org.hibernate.search.store.infinispan.InfinispanIndexIO$InfinispanIndexInput.<init>(InfinispanIndexIO.java:79)
at org.hibernate.search.store.infinispan.InfinispanDirectory.openInput(InfinispanDirectory.java:201)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:95)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:115)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:227)
at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:55)
at org.hibernate.search.test.directoryProvider.infinispan.CacheTestSupport.doReadOperation(CacheTestSupport.java:106)
at org.hibernate.search.test.directoryProvider.infinispan.InfinispanDirectoryTest$InfinispanDirectoryThread.run(InfinispanDirectoryTest.java:130)
Cheers,
Lukasz
2009/9/27 Sanne Grinovero <sanne.grinovero(a)gmail.com>
Hi Łukasz,
I'm unable to reproduce the problem; you said it happens randomly:
I've tried several times and I'm not getting errors. Do you know
something I could do to make it happen?
Could you share a stacktrace?
Anyway, if you are confident it's about the segments getting lost while
they are still being read, you could introduce a per-segment usage
counter: it starts at value 1 to mark the segment as "most current",
gets a +1 vote from each reader opening it, -1 on closing, and -1 on
deleting. Each decrement method should check for the value reaching 0 to
really delete the segment, and this counting would be easy to add inside
the Directory.
When opening a new IndexReader, you:
1) get the SegmentsInfo
2) increment all counters (eager-lock, verify > 0 or retry: set the
changed counters back and get a new SegmentsInfo --> 1)
3) get the needed segments
Getting a counter should be much faster than getting a segment in case
the data is downloaded from another node, so we can use a different key
while still relating to the segment.
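A minimal sketch of that counter scheme. All names here (SegmentRefCounter, register/acquire/release) are invented for illustration, not existing Hibernate Search or Infinispan API, and a local ConcurrentHashMap stands in for the cache; a real version would keep each counter under its own cache key, separate from the segment bytes.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-segment usage counter, as proposed in the message above.
public class SegmentRefCounter {
    private final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

    // A freshly written segment starts at 1: the "most current" vote.
    public void register(String segmentName) {
        counts.put(segmentName, 1);
    }

    // Reader opening: +1, but only while the counter is still > 0.
    // Returns false when the segment is already gone, so the caller can
    // fetch a fresh SegmentsInfo and retry from step 1.
    public boolean acquire(String segmentName) {
        while (true) {
            Integer current = counts.get(segmentName);
            if (current == null || current <= 0) {
                return false;
            }
            if (counts.replace(segmentName, current, current + 1)) {
                return true; // CAS succeeded
            }
            // lost a race with another thread/node: re-read and retry
        }
    }

    // Reader closing or writer deleting: -1. Returns true exactly once,
    // when the count reaches 0, meaning the caller should now really
    // delete the segment data.
    public boolean release(String segmentName) {
        while (true) {
            Integer current = counts.get(segmentName);
            if (current == null) {
                return false;
            }
            int next = current - 1;
            if (counts.replace(segmentName, current, next)) {
                if (next == 0) {
                    counts.remove(segmentName, 0);
                    return true;
                }
                return false;
            }
        }
    }
}
```

The writer's delete and each reader's close both go through release(); whichever decrement reaches 0 performs the actual deletion, so a reader that acquired the segment before the writer deleted it keeps it alive until close.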
Sanne
2009/9/23 Łukasz Moreń <lukasz.moren(a)gmail.com>:
> I agree that the Infinispan case is not much different from RAMDirectory. The
> major difference is that in RD (also FSDirectory) changes are not batched
> like in ID. If I do not wrap changes in InfinispanDirectory (simply remove
> tx.begin() from the obtain() method and tx.commit() from release() in
> InfinispanLock), and immediately commit every change made by the IW, it works
> well. However, it makes indexing really slow, because of frequent
> replication to other nodes.
> Sanne, it's a good remark that the IW commit is a kind of flush.
>
> I've attached a patch with InfinispanDirectory; the failing test is
> testDirectoryWithMultipleThreads in the InfinispanDirectoryTest class. It
> fails randomly. I think the problem is that the Infinispan commit on
> lockRelease() in org.apache.lucene.index.IndexWriter (line 1658) happens
> after the IW commit() (line 1654).
>
>> Is it because the IndexWriter only cleans files if no IndexReaders are
>> reading them (how would that be detected)?
>
> It can happen if the IndexWriter cleans a file and an IndexReader tries to
> access that cleaned file.
>
> 2009/9/23 Sanne Grinovero <sanne.grinovero(a)gmail.com>
>>
>> I agree it should work the same way; the IndexWriter cleans files
>> whenever it likes to, it doesn't try to detect readers, and this
>> shouldn't have any effect on the working of readers.
>> The IndexReader opens the "SegmentsInfo" first, and immediately
>> after** gets a reference to the segments listed in this SegmentsInfo.
>> No IndexWriter will ever change an existing segment, only add new
>> files or eventually delete old ones (segment merge/optimize).
>> The deletion of segments is the interesting subject: when using files
>> it uses "delete at last close", which works because the IRs needing them
>> have them opened already**; when using the RAMDirectory they have a
>> reference preventing garbage collection.
>>
>> (the two "**" are assuming the same event occurred correctly,
>> otherwise an exception is thrown at opening)
>>
>> When using Infinispan it shouldn't be much different from the
>> RAMDirectory: even if the needed segment is deleted, the IR holds a
>> reference to the Java object locally since it was opened.
>>
>> Łukasz, do you have some failing test?
>>
>> Sanne
>>
>> 2009/9/23 Emmanuel Bernard <emmanuel(a)hibernate.org>:
>> > Conceptually I don't understand why it does work in a pure file system
>> > directory (i.e. an IndexReader can go and process queries while the
>> > IndexWriter goes about its business) and not when using Infinispan.
>> > Is it because the IndexWriter only cleans files if no IndexReaders are
>> > reading them (how would that be detected)?
>> > On 22 sept. 09, at 20:46, Łukasz Moreń wrote:
>> >
>> > I need to provide the same lifecycle for the IndexWriter as for the
>> > Infinispan tx: the tx is started when the IW is created and committed
>> > when the IW is committed. This assures that the IndexReader doesn't
>> > read old data from the directory.
>> > The Infinispan transaction can be started when the IW acquires the
>> > lock, but committing it on IW lock release, as is done so far, causes
>> > a problem:
>> >
>> > index writer close {
>> >   index writer commit();  // changes are visible to IndexReaders
>> >
>> >   // an IndexReader starts reading here, i.e. tries to access file "A"
>> >
>> >   index writer lockRelease();  // changes in the Infinispan directory
>> >   // are committed; file "A" was removed, the IndexReader cannot find
>> >   // it and crashes
>> > }
>> >
>> > I think the Infinispan tx has to be committed just before the IW
>> > commit; the problem is where to put that in the code.
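The race in that pseudo-code can be shown with a tiny self-contained model. CommitOrderingModel and all its members are invented stand-ins for the Infinispan transaction and the published Lucene commit point, not real Lucene or Infinispan API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Models the race: the IW commit (publish) currently happens before the
// Infinispan tx commit (visibility), so a reader opening in between sees
// a published segments file whose data is not there yet.
public class CommitOrderingModel {
    public final Map<String, byte[]> committed = new HashMap<>(); // what the directory can serve
    public final Map<String, byte[]> pendingTx = new HashMap<>(); // uncommitted tx writes
    public final Set<String> published = new HashSet<>();         // commit points readers may open

    public void write(String file) {
        pendingTx.put(file, new byte[0]);
    }

    // Infinispan transaction commit: writes become visible to the cluster.
    public void txCommit() {
        committed.putAll(pendingTx);
        pendingTx.clear();
    }

    // IndexWriter commit: the new segments file is announced to readers.
    public void iwCommit(String segmentsFile) {
        published.add(segmentsFile);
    }

    // A reader fails only when a file is published but not yet committed.
    public boolean readerCanOpen(String file) {
        return !published.contains(file) || committed.containsKey(file);
    }
}
```

With the current order (iwCommit before txCommit) readerCanOpen is false in the window between the two calls, matching the "File [...] was not found" exception; committing the tx first closes the window.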
>> >
>> > On 22 September 2009 18:24, Emmanuel Bernard
>> > <emmanuel(a)hibernate.org> wrote:
>>
>> >> Can you explain in more detail what is going on?
>> >> Aside from that, Workspace has been Sanne's baby lately, so he will
>> >> be the best to see what design will work in HSearch. That being said,
>> >> I don't like the idea of subclassing / overriding very much. In my
>> >> experience, it has led to more bad and unmaintainable code than
>> >> anything else.
>> >> On 22 sept. 09, at 02:16, Łukasz Moreń wrote:
>> >>
>> >> Hi,
>> >>
>> >> Thanks for the explanation.
>> >> Maybe it's better if I concentrate on the first release and postpone
>> >> distributed writing.
>> >>
>> >> There is already a LockStrategy that uses Infinispan. Using it, I was
>> >> wrapping the changes made by the IndexWriter in an Infinispan
>> >> transaction for performance reasons:
>> >> the transaction was started on lock obtain and committed on lock
>> >> release. However, committing the Ispn transaction on lock release is
>> >> not a good idea, since the IndexWriter calls the index commit before
>> >> the lock is released (and the ispn transaction is committed).
>> >> I was thinking of overriding the Workspace class and its
>> >> getIndexWriter (start infinispan tx) and commitIndexWriter (commit tx)
>> >> methods to wrap the IndexWriter lifecycle, but this needs a few other
>> >> changes. Any other ideas?
>> >>
>> >> Cheers,
>> >> Lukasz
>> >>
>> >> 2009/9/21 Sanne Grinovero <sanne.grinovero(a)gmail.com>
>> >>>
>> >>> Hi Łukasz,
>> >>> you have rightful concerns, because the way the IndexWriter tries
>> >>> to achieve the lock will bring some trouble. As far as I remember,
>> >>> we decided in this first release to avoid multiple writer nodes for
>> >>> these reasons (that's written in your docs?)
>> >>>
>> >>> Actually it shouldn't be very hard to do, as the LockStrategy is
>> >>> pluggable (see changes from HSEARCH-345),
>> >>> and you could implement one delegating to an Infinispan eager lock
>> >>> on some key, like the default LockStrategy takes a file lock in the
>> >>> index directory.
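A sketch of such a lock over a shared map. MapBasedIndexLock, LOCK_KEY and the owner parameter are invented for illustration; a real implementation would extend org.apache.lucene.store.Lock and claim the key in the Infinispan Cache with eager locking, but the ConcurrentMap stand-in shows the single-writer semantics.

```java
import java.util.concurrent.ConcurrentMap;

// Hypothetical index write lock over a shared map: only one node at a
// time can claim the well-known key, mirroring Lucene's Lock contract.
public class MapBasedIndexLock {
    static final String LOCK_KEY = "indexName#write.lock"; // invented key name

    private final ConcurrentMap<String, String> cache; // stand-in for the Infinispan cache
    private final String owner;                        // e.g. a node identifier

    public MapBasedIndexLock(ConcurrentMap<String, String> cache, String owner) {
        this.cache = cache;
        this.owner = owner;
    }

    // Like Lock#obtain(): atomically claim the key; false means another
    // node already holds the write lock (-> LockObtainFailedException
    // upstream in the IndexWriter).
    public boolean obtain() {
        return cache.putIfAbsent(LOCK_KEY, owner) == null;
    }

    // Conditional remove: a node cannot drop a lock held by someone else.
    public void release() {
        cache.remove(LOCK_KEY, owner);
    }

    public boolean isLocked() {
        return cache.containsKey(LOCK_KEY);
    }
}
```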
>> >>>
>> >>> Maybe it's simpler to support this distributed writing instead of
>> >>> sending the queue to some single (elected) node? It would be cool,
>> >>> as the Document Analysis effort would be distributed,
>> >>> but I have no idea if this would be more or less efficient than a
>> >>> single node writing; it could bring some huge data transfers along
>> >>> the wire during segment merging (basically fetching the whole index
>> >>> data at each node performing a segment merge); maybe you'll need to
>> >>> play with the IndexWriter settings (
>> >>>
>> >>> )
>> >>> and probably need to find the sweet spot for "merge_factor".
>> >>> I just saw that MergePolicy is now re-implementable, but I hope
>> >>> that won't be needed.
>> >>>
>> >>> Sanne
>> >>>
>> >>> 2009/9/21 Łukasz Moreń <lukasz.moren(a)gmail.com>:
>> >>> > Hi,
>> >>> >
>> >>> > I'm wondering if it is reasonable to have multiple threads/nodes
>> >>> > that modify indexes in a Lucene Directory based on Infinispan?
>> >>> > Let's assume that two nodes try to update the index at the same
>> >>> > time. The first one creates an IndexWriter and obtains the write
>> >>> > lock. There is a high probability that the second node throws a
>> >>> > LockObtainFailedException (as only one IndexWriter is allowed on a
>> >>> > single index) and the index is not modified. How is that handled?
>> >>> > Should there always be only one node that makes changes in the
>> >>> > index?
>> >>> >
>> >>> > Cheers,
>> >>> > Lukasz
>> >>> >
>> >>> > On 15 September 2009 01:39, Łukasz Moreń
>> >>> > <lukasz.moren(a)gmail.com> wrote:
>> >>> >>
>> >>> >> Hi,
>> >>> >>
>> >>> >> Using JMeter I wanted to check that the Infinispan dir does not
>> >>> >> crash under heavy load in "real" use, and to compare its
>> >>> >> performance with no/other directories.
>> >>> >> However, a problem appeared when multiple IndexWriters try to
>> >>> >> modify the index (test InfinispanDirectoryTest): random deadlocks
>> >>> >> and Lucene exceptions.
>> >>> >> The IndexWriter tries to access files in the index that were
>> >>> >> removed before. I'm looking into it, but I don't have a good idea
>> >>> >> yet.
>> >>> >>
>> >>> >> Concerning the last part, I think a similar thing is done in
>> >>> >> InfinispanDirectoryProviderTest. Many threads are making changes
>> >>> >> and searching (not checking if the db is in sync with the index).
>> >>> >> When the threads finish their work, I check with a Lucene query
>> >>> >> whether the index contains as many results as expected. Maybe you
>> >>> >> meant something else?
>> >>> >> It would be good to run each node in a different VM.
>> >>> >>
>> >>> >>> Great! Looking forward to it. What state are things in at the
>> >>> >>> moment if I want to play around with it?
>> >>> >>
>> >>> >> It should work with one master (updates the index) and many
>> >>> >> slave nodes (sending changes to the master). I tried with one
>> >>> >> master and one slave (both with the jms and jgroups backends) and
>> >>> >> it worked ok. It still fails if multiple nodes want to modify the
>> >>> >> index.
>> >>> >>
>> >>> >> I've attached a patch with the current version.
>> >>> >>
>> >>> >> Cheers,
>> >>> >> Łukasz
>> >>> >>
>> >>> >> 2009/9/13 Michael Neale <michael.neale(a)gmail.com>
>> >>> >>>
>> >>> >>> Great! Looking forward to it. What state are things in at the
>> >>> >>> moment if I want to play around with it?
>> >>> >>>
>> >>> >>> Sent from my phone.
>> >>> >>>
>> >>> >>> On 13/09/2009, at 7:26 PM, Sanne Grinovero
>> >>> >>> <sanne.grinovero(a)gmail.com> wrote:
>> >>> >>>
>> >>> >>> > 2009/9/12 Michael Neale <michael.neale(a)gmail.com>:
>> >>> >>> >> That does sound pretty cool. Would be nice if the Lucene
>> >>> >>> >> indexes could scale along with how people will want to use
>> >>> >>> >> Infinispan. Probably worth playing with.
>> >>> >>> >
>> >>> >>> > Sure, this is the goal of Łukasz's work; we know Compass has
>> >>> >>> > some good Directories, but we're building our own, as one
>> >>> >>> > based on Infinispan is not yet available.
>> >>> >>> >
>> >>> >>> >>
>> >>> >>> >> Sent from my phone.
>> >>> >>> >>
>> >>> >>> >> On 13/09/2009, at 8:37 AM, Jeff Ramsdale
>> >>> >>> >> <jeff.ramsdale(a)gmail.com> wrote:
>> >>> >>> >>
>> >>> >>> >>> I'm afraid I haven't followed the Infinispan-Lucene
>> >>> >>> >>> implementation closely, but have you looked at the Compass
>> >>> >>> >>> Project? It provides a simplified interface to Lucene
>> >>> >>> >>> (optional) as well as Directory implementations built on
>> >>> >>> >>> Terracotta, Gigaspaces and Coherence. The latter, in
>> >>> >>> >>> particular, might be a useful guide for the Infinispan
>> >>> >>> >>> implementation. I believe it's mature enough to have solved
>> >>> >>> >>> many of the most difficult problems of implementing a
>> >>> >>> >>> Directory on a distributed Map.
>> >>> >>> >>>
>> >>> >>> >>> If someone has any experience with Compass (particularly its
>> >>> >>> >>> Directory implementations) I'd be interested in hearing
>> >>> >>> >>> about it...
>> >>> >>> >>> It's Apache 2.0 licensed, btw.
>> >>> >>> >>>
>> >>> >>> >>> -jeff
>> >>> >>> >>>
>> >>> >>> >>> _______________________________________________
>> >>> >>> >>> infinispan-dev mailing list
>> >>> >>> >>> infinispan-dev(a)lists.jboss.org