You can try to increase the TURNS_NUM (I've tried with 1000) and THREADS_NUM (200) fields in InfinispanDirectoryTest to make it more probable. The same problem also appears in InfinispanDirectoryProviderTest.<br><br>An example stacktrace is:<br>
<br>21:22:44,441 ERROR InfinispanDirectoryTest:142 - Error<br>java.io.IOException: File [ segments_nl ] for index [ indexName ] was not found<br> at org.hibernate.search.store.infinispan.InfinispanIndexIO$InfinispanIndexInput.<init>(InfinispanIndexIO.java:79)<br>
at org.hibernate.search.store.infinispan.InfinispanDirectory.openInput(InfinispanDirectory.java:201)<br> at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)<br> at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:95)<br>
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)<br> at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:115)<br> at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)<br>
at org.apache.lucene.index.IndexReader.open(IndexReader.java:227)<br> at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:55)<br> at org.hibernate.search.test.directoryProvider.infinispan.CacheTestSupport.doReadOperation(CacheTestSupport.java:106)<br>
at org.hibernate.search.test.directoryProvider.infinispan.InfinispanDirectoryTest$InfinispanDirectoryThread.run(InfinispanDirectoryTest.java:130)<br><br>Cheers,<br>Lukasz<br><br><div class="gmail_quote">2009/9/27 Sanne Grinovero <span dir="ltr"><<a href="mailto:sanne.grinovero@gmail.com">sanne.grinovero@gmail.com</a>></span><br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Hi Łukasz,<br>
I'm unable to reproduce the problem. You said it happens randomly;<br>
I've tried several times<br>
and I'm not getting errors. Do you know of something I could do to make it happen?<br>
Could you share a stacktrace?<br>
<br>
Anyway, if you are confident it's about the segments getting lost while<br>
they are still being read,<br>
you could introduce a per-segment usage counter: it starts at<br>
value 1 to mark the segment<br>
as "most current", gets +1 for each reader opening it, -1 for each reader<br>
closing it, and -1 when it is deleted.<br>
Each decrement should check whether the value has reached 0 and only then really delete the segment;<br>
this counting would be easy to add inside the Directory.<br>
When opening a new IndexReader, you<br>
1) get the SegmentInfos<br>
2) increment all counters (eager lock; verify each is >0, otherwise retry: set the changed<br>
counters back and get a new SegmentInfos, i.e. go back to 1)<br>
3) get the needed segments<br>
<br>
Getting a counter should be much faster than getting a segment when<br>
the data has to be downloaded<br>
from another node, so the counter can live under a different key that still relates<br>
to the segment.<br>
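<br>
A minimal Java sketch of the counting scheme described above. The class and method names (SegmentUsageCounters, register, acquire, release, acquireAll) are hypothetical, and a local ConcurrentHashMap stands in for the Infinispan cache; none of this is existing Lucene, Infinispan or Hibernate Search API:<br>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical per-segment usage counters, keyed by segment name. */
class SegmentUsageCounters {

    // Assumption: in a real Directory this map would be the distributed cache.
    private final ConcurrentMap<String, Integer> counters = new ConcurrentHashMap<>();

    /** A newly written segment starts at 1, marking it as "most current". */
    void register(String segment) {
        counters.putIfAbsent(segment, 1);
    }

    /** +1 for a reader. Returns false if the segment is already gone. */
    boolean acquire(String segment) {
        return counters.computeIfPresent(segment, (k, v) -> v + 1) != null;
    }

    /**
     * -1 for a reader close or a writer delete. Returns true when the count
     * reached 0, meaning the caller may now really delete the segment data.
     */
    boolean release(String segment) {
        boolean[] reachedZero = {false};
        counters.computeIfPresent(segment, (k, v) -> {
            if (v <= 1) {
                reachedZero[0] = true;
                return null; // returning null removes the counter atomically
            }
            return v - 1;
        });
        return reachedZero[0];
    }

    /**
     * Step 2 of opening a reader: increment every counter, or set the changed
     * counters back and report failure so the caller can fetch a fresh
     * segment list and retry.
     */
    boolean acquireAll(List<String> segments) {
        List<String> taken = new ArrayList<>();
        for (String segment : segments) {
            if (acquire(segment)) {
                taken.add(segment);
            } else {
                // A real Directory would also delete data for any segment
                // whose release(...) returns true during this rollback.
                for (String t : taken) {
                    release(t);
                }
                return false;
            }
        }
        return true;
    }
}
```

Whenever release returns true, the real Directory would remove the segment's data from the cache; a reader whose acquireAll returns false would re-read the segment list and try again.<br>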
<br>
Sanne<br>
<br>
2009/9/23 Łukasz Moreń <<a href="mailto:lukasz.moren@gmail.com">lukasz.moren@gmail.com</a>>:<br>
<div><div></div><div class="h5">> I agree that Infinispan case is not much different from RamDirectory. The<br>
> major difference is that in RD (and also in FileDirectory) changes are not batched<br>
> as they are in ID. If I do not wrap the changes in InfinispanDirectory (simply remove<br>
> tx.begin() from the obtain() method and tx.commit() from release() in<br>
> InfinispanLock) and instead immediately commit every change made by the IW, it works<br>
> well. However, that makes indexing much slower because of the frequent<br>
> replication to other nodes.<br>
> Sanne, it's a good remark that an IW commit is a kind of flush.<br>
><br>
> I've attached a patch with InfinispanDirectory; the failing test is<br>
> testDirectoryWithMultipleThreads in the InfinispanDirectoryTest class. It fails<br>
> randomly. I think the problem is that the Infinispan commit on lockRelease() in<br>
> org.apache.lucene.index.IndexWriter (line 1658) happens after the IW commit() (line<br>
> 1654).<br>
><br>
>> Is it because the IndexWriter only cleans files if no IndexReaders are<br>
>> reading them (how would that be detected)?<br>
><br>
> It can happen if the IndexWriter cleans a file and an IndexReader then tries to access that<br>
> cleaned file.<br>
><br>
> 2009/9/23 Sanne Grinovero <<a href="mailto:sanne.grinovero@gmail.com">sanne.grinovero@gmail.com</a>><br>
>><br>
>> I agree it should work the same way. The IndexWriter cleans files<br>
>> whenever it likes to; it doesn't try to detect readers, and this<br>
>> shouldn't have any effect on the working of readers.<br>
>> The IndexReader opens the "SegmentInfos" first, and immediately<br>
>> after** gets a reference to the segments listed in this SegmentInfos.<br>
>> No IndexWriter will ever change an existing segment; it only adds new<br>
>> files or eventually deletes old ones (segment merge, optimize).<br>
>> The deletion of segments is the interesting subject: when using files<br>
>> it relies on "delete at last close", which works because the IRs needing a segment<br>
>> have it opened already**; when using the RAMDirectory they hold a<br>
>> reference preventing garbage collection.<br>
>><br>
>> ( the two "**" are assuming the same event occurred correctly,<br>
>> otherwise an exception is thrown at opening)<br>
>><br>
>> When using Infinispan, shouldn't it be much the same as with the<br>
>> RAMDirectory? Even if the needed segment is deleted, the IR holds a<br>
>> reference to the Java object locally since it was opened.<br>
>><br>
>> Łukasz, do you have a failing test?<br>
>><br>
>> Sanne<br>
>><br>
>> 2009/9/23 Emmanuel Bernard <<a href="mailto:emmanuel@hibernate.org">emmanuel@hibernate.org</a>>:<br>
>> > Conceptually I don't understand why it works in a pure file system<br>
>> > directory (i.e. the IndexReader can go and process queries while the<br>
>> > IndexWriter<br>
>> > goes about its business) and not when using Infinispan.<br>
>> > Is it because the IndexWriter only cleans files if no IndexReaders are<br>
>> > reading them (how would that be detected)?<br>
>> > On 22 sept. 09, at 20:46, Łukasz Moreń wrote:<br>
>> ><br>
>> > I need to provide the same lifecycle for the IndexWriter as for the Infinispan<br>
>> > tx:<br>
>> > when the IW is created the tx is started, and when the IW is committed the tx is committed. This ensures<br>
>> > that an IndexReader doesn't read old data from the directory.<br>
>> > The Infinispan transaction can be started when the IW acquires the lock, but<br>
>> > committing it on IW lock release, as is done so far, causes a problem:<br>
>> ><br>
>> > index writer close {<br>
>> > index writer commit(); //changes are visible for IndexReaders<br>
>> ><br>
>> > //Index reader starts reading here, i.e. tries to access file "A"<br>
>> ><br>
>> > index writer lockRelease(); //changes in Infinispan directory are<br>
>> > committed, file "A" was removed, IndexReader cannot find it and crashes<br>
>> > }<br>
>> ><br>
>> > I think the Infinispan tx has to be committed just before the IW commit, and the<br>
>> > problem is where to put that in the code.<br>
>> ><br>
>> > On 22 September 2009 at 18:24, Emmanuel Bernard<br>
>> > <<a href="mailto:emmanuel@hibernate.org">emmanuel@hibernate.org</a>> wrote:<br>
>> >><br>
>> >> Can you explain in more detail what is going on?<br>
>> >> Aside from that, Workspace has been Sanne's baby lately, so he will be<br>
>> >> the<br>
>> >> best to see what design will work in HSearch. That being said, I don't<br>
>> >> like<br>
>> >> the idea of subclassing / overriding very much. In my experience, it<br>
>> >> has<br>
>> >> led to more bad and unmaintainable code than anything else.<br>
>> >> On 22 sept. 09, at 02:16, Łukasz Moreń wrote:<br>
>> >><br>
>> >> Hi,<br>
>> >><br>
>> >> Thanks for the explanation.<br>
>> >> Maybe it's better that I concentrate on the first release and postpone<br>
>> >> distributed writing.<br>
>> >><br>
>> >> There is already a LockStrategy that uses Infinispan. Using it, I was<br>
>> >> wrapping the changes made by the IndexWriter in an Infinispan transaction<br>
>> >> for performance reasons:<br>
>> >> on lock obtain the transaction was started, and on lock release the<br>
>> >> transaction was committed. However, committing the Ispn transaction on<br>
>> >> lock release is not a good idea,<br>
>> >> since the IndexWriter calls the index commit before the lock is released (and<br>
>> >> the Ispn transaction is committed).<br>
>> >> I was thinking of overriding the Workspace class and its getIndexWriter (start<br>
>> >> Infinispan tx) and commitIndexWriter (commit tx) methods to wrap the<br>
>> >> IndexWriter<br>
>> >> lifecycle, but this needs a few other changes. Any other ideas?<br>
>> >><br>
>> >> Cheers,<br>
>> >> Lukasz<br>
>> >><br>
>> >> 2009/9/21 Sanne Grinovero <<a href="mailto:sanne.grinovero@gmail.com">sanne.grinovero@gmail.com</a>><br>
>> >>><br>
>> >>> Hi Łukasz,<br>
>> >>> you have rightful concerns: the way the IndexWriter tries to<br>
>> >>> acquire the lock<br>
>> >>> will bring some trouble. As far as I remember, we decided to avoid<br>
>> >>> multiple writer nodes in this first release<br>
>> >>> for this reason<br>
>> >>> (is that written in your docs?)<br>
>> >>><br>
>> >>> Actually it shouldn't be very hard to do, as the LockStrategy is<br>
>> >>> pluggable (see changes from HSEARCH-345)<br>
>> >>> and you could implement one delegating to an Infinispan eager lock on<br>
>> >>> some key,<br>
>> >>> like the default LockStrategy takes a file lock in the index<br>
>> >>> directory.<br>
>> >>><br>
>> >>> Maybe it's simpler to support this distributed writing instead of<br>
>> >>> sending the queue to some single<br>
>> >>> (elected) node? Would be cool, as the Document Analysis effort would<br>
>> >>> be distributed,<br>
>> >>> but I have no idea if this would be more or less efficient than a<br>
>> >>> single node writing; it could<br>
>> >>> bring some huge data transfers along the wire during segments merging<br>
>> >>> (basically fetching<br>
>> >>> the whole index data at each node performing a segment merge); maybe<br>
>> >>> you'll need to<br>
>> >>> play with the IndexWriter settings<br>
>> >>> (<a href="http://docs.jboss.org/hibernate/stable/search/reference/en/html_single/#lucene-indexing-performance" target="_blank">http://docs.jboss.org/hibernate/stable/search/reference/en/html_single/#lucene-indexing-performance</a>);<br>
>> >>> you will probably need to find the sweet spot for "merge_factor".<br>
>> >>> I just saw that MergePolicy is now re-implementable, but I hope<br>
>> >>> that won't be needed.<br>
>> >>><br>
>> >>> Sanne<br>
>> >>><br>
>> >>> 2009/9/21 Łukasz Moreń <<a href="mailto:lukasz.moren@gmail.com">lukasz.moren@gmail.com</a>>:<br>
>> >>> > Hi,<br>
>> >>> ><br>
>> >>> > I'm wondering whether it is reasonable to have multiple threads/nodes<br>
>> >>> > that modify indexes in a Lucene Directory based on Infinispan. Let's<br>
>> >>> > assume that<br>
>> >>> > two nodes try to update the index at the same time. The first one creates an<br>
>> >>> > IndexWriter and obtains the<br>
>> >>> > write lock. There is a high probability that the second node throws a<br>
>> >>> > LockObtainFailedException (as only one IndexWriter is allowed on a single<br>
>> >>> > index)<br>
>> >>> > and the index is not modified. How should that be handled? Should there<br>
>> >>> > always be only one node that<br>
>> >>> > makes changes in<br>
>> >>> > the index?<br>
>> >>> ><br>
>> >>> > Cheers,<br>
>> >>> > Lukasz<br>
>> >>> ><br>
>> >>> > On 15 September 2009 at 01:39, Łukasz Moreń<br>
>> >>> > <<a href="mailto:lukasz.moren@gmail.com">lukasz.moren@gmail.com</a>> wrote:<br>
>> >>> >><br>
>> >>> >> Hi,<br>
>> >>> >><br>
>> >>> >> Using JMeter, I wanted to check that the Infinispan dir does not crash<br>
>> >>> >> under heavy load in "real" use and to compare its performance with<br>
>> >>> >> none/other directories.<br>
>> >>> >> However, a problem appeared when multiple IndexWriters try to modify the<br>
>> >>> >> index (test InfinispanDirectoryTest): random deadlocks and Lucene<br>
>> >>> >> exceptions.<br>
>> >>> >> An IndexWriter tries to access files in the index that were removed<br>
>> >>> >> before. I'm<br>
>> >>> >> looking into it, but I don't have a good idea yet.<br>
>> >>> >><br>
>> >>> >> Concerning the last part, I think a similar thing is done in<br>
>> >>> >> InfinispanDirectoryProviderTest. Many threads are making changes<br>
>> >>> >> and<br>
>> >>> >> searching (not checking whether the db is in sync with the index).<br>
>> >>> >> Once the threads finish their work, I check with a Lucene query whether<br>
>> >>> >> the index<br>
>> >>> >> contains as many results as expected. Maybe you meant something<br>
>> >>> >> else?<br>
>> >>> >> It would be good to run each node in a different VM.<br>
>> >>> >><br>
>> >>> >>> Great ! Looking forward to it. What state are things in at the<br>
>> >>> >>> moment<br>
>> >>> >>> if I want to play around with it ?<br>
>> >>> >><br>
>> >>> >> It should work with one master (updates the index) and one or many slave<br>
>> >>> >> nodes<br>
>> >>> >> (send changes to the master). I tried with one master and one slave<br>
>> >>> >> (both with the<br>
>> >>> >> jms and jgroups backends) and it worked ok. It still fails if multiple<br>
>> >>> >> nodes<br>
>> >>> >> want<br>
>> >>> >> to modify the index.<br>
>> >>> >><br>
>> >>> >> I've attached a patch with the current version.<br>
>> >>> >><br>
>> >>> >> Cheers,<br>
>> >>> >> Łukasz<br>
>> >>> >><br>
>> >>> >> 2009/9/13 Michael Neale <<a href="mailto:michael.neale@gmail.com">michael.neale@gmail.com</a>><br>
>> >>> >>><br>
>> >>> >>> Great ! Looking forward to it. What state are things in at the<br>
>> >>> >>> moment<br>
>> >>> >>> if I want to play around with it ?<br>
>> >>> >>><br>
>> >>> >>> Sent from my phone.<br>
>> >>> >>><br>
>> >>> >>> On 13/09/2009, at 7:26 PM, Sanne Grinovero<br>
>> >>> >>> <<a href="mailto:sanne.grinovero@gmail.com">sanne.grinovero@gmail.com</a>><br>
>> >>> >>> wrote:<br>
>> >>> >>><br>
>> >>> >>> > 2009/9/12 Michael Neale <<a href="mailto:michael.neale@gmail.com">michael.neale@gmail.com</a>>:<br>
>> >>> >>> >> That does sound pretty cool. Would be nice if the lucene<br>
>> >>> >>> >> indexes<br>
>> >>> >>> >> could scale along with how people will want to use infinispan.<br>
>> >>> >>> >> Probably worth playing with.<br>
>> >>> >>> ><br>
>> >>> >>> > Sure, this is the goal of Łukasz's work; We know compass has<br>
>> >>> >>> > some good Directories, but we're building our own as one based<br>
>> >>> >>> > on Infinispan is not yet available.<br>
>> >>> >>> ><br>
>> >>> >>> >><br>
>> >>> >>> >> Sent from my phone.<br>
>> >>> >>> >><br>
>> >>> >>> >> On 13/09/2009, at 8:37 AM, Jeff Ramsdale<br>
>> >>> >>> >> <<a href="mailto:jeff.ramsdale@gmail.com">jeff.ramsdale@gmail.com</a>><br>
>> >>> >>> >> wrote:<br>
>> >>> >>> >><br>
>> >>> >>> >>> I'm afraid I haven't followed the Infinispan-Lucene<br>
>> >>> >>> >>> implementation<br>
>> >>> >>> >>> closely, but have you looked at the Compass Project?<br>
>> >>> >>> >>> (<a href="http://www.compass-project.org/overview.html" target="_blank">http://www.compass-project.org/overview.html</a>) It provides a<br>
>> >>> >>> >>> simplified interface to Lucene (optional) as well as Directory<br>
>> >>> >>> >>> implementations built on Terracotta, Gigaspaces and Coherence.<br>
>> >>> >>> >>> The<br>
>> >>> >>> >>> latter, in particular, might be a useful guide for the<br>
>> >>> >>> >>> Infinispan<br>
>> >>> >>> >>> implementation. I believe it's mature enough to have solved<br>
>> >>> >>> >>> many<br>
>> >>> >>> >>> of<br>
>> >>> >>> >>> the most difficult problems of implementing Directory on a<br>
>> >>> >>> >>> distributed<br>
>> >>> >>> >>> Map.<br>
>> >>> >>> >>><br>
>> >>> >>> >>> If someone has any experience with Compass (particularly its<br>
>> >>> >>> >>> Directory implementations) I'd be interested in hearing about<br>
>> >>> >>> >>> it...<br>
>> >>> >>> >>> It's Apache 2.0 licensed, btw.<br>
>> >>> >>> >>><br>
>> >>> >>> >>> -jeff<br>
>> >>> >>> >>> _______________________________________________<br>
>> >>> >>> >>> infinispan-dev mailing list<br>
>> >>> >>> >>> <a href="mailto:infinispan-dev@lists.jboss.org">infinispan-dev@lists.jboss.org</a><br>
>> >>> >>> >>> <a href="https://lists.jboss.org/mailman/listinfo/infinispan-dev" target="_blank">https://lists.jboss.org/mailman/listinfo/infinispan-dev</a><br>
</div></div></blockquote></div><br>