[infinispan-dev] Alert from a failing test

Galder Zamarreño galder at redhat.com
Tue Jun 28 07:05:17 EDT 2011


Some comments below:

On Jun 21, 2011, at 10:26 AM, Dan Berindei wrote:

> On Mon, Jun 20, 2011 at 11:42 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
>> 2011/6/20 Manik Surtani <manik at jboss.org>:
>>> Oddly enough, I don't see any other tests exhibiting this behaviour.  Let me know if you see it in more recent CI runs, and we'll investigate in detail.
>> 
>> In fact there are not many tests in core which verify a full stream is
>> received; but as I mentioned in another thread, I was seeing the
>> following exception relatively often (it never caught my attention
>> some months ago):
>> 
>> Caused by: java.io.EOFException: The stream ended unexpectedly.
>> Please check whether the source of the stream encountered any issues
>> generating the stream.
>>        at org.infinispan.marshall.VersionAwareMarshaller.objectFromObjectStream(VersionAwareMarshaller.java:193)
>>        at org.infinispan.statetransfer.StateTransferManagerImpl.processCommitLog(StateTransferManagerImpl.java:218)
>>        at org.infinispan.statetransfer.StateTransferManagerImpl.applyTransactionLog(StateTransferManagerImpl.java:245)
>>        at org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:284)
>>        ... 27 more
>> Caused by: java.io.EOFException: Read past end of file
>>        at org.jboss.marshalling.SimpleDataInput.eofOnRead(SimpleDataInput.java:126)
>>        at org.jboss.marshalling.SimpleDataInput.readUnsignedByteDirect(SimpleDataInput.java:263)
>>        at org.jboss.marshalling.SimpleDataInput.readUnsignedByte(SimpleDataInput.java:224)
>>        at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209)
>>        at org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:37)
>>        at org.infinispan.marshall.jboss.GenericJBossMarshaller.objectFromObjectStream(GenericJBossMarshaller.java:191)
>>        at org.infinispan.marshall.VersionAwareMarshaller.objectFromObjectStream(VersionAwareMarshaller.java:191)
>>        ... 30 more
>> 
> 
> The line "at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209)"
> suggests EOF has been reached while reading the lead byte of the
> object, not a partial object. This is consistent with
> StateTransferManagerImpl.generateTransactionLog() getting a timeout
> while trying to acquire the processing lock (at
> StateTransferManagerImpl.java:192) and closing the stream in the
> middle of the transaction log, not a transmission error. We could
> probably get rid of the exception in the logs by inserting another
> delimiter here.

You mean inserting a specific delimiter when acquiring the processing lock times out? That sounds like a good idea. I'm just back from holidays and can't remember well, but were previous EOFs related to processing lock acquisition? If so, I think your idea makes even more sense, because the EOF would then map to a known problem and the delimiter would give the receiver side a bit more information.
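
To make sure we're talking about the same thing, here's a rough, self-contained sketch of the idea using plain java.io object streams. It is not the actual StateTransferManagerImpl code or wire format: the class name, the two marker values and the lockAcquired flag are all made up for illustration. The writer always finishes the log with an explicit marker, and on a lock timeout it writes a different marker instead of just closing the stream, so the receiver can report the real cause rather than an EOFException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from Infinispan.
final class DelimitedLogSketch {
    // Assumed marker values; a real implementation would pick its own delimiters.
    static final String END_OF_LOG = "__END_OF_LOG__";
    static final String LOCK_TIMEOUT = "__LOCK_TIMEOUT__";

    // Writes the entries, then a marker saying how the log ended.
    // lockAcquired == false simulates a timeout on the processing lock.
    static byte[] writeLog(List<String> entries, boolean lockAcquired) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (String entry : entries) {
                out.writeObject(entry);
            }
            out.writeObject(lockAcquired ? END_OF_LOG : LOCK_TIMEOUT);
        }
        return bytes.toByteArray();
    }

    // Reads entries until a marker is found. A timeout marker is reported
    // explicitly instead of surfacing later as an unexplained EOFException.
    static List<String> readLog(byte[] data) throws IOException, ClassNotFoundException {
        List<String> entries = new ArrayList<>();
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            while (true) {
                Object next = in.readObject();
                if (END_OF_LOG.equals(next)) {
                    return entries;
                }
                if (LOCK_TIMEOUT.equals(next)) {
                    throw new IllegalStateException(
                        "Sender timed out acquiring the processing lock; log is incomplete");
                }
                entries.add((String) next);
            }
        }
    }
}
```

With something like this, an EOFException on the receiver would only be left for genuine truncation, which is the case we actually want to be alerted about.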

> 
> Back to the original problem, if this was a stream corruption issue
> I'd expect a lot more instances of deserializing errors because the
> length of the buffer was smaller/larger than the number of bytes
> following it and the next object to be deserialized from the stream
> found garbage instead.
> 
> This looks to me more like an index segment has been created with size
> x on node A and also on node B, then it was updated with size y > x on
> node A but only the metadata got to node B, the segment's byte array
> remained the same.
> 
> I don't know anything about the Lucene directory implementation yet,
> so I have no idea if/how this could happen and I haven't been able to
> reproduce it on my machine. Is there a way to see the Jenkins test
> logs?

There's the console log at http://goo.gl/JFh5R but if this happens relatively often in the Lucene dir impl, we could create a small Jenkins run and pass a -Dlog4j.configuration that prints TRACE to the console. The testsuite is small and would not generate a lot of logging. There's always a chance of not encountering the issue when TRACE is enabled, particularly if it's a race condition, but I think it's worth doing. IOW, I can set it up.

> 
> Dan
> 
> 
>> This looks like a suspicious correlation to me, as I think the
>> reported errors are similar in nature.
>> 
>> Cheers,
>> Sanne
>> 
>> 
>> 
>>> 
>>> On 18 Jun 2011, at 20:18, Sanne Grinovero wrote:
>>> 
>>>> Hello all,
>>>> I'm not in a position to fully debug the issue this week, but even though
>>>> this failure happens in the Lucene Directory it looks like it's
>>>> reporting an issue with Infinispan core:
>>>> 
>>>> https://infinispan.ci.cloudbees.com/job/Infinispan-master-JDK6-tcp/90/org.infinispan$infinispan-lucene-directory/testReport/junit/org.infinispan.lucene/SimpleLuceneTest/org_infinispan_lucene_SimpleLuceneTest_testIndexWritingAndFinding/
>>>> 
>>>> In this test we're writing to the index, and then asserting on the
>>>> expected state on both nodes, but while it is successful on the same
>>>> node as the writer, it fails with
>>>> "java.io.IOException: Read past EOF" on the second node.
>>>> 
>>>> This exception can mean only one thing: the value, which is a
>>>> buffer[], was not completely transferred to the second node, which
>>>> seems quite critical as the caches are using sync.
>>>> I can't reproduce the error locally, but it's not the first time it is
>>>> reported by CI: builds 60, 62, 65 for example (and more) show the same
>>>> testcase fail in the same manner.
>>>> 
>>>> Cheers,
>>>> Sanne
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> 
>>> --
>>> Manik Surtani
>>> manik at jboss.org
>>> twitter.com/maniksurtani
>>> 
>>> Lead, Infinispan
>>> http://www.infinispan.org
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
> 

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



