[infinispan-dev] [hibernate-dev] HSEARCH-1296

Sanne Grinovero sanne at infinispan.org
Mon Apr 15 06:30:36 EDT 2013


I've attached the logs to the JIRA.

Some replies inline:

On 15 April 2013 11:04, Dan Berindei <dan.berindei at gmail.com> wrote:
>
>
>
> On Sat, Apr 13, 2013 at 2:42 PM, Sanne Grinovero <sanne at infinispan.org>
> wrote:
>>
>> On 13 April 2013 11:20, Bela Ban <bban at redhat.com> wrote:
>> >
>> >
>> > On 4/13/13 2:02 AM, Sanne Grinovero wrote:
>> >
>> >> @All, the performance problem seemed to be caused by a problem in
>> >> JGroups, which I've logged here:
>> >> https://issues.jboss.org/browse/JGRP-1617
>> >
>> >
>> > Almost no information attached to the case :-( If it wasn't you, Sanne,
>> > I'd outright reject the case ...
>>
>> I wouldn't blame you, and am sorry for the lack of details: as I said,
>> it was very late, but I still preferred to share the observations we
>> had made so far.
>>
>> From all the experiments we made - and some good logs I'll clean up for
>> sharing - it's clear that the thread is not woken up even though the ACK
>> was already received.
>> And of course I wouldn't expect this to fail in a simple test as it
>> wouldn't have escaped you ;-) or at least you would have had earlier
>> reports.
>>
>> There are lots of complex moving parts in this scenario: a Muxed
>> JGroups Channel, the Application Server responsible for initializing
>> the stack, and some added magic from CapeDwarf itself: for one thing,
>> it's not clear to me exactly what configuration is being used.
>>
>
> Does CD also change the JGroups configuration? I thought it only tweaks the
> Infinispan cache configuration on deployment, and the JGroups channel is
> already started by the time the CD application is deployed.

CD uses a custom AS build and a custom AS configuration, so anything
could be different.
On top of that, it reconfigures some things programmatically.


>> Without a testcase we might not be 100% sure but it seems likely to be
>> an unexpected behaviour in JGroups, at least under some very specific
>> setup.
>>
>>
>> I'm glad to help track down more details of what could trigger
>> this, but I'm not too eager to write a full unit test for it, as it
>> involves a lot of other components; even with my own components
>> mocked out I could still reproduce it. It's not Hibernate Search, so
>> I'll need help from the field experts.
>>
>> Also, I suspect a test would need to depend on many more components: does
>> JGroups have an easy way to manage dependencies nowadays?
>>
>> some more inline:
>>
>> >
>> > The MessageDispatcher will *not* wait until the timeout kicks in, it'll
>> > return as soon as it has acks from all members of the target set. This
>> > works and is covered with a bunch of unit tests, so a regression would
>> > have been caught immediately.
>>
>> I don't doubt the "vanilla scenario", but this is what happens in the
>> more complex case of the CapeDwarf setup.
>>
>
> My first guess would be that the MuxRpcDispatcher on the second node hasn't
> started yet by the time you call castMessage on the first node. It could be
> that your workaround just delayed the message a little bit, until the
> MuxRpcDispatcher on the other node actually started (because the JChannel is
> already started on both nodes, but as long as the MuxRpcDispatcher isn't
> started on the 2nd node it won't send any responses back).

Before the point at which Search uses the dispatcher, many other
operations had already completed successfully and with reasonable
timing: in particular, some Infinispan transactions stored entries
quickly and without trouble.

Besides, if such a race condition were possible, I would consider it a
critical bug.
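
To make the scenario concrete, this is roughly the kind of blocking call
involved. It's only a minimal sketch against the plain JGroups 3.x
MessageDispatcher API - not our actual backend code nor the Mux wiring done
by the AS - and the class name, cluster name and "ack" payload are made up:

    import java.util.concurrent.TimeUnit;

    import org.jgroups.JChannel;
    import org.jgroups.Message;
    import org.jgroups.blocks.MessageDispatcher;
    import org.jgroups.blocks.RequestHandler;
    import org.jgroups.blocks.RequestOptions;
    import org.jgroups.blocks.ResponseMode;
    import org.jgroups.util.RspList;

    public class BlockingCastExample {

        public static void main(String[] args) throws Exception {
            // Plain (non-Mux) dispatcher; the AS/CapeDwarf setup goes through
            // a MuxRpcDispatcher instead, which must be started on every node.
            JChannel channel = new JChannel();        // default stack
            RequestHandler handler = msg -> "ack";    // every member replies
            MessageDispatcher dispatcher =
                    new MessageDispatcher(channel, null, null, handler);
            channel.connect("demo-cluster");

            // GET_ALL + 10s timeout: the call is supposed to return as soon
            // as all members of the view have replied, not after the full
            // timeout.
            RequestOptions opts = new RequestOptions(
                    ResponseMode.GET_ALL, TimeUnit.SECONDS.toMillis(10));
            long start = System.nanoTime();
            RspList<Object> rsps =
                    dispatcher.castMessage(null, new Message(null, "ping"), opts);
            long tookMs =
                    TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

            System.out.println("castMessage returned after " + tookMs
                    + " ms: " + rsps);
            channel.close();
        }
    }

If a member of the view never replies (for example because its dispatcher
isn't registered yet), the caller can only sit there until the timeout,
which would match what we observe.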


>> > I attached a test program to JGRP-1617 which shows that this feature
>> > works correctly.
>> >
>> > Of course, if you lose an ack (e.g. due to a maxed out incoming / OOB
>> > thread pool), the unicast protocol will have to retransmit the ack until
>> > it has been received. Depending on the unicast protocol you use, this
>> > will be immediate (UNICAST, UNICAST3), or based on a stability interval
>> > (UNICAST2).
>>
>> Right, it's totally possible this is a stack configuration problem in
>> the AS.
>> I wouldn't be the best person to ask about that though, as I don't even
>> understand the configuration format.
>>
>
> You can get the actual JGroups configuration with
> channel.getProtocolStack().printProtocolSpecAsXml(), but I wouldn't expect
> you to find any surprises there: they should use pretty much the JGroups
> defaults.

Nice tip. I'll add this as a logging option.
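
Something along these lines - just a sketch of the logging option, where the
class name and the use of stdout are placeholders for whatever logging ends
up being used, and the channel is assumed to be already connected:

    import org.jgroups.JChannel;

    public final class JGroupsStackDump {

        // Dump the effective protocol stack of a running channel, using the
        // method Dan mentioned above.
        public static void dump(JChannel channel) {
            String xml = channel.getProtocolStack().printProtocolSpecAsXml();
            System.out.println("Effective JGroups protocol stack:\n" + xml);
        }
    }

That should make it easy to compare what the AS/CapeDwarf stack actually
looks like against the JGroups defaults.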

>
> By default STABLE.desired_avg_gossip is 20s and STABLE.stability_delay is
> 6s, so even if the message was lost it should take < 30s for the message to
> be resent.

The delay is actually ~10 seconds per RPC, so still < 30s.
The overall test takes about 60 seconds because there are 6 such
operations being performed (6 x ~10s).


>> >> For the record, the first operation was indeed triggering some lazy
>> >> initialization of indexes, which in turn would trigger a Lucene
>> >> Directory being started, triggering 3 Cache starts which in turn would
>> >> trigger 6 state transfer processes: so indeed the first operation
>> >> would not be exactly "cheap" performance wise, still this would
>> >> complete in about 120 milliseconds.
>> >
>> > This sounds very low for the work you describe above. I don't think 6
>> > state transfers can be completed in 120ms, unless they're async (but
>> > then that means they're not done when you return). Also, cache starts
>> > (wrt JGroups) will definitely take more than a few seconds if you're the
>> > first cluster node...
>>
>> It's a unit test: the caches are initially empty and networking is
>> loopback; on the second round there are some ~6 elements in the cache,
>> none larger than a ~10 character string.
>> Does that sound reasonable?
>>
>
> Yes, I think it's reasonable, if the JChannel was already started before the
> CD application was deployed. Starting the first JChannel would take at least
> 3s, which is the default PING.timeout.
>
>
>>
>> >> Not being sure about the options of depending to a newer JGroups
>> >> release or the complexity of a fix, I'll implement a workaround in
>> >> HSearch in the scope of HSEARCH-1296.
>> >
>> >
>> > If you add more information to JGRP-1617, I'll take a look. This would
>> > be a critical bug in JGroups *if* you can prove that the
>> > MessageDispatcher always runs into the timeout (I don't think you can
>> > though !).
>>
>> Considering the easy workaround, and that this definitely needs
>> something special in the configuration, I wouldn't consider it too
>> critical? As far as we know now, it's entirely possible the
>> configuration being used is invalid. But this is exactly where I need
>> your help ;-)
>>
>
> I'm not sure that your workaround is 100% effective: even if it doesn't
> happen in this test, it's always possible to have the app deployed on some
> of the nodes in the cluster, but not all.

That's right. I would prefer not to apply anything like that, but for
the sake of the experiment it was useful for isolating the problem.
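
Just to illustrate the kind of gap Dan is pointing at - this is not the
actual workaround from HSEARCH-1296, only a sketch, and the class and method
names are invented - after a blocking castMessage returns, the RspList at
least tells us which members never answered, for example a node whose
channel is up but whose dispatcher isn't registered yet:

    import java.util.Map;

    import org.jgroups.Address;
    import org.jgroups.util.Rsp;
    import org.jgroups.util.RspList;

    public final class MissingResponderCheck {

        // Log every member that neither replied nor was suspected: with a
        // partial deployment these are the nodes that would make the caller
        // wait for the full timeout.
        public static void warnAboutMissingResponders(RspList<Object> rsps) {
            for (Map.Entry<Address, Rsp<Object>> entry : rsps.entrySet()) {
                Rsp<Object> rsp = entry.getValue();
                if (!rsp.wasReceived() && !rsp.wasSuspected()) {
                    System.out.println("No response from " + entry.getKey()
                            + " - is the dispatcher started there?");
                }
            }
        }
    }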

Cheers,
Sanne

