[infinispan-dev] [hibernate-dev] HSEARCH-1296

Dan Berindei dan.berindei at gmail.com
Mon Apr 15 08:35:56 EDT 2013


On Mon, Apr 15, 2013 at 1:30 PM, Sanne Grinovero <sanne at infinispan.org> wrote:

> I've attached the logs to the JIRA.
>
> Some replies inline:
>
> On 15 April 2013 11:04, Dan Berindei <dan.berindei at gmail.com> wrote:
> >
> >
> >
> > On Sat, Apr 13, 2013 at 2:42 PM, Sanne Grinovero <sanne at infinispan.org>
> > wrote:
> >>
> >> On 13 April 2013 11:20, Bela Ban <bban at redhat.com> wrote:
> >> >
> >> >
> >> > On 4/13/13 2:02 AM, Sanne Grinovero wrote:
> >> >
> >> >> @All, the performance problem seemed to be caused by a problem in
> >> >> JGroups, which I've logged here:
> >> >> https://issues.jboss.org/browse/JGRP-1617
> >> >
> >> >
> >> > Almost no information attached to the case :-( If it wasn't you, Sanne,
> >> > I'd outright reject the case ...
> >>
> >> I wouldn't blame you, and I am sorry for the lack of details: as I said,
> >> it was very late, but I still preferred to share the observations we
> >> made so far.
> >>
> >> From all the experiments we made - and some good logs I'll clean up for
> >> sharing - it's clear that the thread is not woken up even though the ACK
> >> was already received.
> >> And of course I wouldn't expect this to fail in a simple test as it
> >> wouldn't have escaped you ;-) or at least you would have had earlier
> >> reports.
> >>
> >> There are lots of complex moving parts in this scenario: a Muxed JGroups
> >> Channel, the Application Server responsible for initializing the stack,
> >> and some added magic from CapeDwarf itself; it's not clear to me, for
> >> one, what configuration exactly is being used.
> >>
> >
> > Does CD also change the JGroups configuration? I thought it only tweaks
> > the Infinispan cache configuration on deployment, and the JGroups channel
> > is already started by the time the CD application is deployed.
>
> CD uses a custom AS build and a custom AS configuration, so anything
> could be different.
> On top of that, some things are reconfigured programmatically by it.
>
>
Ales already cleared this up: CD doesn't change the JGroups config at all.


> >> Without a testcase we might not be 100% sure, but it seems likely to be
> >> an unexpected behaviour in JGroups, at least under some very specific
> >> setup.
> >>
> >>
> >> I'm glad to help track down more details of what could trigger this,
> >> but I'm not too eager to write a full unit test for it, as it involves
> >> a lot of other components, and by mocking my own components out I could
> >> still reproduce it: it's not Hibernate Search, so I'll need the help of
> >> the field experts.
> >>
> >> Also I suspect a test would need to depend on many more components: does
> >> JGroups have an easy way to manage dependencies nowadays?
> >>
> >> some more inline:
> >>
> >> >
> >> > The MessageDispatcher will *not* wait until the timeout kicks in; it'll
> >> > return as soon as it has acks from all members of the target set. This
> >> > works and is covered with a bunch of unit tests, so a regression would
> >> > have been caught immediately.
> >>
> >> I don't doubt the "vanilla scenario", but this is what happens in the
> >> more complex case of the CapeDwarf setup.
> >>
> >
> > My first guess would be that the MuxRpcDispatcher on the second node
> > hasn't started yet by the time you call castMessage on the first node. It
> > could be that your workaround just delayed the message a little bit,
> > until the MuxRpcDispatcher on the other node actually started (because
> > the JChannel is already started on both nodes, but as long as the
> > MuxRpcDispatcher isn't started on the 2nd node it won't send any
> > responses back).
>
> Before the point at which Search uses the dispatcher, many more
> operations happened successfully and with reasonable timing: in
> particular, some transactions on Infinispan stored entries quickly and
> without trouble.
>
> Besides, if such a race condition were possible, I would consider it a
> critical bug.
>
>
I looked at the muxer code and I think they actually take care of this
already: MuxUpHandler returns a NoMuxHandler response when it can't find an
appropriate MuxRpcDispatcher, and MessageDispatcher counts that response
against the number of expected responses. So it must be something else...
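
For illustration, here's a minimal sketch of that muxing setup, assuming the
JGroups 3.x mux API; the SearchHandler/OtherHandler server objects are made
up for the example:

    import org.jgroups.JChannel;
    import org.jgroups.blocks.mux.MuxRpcDispatcher;
    import org.jgroups.blocks.mux.MuxUpHandler;

    JChannel channel = new JChannel();          // one channel shared by all services
    channel.setUpHandler(new MuxUpHandler());   // demultiplexes requests by mux id
    channel.connect("demo-cluster");

    // Each service registers under its own mux id. If a request arrives for
    // an id with no dispatcher registered, the MuxUpHandler answers with a
    // NoMuxHandler response, and the sender's MessageDispatcher counts that
    // against the number of expected responses instead of waiting.
    MuxRpcDispatcher searchDispatcher =
        new MuxRpcDispatcher((short) 1, channel, null, null, new SearchHandler());
    MuxRpcDispatcher otherDispatcher =
        new MuxRpcDispatcher((short) 2, channel, null, null, new OtherHandler());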



>
> >> > I attached a test program to JGRP-1617 which shows that this feature
> >> > works correctly.
> >> >
> >> > Of course, if you lose an ack (e.g. due to a maxed out incoming / OOB
> >> > thread pool), the unicast protocol will have to retransmit the ack
> >> > until it has been received. Depending on the unicast protocol you use,
> >> > this will be immediate (UNICAST, UNICAST3), or based on a stability
> >> > interval (UNICAST2).
> >>
> >> Right, it's totally possible this is a stack configuration problem in
> >> the AS. I wouldn't be the best person to ask about that though, as I
> >> don't even understand the configuration format.
> >>
> >
> > You can get the actual JGroups configuration with
> > channel.getProtocolStack().printProtocolSpecAsXml(), but I wouldn't
> > expect you to find any surprises there: they should use pretty much the
> > JGroups defaults.
>
> Nice tip. I'll add this as a logging option.
>
> >
> > By default STABLE.desired_avg_gossip is 20s and STABLE.stability_delay
> > is 6s, so even if the message was lost it should take < 30s for the
> > message to be resent.
>
> The delay is actually ~10 seconds per RPC, so still <30s.
> The reason the overall test takes 60 seconds is that there are 6
> operations being performed.
>
>
Ok, this means my hunch definitely wasn't right: that would have explained
only the first request timing out.
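
By the way, the logging option mentioned above could look roughly like this
(a sketch; channel is assumed to be the already-connected JChannel):

    // Dump the effective stack so settings such as STABLE.desired_avg_gossip
    // (default 20s) and STABLE.stability_delay (default 6s) can be verified
    // from the logs instead of guessed from the AS configuration.
    String stackXml = channel.getProtocolStack().printProtocolSpecAsXml();
    System.out.println(stackXml);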



>
> >> >> For the record, the first operation was indeed triggering some lazy
> >> >> initialization of indexes, which in turn would trigger a Lucene
> >> >> Directory being started, triggering 3 Cache starts which in turn would
> >> >> trigger 6 state transfer processes: so indeed the first operation
> >> >> would not be exactly "cheap" performance-wise, still this would
> >> >> complete in about 120 milliseconds.
> >> >
> >> > This sounds very low for the work you describe above. I don't think 6
> >> > state transfers can be completed in 120ms, unless they're async (but
> >> > then that means they're not done when you return). Also, cache starts
> >> > (wrt JGroups) will definitely take more than a few seconds if you're
> >> > the first cluster node...
> >>
> >> It's a unit test: the caches are initially empty and networking is
> >> loopback; on the second round there are ~6 elements in the cache, none
> >> larger than ~10-character strings.
> >> Should be reasonable?
> >>
> >
> > Yes, I think it's reasonable, if the JChannel was already started before
> > the CD application was deployed. Starting the first JChannel would take
> > at least 3s, which is the default PING.timeout.
> >
> >
> >>
> >> >> Not being sure about the options of depending to a newer JGroups
> >> >> release or the complexity of a fix, I'll implement a workaround in
> >> >> HSearch in the scope of HSEARCH-1296.
> >> >
> >> >
> >> > If you add more information to JGRP-1617, I'll take a look. This would
> >> > be a critical bug in JGroups *if* you can prove that the
> >> > MessageDispatcher always runs into the timeout (I don't think you can,
> >> > though!).
> >>
> >> Considering the easy workaround, and that this definitely needs
> >> something special in the configuration, I wouldn't consider it too
> >> critical? For all we know now, it's entirely possible the configuration
> >> being used is illegal. But this is exactly where I need your help ;-)
> >>
> >
> > I'm not sure that your workaround is 100% effective; even if it doesn't
> > happen in this test, it's always possible to have the app deployed on
> > some of the nodes in the cluster, but not all.
>
> That's right. I would prefer not to apply anything like that, but for
> the sake of the experiment it was useful to isolate the problem.
>
>
Looking at your workaround, I think you actually set the response mode to
GET_NONE (because that's the default value in RequestOptions), so you're
back to sending an asynchronous request.
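
To make the difference concrete, a rough sketch against the JGroups 3.x API
(dispatcher and payload are assumed to exist already):

    import org.jgroups.Message;
    import org.jgroups.blocks.RequestOptions;
    import org.jgroups.blocks.ResponseMode;
    import org.jgroups.util.RspList;

    // new RequestOptions() leaves the mode at GET_NONE: castMessage() fires
    // the request and returns immediately, without waiting for any acks.
    RequestOptions async = new RequestOptions();

    // For a synchronous RPC the mode has to be set explicitly, e.g. GET_ALL
    // with a 10s timeout; castMessage() then blocks until all targets have
    // replied or the timeout expires.
    RequestOptions sync = new RequestOptions(ResponseMode.GET_ALL, 10000);
    RspList<Object> rsps =
        dispatcher.castMessage(null, new Message(null, null, payload), sync);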