<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Sat, Apr 13, 2013 at 2:42 PM, Sanne Grinovero <span dir="ltr">&lt;<a href="mailto:sanne@infinispan.org" target="_blank">sanne@infinispan.org</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="im">On 13 April 2013 11:20, Bela Ban &lt;<a href="mailto:bban@redhat.com">bban@redhat.com</a>&gt; wrote:<br>


&gt;<br>

&gt;<br>

&gt; On 4/13/13 2:02 AM, Sanne Grinovero wrote:<br>

&gt;<br>

&gt;&gt; @All, the performance problem seemed to be caused by a problem in<br>

&gt;&gt; JGroups, which I&#39;ve logged here:<br>

&gt;&gt; <a href="https://issues.jboss.org/browse/JGRP-1617" target="_blank">https://issues.jboss.org/browse/JGRP-1617</a><br>

&gt;<br>

&gt;<br>

&gt; Almost no information attached to the case :-( If it wasn&#39;t you, Sanne,<br>

&gt; I&#39;d outright reject the case ...<br>

<br>

</div>I wouldn&#39;t blame you, and am sorry for the lack of details: as I said<br>

it was very late, still I preferred to share the observations we made<br>

so far.<br>

<br>

&gt;From all the experiments we made - and some good logs I&#39;ll cleanup for<br>

sharing - it&#39;s clear that the thread is not woken up while the ACK was<br>

already received.<br>

And of course I wouldn&#39;t expect this to fail in a simple test as it<br>

wouldn&#39;t have escaped you ;-) or at least you would have had earlier<br>

reports.<br>

<br>

There are lots of complex moving parts in this scenario: from a Muxed<br>

JGroups Channel, and the Application Server responsible for<br>

initializing the stack with some added magic from CapeDwarf itself:<br>

it&#39;s not clear to me what configuration is exactly being used, for<br>

one.<br>

<br></blockquote><div><br></div><div>Does CD also change the JGroups configuration? I thought it only tweaks the Infinispan cache configuration on deployment, and the JGroups channel is already started by the time the CD application is deployed.<br>


<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

Without a testcase we might not be 100% sure but it seems likely to be<br>

an unexpected behaviour in JGroups, at least under some very specific<br>

setup.<br> 

<br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

I&#39;m glad to help tracking down more details of what could trigger<br>

this, but I&#39;m not too eager to write a full unit test for this as it<br>

involves a lot of other components, and by mocking my own components<br>

out I could still reproduce it: it&#39;s not Hibernate Search, so I&#39;ll<br>

need the help from the field experts.<br>

<br>

Also I suspect a test would need to depend on many more components: is<br>

JGroups having an easy way to manage dependencies nowadays?<br>

<br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

some more inline:<br>

<div class="im"><br>

&gt;<br>

&gt; The MessageDispatcher will *not* wait until the timeout kicks in, it&#39;ll<br>

&gt; return as soon as it has acks from all members of the target set. This<br>

&gt; works and is covered with a bunch of unit tests, so a regression would<br>

&gt; have been caught immediately.<br>

<br>

</div>I don&#39;t doubt the &quot;vanilla scenario&quot;, but this is what happens in the<br>

more complex case of the CapeDwarf setup.<br>

<div class="im"><br></div></blockquote><div><br></div><div>My first guess would be that the MuxRpcDispatcher on the second node hasn&#39;t started yet by the time you call castMessage on the first node. It could be that your workaround just delayed the message a little bit, until the MuxRpcDispatcher on the other node actually started (because the JChannel is already started on both nodes, but as long as the MuxRpcDispatcher isn&#39;t started on the 2nd node it won&#39;t send any responses back).<br>


<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="im">

&gt;<br>

&gt; I attached a test program to JGRP-1617 which shows that this feature<br>

&gt; works correctly.<br>

&gt;<br>

&gt; Of course, if you lose an ack (e.g. due to a maxed out incoming / OOB<br>

&gt; thread pool), the unicast protocol will have to retransmit the ack until<br>

&gt; it has been received. Depending on the unicast protocol you use, this<br>

&gt; will be immediate (UNICAST, UNICAST3), or based on a stability interval<br>

&gt; (UNICAST2).<br>

<br>

</div>Right it&#39;s totally possible this is a stack configuration problem in the AS.<br>

I wouldn&#39;t be the best to ask that though, I don&#39;t even understand the<br>

configuration format.<br>

<div class="im"><br></div></blockquote><div><br></div><div>You can get the actual JGroups configuration with channel.getProtocolStack().printProtocolSpecAsXml(), but I wouldn&#39;t expect you to find any surprises there: they should use pretty much the JGroups defaults.<br>


<br></div><div>By default STABLE.desired_avg_gossip is 20s and STABLE.stability_delay is 6s, so even if the message was lost it should take &lt; 30s for the message to be resent.<br><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">


<div class="im">

<br>

&gt;&gt; For the record, the first operation was indeed triggering some lazy<br>

&gt;&gt; initialization of indexes, which in turn would trigger a Lucene<br>

&gt;&gt; Directory being started, triggering 3 Cache starts which in turn would<br>

&gt;&gt; trigger 6 state transfer processes: so indeed the first operation<br>

&gt;&gt; would not be exactly &quot;cheap&quot; performance wise, still this would<br>

&gt;&gt; complete in about 120 milliseconds.<br>

&gt;<br>

&gt; This sounds very low for the work you describe above. I don&#39;t think 6<br>

&gt; state transfers can be completed in 120ms, unless they&#39;re async (but<br>

&gt; then that means they&#39;re not done when you return). Also, cache starts<br>

&gt; (wrt JGroups) will definitely take more than a few seconds if you&#39;re the<br>

&gt; first cluster node...<br>

<br>

</div>It&#39;s a unit test: the caches are initially empty and networking is loopback,<br>

on the second round some ~6 elements are in the cache, no larger than<br>

~10 character strings.<br>

Should be reasonable?<br>

<div class="im"><br></div></blockquote><div><br></div><div>Yes, I think it&#39;s reasonable, if the JChannel was already started before the CD application was deployed. Starting the first JChannel would take at least 3s, which is the default PING.timeout.<br>


<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="im">

&gt;&gt; Not being sure about the options of depending to a newer JGroups<br>

&gt;&gt; release or the complexity of a fix, I&#39;ll implement a workaround in<br>

&gt;&gt; HSearch in the scope of HSEARCH-1296.<br>

&gt;<br>

&gt;<br>

&gt; If you add more information to JGRP-1617, I&#39;ll take a look. This would<br>

&gt; be a critical bug in JGroups *if* you can prove that the<br>

&gt; MessageDispatcher always runs into the timeout (I don&#39;t think you can<br>

&gt; though !).<br>

<br>

</div>Considering the easy workaround and that definitely this needs<br>

something special in the configuration, I wouldn&#39;t consider it too<br>

critical? For as far as we know now, it&#39;s entirely possible the<br>

configuration being used is illegal. But this is exactly where I need<br>

your help ;-)<br>

<div class="im"><br></div></blockquote><div><br></div><div>I&#39;m not sure that your workaround is 100% effective, even if it doesn&#39;t happen in this test it&#39;s always possible to have the app deployed on some of the nodes in the cluster, but not all.<br>


<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="im">

<br>

&gt;&gt; As a lesson learned, I think we need to polish some of our TRACE level<br>

&gt;&gt; messaged to include the cache name: to resolve this we had not just<br>

&gt;&gt; many threads and components but also 4 of them where using JGroups<br>

&gt;&gt; (interleaving messages of all sorts) and 9 different caches where<br>

&gt;&gt; involved for each simple write operation in CD: made it interesting to<br>

&gt;&gt; figure what was going on!<br>

&gt;<br>

&gt; Yes, that would help. In JGroups, I usually log the cluster address of<br>

&gt; the thread that&#39;s writing to the log, so I can differentiate between<br>

&gt; different clusters on the same host.<br>

<br>

</div>Right, thread names and message correlation ids have been extremely useful,<br>

it&#39;s mostly the multiple Infinispan - caches without a name statement<br>

which should be improved as often the same user thread would invoke a<br>

sequence of operations on different caches, so same thread and same<br>

cluster name.. to make things funnier this was interleaving messages<br>

from non-transactional caches with optimistic and pessimistic<br>

messages, even Mircea got confused on WTF was going on at some point<br>

:-)<br></blockquote><div><br></div><div>That&#39;s only because he forgot to change the appender&#39;s pattern layout :)<br><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">


<div class="im"><br>

&gt;<br>

&gt;&gt; Also I&#39;m wondering how hard it would be to<br>

&gt;&gt; have a log parser which converts my 10GB of text log from today in a<br>

&gt;&gt; graphical sequence diagram.<br>

&gt;<br>

&gt; Yes, something like wireshark &quot;follow TCP&quot; feature would be very helpful !<br>

<br>

</div>+100<br>

More on that, a basic initial step would be something which filtered<br>

out all non-related information; for example an option &quot;hide/show<br>

stable messages&quot;, etc..<br>

<br>

On the testcase again: glad to help if I can but please don&#39;t forget<br>

this is by far not my field nor my priority, in working hours at least<br>

;-)<br>

<br>

Cheers,<br>

Sanne<br>

<div class=""><div class="h5"><br>

&gt;<br>

&gt; --<br>

&gt; Bela Ban, JGroups lead (<a href="http://www.jgroups.org" target="_blank">http://www.jgroups.org</a>)<br>

&gt; _______________________________________________<br>

&gt; infinispan-dev mailing list<br>

&gt; <a href="mailto:infinispan-dev@lists.jboss.org">infinispan-dev@lists.jboss.org</a><br>

&gt; <a href="https://lists.jboss.org/mailman/listinfo/infinispan-dev" target="_blank">https://lists.jboss.org/mailman/listinfo/infinispan-dev</a><br>

_______________________________________________<br>

infinispan-dev mailing list<br>

<a href="mailto:infinispan-dev@lists.jboss.org">infinispan-dev@lists.jboss.org</a><br>

<a href="https://lists.jboss.org/mailman/listinfo/infinispan-dev" target="_blank">https://lists.jboss.org/mailman/listinfo/infinispan-dev</a><br>

</div></div></blockquote></div><br></div></div>