<div class="gmail_quote">On Tue, Oct 2, 2012 at 12:41 PM, Bela Ban <span dir="ltr"><<a href="mailto:bban@redhat.com" target="_blank">bban@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im"><br>
<br>
On 10/2/12 11:27 AM, Dan Berindei wrote:<br>
> Hey Sanne<br>
><br>
> Funny, Tristan just pinged me about the same problem in<br>
> StateTransferFunctionalTest...<br>
><br>
> I haven't reproduced it yet, but it could be because I don't have the very<br>
> latest updates. At first I thought it was related to<br>
> <a href="https://issues.jboss.org/browse/ISPN-2357" target="_blank">https://issues.jboss.org/browse/ISPN-2357</a>, but then in your thread dump all<br>
> test threads are stuck in JGroupsTransport.waitForChannelToConnect so it<br>
> looks like a JGroups thing.<br>
<br>
<br>
</div>JGroupsTransport.waitForChannelToConnect() is Infinispan code, not<br>
JGroups code. IMO unneeded code, too, as I don't understand why we're<br>
waiting on a latch until we get a view after calling JChannel.connect().<br>
When JChannel.connect() returns, you're guaranteed to have a view<br>
installed, so this code can (should !) be removed, as it makes things<br>
unnecessarily complex (and error prone ?).<br>
</blockquote><div><br>True, it's Infinispan code - my code, actually ;)<br><br>The reason behind the latch was probably to avoid concurrent updates of the members list from the main thread and from the JGroups thread that calls viewAccepted(). I agree it's probably unnecessary, as we already use synchronization in our viewAccepted() implementation, but I don't see why it shouldn't work.<br>
<br>We attach the JGroupsTransport as a membership listener to the MessageDispatcher (a CommandAwareRpcDispatcher, actually), and we register the MessageDispatcher as a channel listener on the channel before calling connect(), so JGroups should call viewAccepted() for the initial view just as it does for every other view.<br>
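<br>To make the pattern concrete, here is a minimal sketch of the latch approach described above. The names LatchedTransportSketch and demo() are hypothetical, no JGroups classes are used, and the timeout on the wait is an assumption added for illustration; this is not the actual JGroupsTransport code.<br>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a transport; NOT the actual JGroupsTransport source.
class LatchedTransportSketch {
    // Released the first time a view is delivered.
    private final CountDownLatch firstView = new CountDownLatch(1);
    private List<String> members = Collections.emptyList();

    // Called by the (simulated) group-communication thread for every view,
    // including the initial one.
    public void viewAccepted(List<String> newMembers) {
        synchronized (this) {
            members = Collections.unmodifiableList(new ArrayList<>(newMembers));
        }
        firstView.countDown(); // has no effect after the first call
    }

    // Called by the starting thread after connect().
    // Returns false on timeout or interrupt instead of blocking forever.
    public boolean waitForChannelToConnect(long timeout, TimeUnit unit) {
        try {
            return firstView.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public synchronized List<String> getMembers() {
        return members;
    }

    // Simulates a connect: another thread delivers the initial view
    // while the caller waits on the latch.
    static boolean demo() {
        LatchedTransportSketch t = new LatchedTransportSketch();
        Thread callback = new Thread(() -> t.viewAccepted(Arrays.asList("A", "B")));
        callback.start();
        return t.waitForChannelToConnect(5, TimeUnit.SECONDS)
                && t.getMembers().size() == 2;
    }
}
```

<br>In this sketch, if viewAccepted() is never invoked, the waiting thread is released only by the timeout, which is the kind of situation that would show up as a thread parked in the wait method.<br>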
<br>I just got the same test to hang on my machine, and I found this exception in the log:<br><br>12:26:25,098 DEBUG (CacheStarter-Cache4:nbst) [GMS] exception=java.lang.IllegalStateException: channel is not connected, retrying<br>
java.lang.IllegalStateException: channel is not connected<br> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:621)<br> at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:535)<br>
at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:390)<br> at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:248)<br> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:604)<br>
at org.jgroups.JChannel.up(JChannel.java:715)<br> at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020)<br> at org.jgroups.protocols.FRAG2.up(FRAG2.java:181)<br> at org.jgroups.protocols.FC.up(FC.java:479)<br>
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896)<br> at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:244)<br> at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:432)<br> at org.jgroups.protocols.pbcast.NAKACK2.handleMessage(NAKACK2.java:722)<br>
at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:570)<br> at org.jgroups.protocols.pbcast.NAKACK2.flushBecomeServerQueue(NAKACK2.java:841)<br> at org.jgroups.protocols.pbcast.NAKACK2.down(NAKACK2.java:490)<br>
at org.jgroups.protocols.UNICAST2.down(UNICAST2.java:523)<br> at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:307)<br> at org.jgroups.protocols.pbcast.GMS.installView(GMS.java:637)<br> at org.jgroups.protocols.pbcast.ClientGmsImpl.installView(ClientGmsImpl.java:248)<br>
at org.jgroups.protocols.pbcast.ClientGmsImpl.joinInternal(ClientGmsImpl.java:182)<br> at org.jgroups.protocols.pbcast.ClientGmsImpl.join(ClientGmsImpl.java:37)<br> at org.jgroups.protocols.pbcast.GMS.down(GMS.java:938)<br>
at org.jgroups.protocols.FC.down(FC.java:435)<br> at org.jgroups.protocols.FRAG2.down(FRAG2.java:147)<br> at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:1025)<br> at org.jgroups.JChannel.down(JChannel.java:729)<br>
at org.jgroups.JChannel.connect(JChannel.java:291)<br> at org.jgroups.JChannel.connect(JChannel.java:262)<br> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded(JGroupsTransport.java:206)<br>
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.start(JGroupsTransport.java:197)<br> at sun.reflect.GeneratedMethodAccessor113.invoke(Unknown Source)<br> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)<br>
at java.lang.reflect.Method.invoke(Method.java:601)<br> at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:236)<br> at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:900)<br>
at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:650)<br> at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:639)<br>
at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:542)<br> at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:218)<br> at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:680)<br>
at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:652)<br> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:548)<br> at org.infinispan.statetransfer.JoiningNode.getCache(JoiningNode.java:54)<br>
at org.infinispan.statetransfer.StateTransferFunctionalTest$2.run(StateTransferFunctionalTest.java:234)<br> at java.lang.Thread.run(Thread.java:722)<br><br><br>Cheers<br>Dan<br><br></div></div>