On Tue, Oct 2, 2012 at 6:29 PM, Bela Ban <bban@redhat.com> wrote:


On 10/2/12 1:43 PM, Dan Berindei wrote:

>>> test threads are stuck in JGroupsTransport.waitForChannelToConnect so it
>>> looks like a JGroups thing.
>>
>>
>> JGroupsTransport.waitForChannelToConnect() is Infinispan code, not
>> JGroups code. IMO unneeded code, too, as I don't understand why we're
>> waiting on a latch until we get a view after calling JChannel.connect().
>> When JChannel.connect() returns, you're guaranteed to have a view
>> installed, so this code can (should !) be removed, as it makes things
>> unnecessarily complex (and error prone ?).
>>
>
> True, it's Infinispan code - my code, actually ;)


ouch :-)


> The reason behind the latch was probably to avoid concurrent updates of the
> members list from the main thread and from the JGroups thread that calls viewAccepted().


The members should be updated in viewAccepted(). When you connect a
channel, then viewAccepted() will be called *before* JChannel.connect()
returns, so the main thread won't have a chance of accessing members
before that.


Ok, if viewAccepted() is guaranteed to be called before connect() returns then you're right, the latch is superfluous. We only need to call viewAccepted() ourselves if the channel was injected (which we already do).

In this particular case, however, we don't get a viewAccepted() call or an exception from connect(), so I'm not sure it would have been better (easier to debug) without the latch.


> I agree it's probably unnecessary, as we already use synchronization in our viewAccepted() implementation, but I don't see why it shouldn't work.

It's (unnecessarily) complex code that could be simplified; I'm not
suggesting it's incorrect. More complexity == more bugs.


True. I just wasn't sure whether viewAccepted() is guaranteed to be called during connect(), so I added a "safety net".

I know there was a NPE in JGroupsTransport that I thought at the time was caused by a missing viewAccepted() call, but I can't find the stack trace any more so I'm not sure if that was really the case.
 

> We have attached the JGroupsTransport as a membership listener to the
> MessageDispatcher (CommandAwareRpcDispatcher, actually) and we have
> attached the MessageDispatcher as a channel listener to the channel before
> calling connect(), so JGroups should call viewAccepted() for the initial
> view just as it does for every view.


You mean MembershipListener ? Yes, if you create the channel, then the
MessageDispatcher, then connect(), then you'll get the viewAccepted()
callback *before* JChannel.connect() returns.

Unless you deliver the view in a separate thread, but I don't think you
do this, right ?


I'm interested here strictly in the viewAccepted() callback in JGroupsTransport - I don't care how our other components that listen for our JGroups view events get their notifications, they usually all their version of viewAccepted() on startup.


Which version is the stacktrace below with ? 3.0.x ? Taking a look now...


3.2.0.Alpha2
 

> I just got the same test to hang on my machine, and I found this exception
> in the log:
>
> 12:26:25,098 DEBUG (CacheStarter-Cache4:nbst) [GMS]
> exception=java.lang.IllegalStateException: channel is not connected,
> retrying
> java.lang.IllegalStateException: channel is not connected
>      at
> org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:621)
>      at
> org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:535)
>      at
> org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:390)
>      at
> org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:248)
>      at
> org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:604)
>      at org.jgroups.JChannel.up(JChannel.java:715)
>      at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020)
>      at org.jgroups.protocols.FRAG2.up(FRAG2.java:181)
>      at org.jgroups.protocols.FC.up(FC.java:479)
>      at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896)
>      at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:244)
>      at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:432)
>      at org.jgroups.protocols.pbcast.NAKACK2.handleMessage(NAKACK2.java:722)
>      at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:570)
>      at
> org.jgroups.protocols.pbcast.NAKACK2.flushBecomeServerQueue(NAKACK2.java:841)
>      at org.jgroups.protocols.pbcast.NAKACK2.down(NAKACK2.java:490)
>      at org.jgroups.protocols.UNICAST2.down(UNICAST2.java:523)
>      at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:307)
>      at org.jgroups.protocols.pbcast.GMS.installView(GMS.java:637)
>      at
> org.jgroups.protocols.pbcast.ClientGmsImpl.installView(ClientGmsImpl.java:248)
>      at
> org.jgroups.protocols.pbcast.ClientGmsImpl.joinInternal(ClientGmsImpl.java:182)
>      at
> org.jgroups.protocols.pbcast.ClientGmsImpl.join(ClientGmsImpl.java:37)
>      at org.jgroups.protocols.pbcast.GMS.down(GMS.java:938)
>      at org.jgroups.protocols.FC.down(FC.java:435)
>      at org.jgroups.protocols.FRAG2.down(FRAG2.java:147)
>      at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:1025)
>      at org.jgroups.JChannel.down(JChannel.java:729)
>      at org.jgroups.JChannel.connect(JChannel.java:291)
>      at org.jgroups.JChannel.connect(JChannel.java:262)
>      at
> org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded(JGroupsTransport.java:206)
>      at
> org.infinispan.remoting.transport.jgroups.JGroupsTransport.start(JGroupsTransport.java:197)
>      at sun.reflect.GeneratedMethodAccessor113.invoke(Unknown Source)
>      at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>      at java.lang.reflect.Method.invoke(Method.java:601)
>      at
> org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:236)
>      at
> org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:900)
>      at
> org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:650)
>      at
> org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:639)
>      at
> org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:542)
>      at
> org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:218)
>      at
> org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:680)
>      at
> org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:652)
>      at
> org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:548)
>      at
> org.infinispan.statetransfer.JoiningNode.getCache(JoiningNode.java:54)
>      at
> org.infinispan.statetransfer.StateTransferFunctionalTest$2.run(StateTransferFunctionalTest.java:234)
>      at java.lang.Thread.run(Thread.java:722)


--
Bela Ban, JGroups lead (http://www.jgroups.org)
_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev