[jboss-dev] Flush failed is back in jgroups
Brian Stansberry
brian.stansberry at redhat.com
Wed Mar 19 11:43:57 EDT 2008
The JGroups changes I was talking about didn't happen; code hasn't
changed for months.
Adrian, any more info you can provide on what you're seeing and your
environment? The "GMS: address is 127.0.0.1:32774" logging in your
initial post tells me the correct jgroups.bind_addr value is getting to
JGroups.
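
If it helps narrow things down, here is a minimal sketch (not anything that ships with the AS or JGroups) that reports which local interface the jgroups.bind_addr value maps to and whether that interface is up and flagged as multicast-capable. The class name BindAddrCheck and the fallback to 127.0.0.1 when the property is unset are assumptions made for illustration; only standard java.net calls are used.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper; not part of JBoss AS or JGroups.
public class BindAddrCheck {

    // Resolves the configured value; falls back to 127.0.0.1 (an assumption
    // for this sketch) when the value is null or blank.
    static InetAddress resolveBindAddress(String value) throws UnknownHostException {
        String host = (value == null || value.trim().isEmpty()) ? "127.0.0.1" : value.trim();
        return InetAddress.getByName(host);
    }

    // Describes the interface that owns the address, including its up,
    // loopback and multicast flags as reported by the JDK.
    static String describe(InetAddress addr) throws SocketException {
        NetworkInterface nic = NetworkInterface.getByInetAddress(addr);
        if (nic == null) {
            return addr.getHostAddress() + " is not assigned to any local interface";
        }
        return addr.getHostAddress() + " -> " + nic.getName()
                + " up=" + nic.isUp()
                + " loopback=" + nic.isLoopback()
                + " multicast=" + nic.supportsMulticast();
    }

    public static void main(String[] args) throws Exception {
        InetAddress addr = resolveBindAddress(System.getProperty("jgroups.bind_addr"));
        System.out.println(describe(addr));
    }
}
```

Running it with the same -Djgroups.bind_addr value the server gets should show whether the loopback interface advertises multicast support. Note that the flag only reflects the interface setting; it does not prove that a multicast packet sent on that interface is actually delivered back to the sender.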
Brian Stansberry wrote:
> org.jboss.Main sets the jgroups.bind_addr system property to 127.0.0.1
> by default. (I think you did that. ;))
>
> There have been some changes in the way JGroups creates sockets (to
> work around the problem discussed at
> http://wiki.jboss.org/wiki/Wiki.jsp?page=PromiscuousTraffic). I'll have
> a look to see if somehow that's causing the jgroups.bind_addr setting to
> be bypassed.
>
> Adrian Brock wrote:
>> The current JGroups is trying to send UDP requests over my NIC (which
>> doesn't work because it isn't up).
>>
>> It should be bound to only "localhost" by default.
>>
>> On Tue, 2008-03-18 at 17:18 -0500, Brian Stansberry wrote:
>>> Adrian: Is multicast working on your loopback interface? The
>>> problems are symptoms of the channel not receiving back its own
>>> messages.
>>>
>>> I'm going to change the AS's JGroups configs to loopback their own
>>> messages to themselves rather than relying on getting them back from
>>> the network. http://jira.jboss.com/jira/browse/JBAS-5323 That will
>>> make this strange behavior go away (although the node of course will
>>> not be able to communicate with other nodes).
>>>
>>> Clebert: the NPE here is because with multicast not working, the
>>> MuxChannel is not connecting correctly. So you don't get the view.
>>> It's basically a broken channel; IMO connect() should not have
>>> returned properly but rather a ChannelException should have been
>>> thrown. This is an example of why we want to move away from the
>>> multiplexer and toward shared transport channels. The code that
>>> prevented the view being received is part of the multiplexer layer
>>> that will no longer be used in the AS.
>>>
>>> Clebert Suconic wrote:
>>>> >
>>>> > I notice the NPE still exists in JBoss Messaging as well.
>>>>
>>>>
>>>> I believe the solved NPE you're mentioning is a different NPE, in a
>>>> different case.
>>>>
>>>>
>>>> I am currently dealing with this exception in our testsuite, as we
>>>> are updating JGroups on our development branch. (STABLE)
>>>>
>>>> There is some race condition between channel.connect() and when the
>>>> View is sent. With JGroups 2.4, and the stack we are using in JBM,
>>>> the view would always be sent while channel.connect() is called
>>>> (synchronized). It looks like this has changed in 2.6. (Probably
>>>> another thread sending the view... or the Stack is different).
>>>>
>>>> In my tests I assumed this was because we didn't have the flush
>>>> protocol, but I can see that Flush is being used in the Multiplexer
>>>> channels.
>>>>
>>>> I am doing some debugging right now, and I will post an update as
>>>> soon as we have solved this.
>>>>
>>>>
>>>> Clebert
>>>>
>>>> Adrian Brock wrote:
>>>>> I thought this had been fixed?
>>>>>
>>>>> I notice the NPE still exists in JBoss Messaging as well.
>>>>>
>>>>>
>>>>> 13:47:12,943 INFO [STDOUT]
>>>>> -------------------------------------------------------
>>>>> GMS: address is 127.0.0.1:32774
>>>>> -------------------------------------------------------
>>>>> 13:47:27,515 WARN [MuxChannel] Flush failed at 127.0.0.1:32774:DefaultPartition-JMS-CTRL
>>>>> 13:47:29,519 WARN [Multiplexer] failed to collect all service ACKs (1) for [dst: <null>, src: 127.0.0.1:32774 (4 headers), size=0 bytes] after 2000ms, missing ACKs from [127.0.0.1:32774] (received=[]), local_addr=127.0.0.1:32774
>>>>> 13:47:34,520 WARN [JChannel] Timeout waiting for UNBLOCK event at 127.0.0.1:32774
>>>>> 13:47:34,525 ERROR [ExceptionUtil] org.jboss.messaging.core.jmx.MessagingPostOfficeService@197e6dc startService
>>>>> java.lang.NullPointerException
>>>>> at org.jboss.messaging.core.impl.postoffice.GroupMember.start(GroupMember.java:160)
>>>>> at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.start(MessagingPostOffice.java:347)
>>>>> at org.jboss.messaging.core.jmx.MessagingPostOfficeService.startService(MessagingPostOfficeService.java:427)
>>>>> at org.jboss.system.ServiceMBeanSupport.jbossInternalStart(ServiceMBeanSupport.java:299)
>>>>> at org.jboss.system.ServiceMBeanSupport.start(ServiceMBeanSupport.java:196)
>>>>>
>
--
Brian Stansberry
Lead, AS Clustering
JBoss, a division of Red Hat
brian.stansberry at redhat.com