[jboss-dev] Flush failed is back in jgroups

Brian Stansberry brian.stansberry at redhat.com
Wed Mar 19 11:43:57 EDT 2008


The JGroups changes I was talking about didn't happen; code hasn't 
changed for months.

Adrian, any more info you can provide on what you're seeing and your 
environment?  The "GMS: address is 127.0.0.1:32774" logging in your 
initial post tells me the correct jgroups.bind_addr value is getting to 
JGroups.
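
For anyone wanting to check this locally, here is a minimal sketch (not 
taken from the AS or JGroups code) that reads the jgroups.bind_addr 
system property and reports whether the interface owning that address is 
up and flagged as multicast-capable. The class name BindAddrCheck and its 
methods are made up for illustration, and falling back to 127.0.0.1 when 
the property is unset is an assumption. Note that multicast=true only 
means the interface advertises the capability; it does not prove the 
channel will actually receive its own messages.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.NetworkInterface;

/** Hypothetical diagnostic helper; not part of JGroups or JBoss AS. */
final class BindAddrCheck {

    /** Returns the jgroups.bind_addr system property, or the fallback when it is unset. */
    static String configuredBindAddr(String fallback) {
        return System.getProperty("jgroups.bind_addr", fallback);
    }

    /** Describes the local interface that owns the given address, if any. */
    static String describe(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            NetworkInterface nic = NetworkInterface.getByInetAddress(addr);
            if (nic == null) {
                return host + " is not assigned to any local interface";
            }
            return host + " -> " + nic.getName()
                    + " up=" + nic.isUp()
                    + " loopback=" + nic.isLoopback()
                    + " multicast=" + nic.supportsMulticast();
        } catch (IOException e) {
            // Covers UnknownHostException and SocketException.
            return host + " could not be inspected: " + e;
        }
    }

    public static void main(String[] args) {
        // Assumption: fall back to 127.0.0.1 when the property is not set.
        System.out.println(describe(configuredBindAddr("127.0.0.1")));
    }
}
```

Running it with -Djgroups.bind_addr=<address> shows which interface that 
address maps to on the machine in question.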

Brian Stansberry wrote:
> org.jboss.Main sets the jgroups.bind_addr system property to 127.0.0.1 
> by default. (I think you did that. ;))
> 
> There have been some changes in the way JGroups creates sockets (to 
> work around the problem discussed at 
> http://wiki.jboss.org/wiki/Wiki.jsp?page=PromiscuousTraffic).  I'll have 
> a look to see if somehow that's causing the jgroups.bind_addr setting to 
> be bypassed.
> 
> Adrian Brock wrote:
>> The current jgroups is trying to send UDP requests over my NIC (which 
>> doesn't work because it isn't up).
>>
>> It should be bound to only "localhost" by default.
>>
>> On Tue, 2008-03-18 at 17:18 -0500, Brian Stansberry wrote:
>>> Adrian: Is multicast working on your loopback interface?  The 
>>> problems are symptoms of the channel not receiving back its own 
>>> messages.
>>>
>>> I'm going to change the AS's JGroups configs to loopback their own 
>>> messages to themselves rather than relying on getting them back from 
>>> the network.  http://jira.jboss.com/jira/browse/JBAS-5323  That will 
>>> make this strange behavior go away (although the node of course will 
>>> not be able to communicate with other nodes).
>>>
>>> Clebert: the NPE here is because with multicast not working, the 
>>> MuxChannel is not connecting correctly. So you don't get the view. 
>>> It's basically a broken channel; IMO connect() should not have 
>>> returned properly but rather a ChannelException should have been 
>>> thrown. This is an example of why we want to move away from the 
>>> multiplexer and toward shared transport channels. The code that 
>>> prevented the view being received is part of the multiplexer layer 
>>> that will no longer be used in the AS.
>>>
>>> Clebert Suconic wrote:
>>>>  >
>>>>  > I notice the NPE still exists in JBoss Messaging as well.
>>>>
>>>>
>>>> I believe the NPE you're mentioning as solved is a different NPE, in 
>>>> a different case.
>>>>
>>>>
>>>> I am currently dealing with this exception in our testsuite, as we 
>>>> are updating JGroups on our development branch (STABLE).
>>>>
>>>> There is some race condition between channel.connect() and when the 
>>>> View is sent. With JGroups 2.4, and the stack we are using at JBM, 
>>>> the view would always be sent while channel.connect() is called 
>>>> (synchronized). It looks like this has changed in 2.6. (Probably 
>>>> another thread sending the view... or the Stack is different).
>>>>
>>>> In my tests I assumed this was because we didn't have the flush 
>>>> protocol, but I can see that Flush is being used in the Multiplexer 
>>>> channels.
>>>>
>>>> I am doing some debugging right now, and I will post an update as 
>>>> soon as we have solved this.
>>>>
>>>>
>>>> Clebert
>>>>
>>>> Adrian Brock wrote:
>>>>> I thought this had been fixed?
>>>>>
>>>>> I notice the NPE still exists in JBoss Messaging as well.
>>>>>
>>>>>
>>>>> 13:47:12,943 INFO  [STDOUT] 
>>>>> -------------------------------------------------------
>>>>> GMS: address is 127.0.0.1:32774
>>>>> -------------------------------------------------------
>>>>> 13:47:27,515 WARN  [MuxChannel] Flush failed at 127.0.0.1:32774:DefaultPartition-JMS-CTRL
>>>>> 13:47:29,519 WARN  [Multiplexer] failed to collect all service ACKs (1) for [dst: <null>, src: 127.0.0.1:32774 (4 headers), size=0 bytes] after 2000ms, missing ACKs from [127.0.0.1:32774] (received=[]), local_addr=127.0.0.1:32774
>>>>> 13:47:34,520 WARN  [JChannel] Timeout waiting for UNBLOCK event at 127.0.0.1:32774
>>>>> 13:47:34,525 ERROR [ExceptionUtil] org.jboss.messaging.core.jmx.MessagingPostOfficeService@197e6dc startService
>>>>> java.lang.NullPointerException
>>>>>         at org.jboss.messaging.core.impl.postoffice.GroupMember.start(GroupMember.java:160)
>>>>>         at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.start(MessagingPostOffice.java:347)
>>>>>         at org.jboss.messaging.core.jmx.MessagingPostOfficeService.startService(MessagingPostOfficeService.java:427)
>>>>>         at org.jboss.system.ServiceMBeanSupport.jbossInternalStart(ServiceMBeanSupport.java:299)
>>>>>         at org.jboss.system.ServiceMBeanSupport.start(ServiceMBeanSupport.java:196)
>>>>>
>>>>>
>>>> _______________________________________________
>>>> jboss-development mailing list
>>>> jboss-development at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/jboss-development
> 
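
On the point quoted above that connect() should fail rather than hand 
back a channel with no view: until that changes, a caller could guard 
against the null view itself. The following is only a sketch under 
assumptions. ViewGuard and awaitView are hypothetical names, not JGroups 
or JBoss Messaging API, and the Supplier stands in for whatever call 
returns the channel's current view. It polls for a view for a bounded 
time and throws a descriptive exception instead of letting a null 
propagate into an NPE later.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical guard; illustrative only, not JGroups or JBoss Messaging API. */
final class ViewGuard {

    /**
     * Polls viewSource until it returns a non-null value or timeoutMs elapses.
     * Throws IllegalStateException rather than returning null.
     */
    static <V> V awaitView(Supplier<V> viewSource, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            V view = viewSource.get();
            if (view != null) {
                return view;
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException(
                        "no view received within " + timeoutMs + " ms after connect()");
            }
            try {
                Thread.sleep(10); // short pause between polls
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a view", e);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in source that already has a view available.
        System.out.println(awaitView(() -> "view-1", 100));
    }
}
```

Polling is the simplest thing that works for a sketch; a listener-based 
wait would avoid the sleep but depends on the callback API in use.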

-- 
Brian Stansberry
Lead, AS Clustering
JBoss, a division of Red Hat
brian.stansberry at redhat.com


