[jboss-jira] [JBoss JIRA] Commented: (JGRP-680) receive_on_all_interfaces requires every NIC to be configured

Edward Kuns (JIRA) jira-events at lists.jboss.org
Wed Jul 16 20:23:52 EDT 2008


    [ https://jira.jboss.org/jira/browse/JGRP-680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12421616#action_12421616 ] 

Edward Kuns commented on JGRP-680:
----------------------------------

Unfortunately, I cannot use the receive_interfaces property without requiring configuration that is both undesirable and not entirely stable.  Much of the point of using JGroups is for this application to auto-discover other cluster members on the subnet, so I want to avoid having to configure which IP addresses or network devices to listen on.

In some environments, this application has been deployed on multi-NIC machines.  I don't want to ban deployment on multi-NIC machines outright.  However, if two dual-NIC servers have different NIC binding orders (which can occur for many reasons) and you don't tell JGroups to listen on all interfaces, cluster members risk not finding one another, because they are talking on different subnets.  This happens because JGroups defaults to listening and sending only on the very first non-loopback interface as enumerated by Java.  The order of interfaces in this list is not reliably controllable on a Windows server with more than one NIC.  Disabling and re-enabling NICs seems to be able to change the order in which Java enumerates them.  Does this change the assignment of eth0, eth1, and so on, on Windows?  I'm not certain.  But it definitely changes which NIC JGroups will bind to if it is left to the default.
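To check what a given server would pick by default, something like the following sketch can print the order in which Java enumerates its interfaces.  This is not JGroups code: the class name InterfaceOrder and its method are made up for illustration, and it assumes only the standard java.net.NetworkInterface API.

```java
import java.io.UncheckedIOException;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Illustrative only: InterfaceOrder is a hypothetical name, not a JGroups class.
class InterfaceOrder {

    /** Names of the non-loopback interfaces, in the order Java enumerates them. */
    static List<String> nonLoopbackInOrder() {
        List<String> names = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
            if (all == null) {
                return names; // older JVMs may return null when nothing is found
            }
            for (NetworkInterface nic : Collections.list(all)) {
                if (!nic.isLoopback()) {
                    names.add(nic.getName());
                }
            }
        } catch (SocketException e) {
            // Wrap the checked exception so callers need no throws clause.
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = nonLoopbackInOrder();
        System.out.println("Enumeration order: " + names);
        if (!names.isEmpty()) {
            System.out.println("First non-loopback interface: " + names.get(0));
        }
    }
}
```

Running this on two servers, or before and after disabling and re-enabling a NIC, should show whether the first entry differs between them.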

Thus my current compromise: listen on all interfaces, but send only on the first interface found in the binding order.

I strongly want to avoid asking people to configure specific network interfaces by name or IP, and that shouldn't be necessary, IMO.  This is in part because when multiple servers are clustered together, they will usually have identical configurations.


> receive_on_all_interfaces requires every NIC to be configured
> -------------------------------------------------------------
>
>                 Key: JGRP-680
>                 URL: https://jira.jboss.org/jira/browse/JGRP-680
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.6.1
>         Environment: Windows Server 2003
>            Reporter: Edward Kuns
>            Assignee: Vladimir Blagojevic
>             Fix For: 2.7
>
>
> In UDP with receive_on_all_interfaces="true", if any interface is not configured -- say, the network cable is unplugged -- then JGroups will refuse to start by throwing an exception (IP addresses changed to protect the innocent):
> Caused by: org.jgroups.ChannelException: failed to start protocol stack
> 	at org.jgroups.JChannel.startStack(JChannel.java:1445)
> 	at org.jgroups.JChannel.connect(JChannel.java:356)
> 	at org.jgroups.blocks.NotificationBus.start(NotificationBus.java:126)
> 	at  the application making use of JGroups
> Caused by: java.lang.Exception: problem creating sockets (bind_addr=xxxxxxxx/11.22.33.44, mcast_addr=224.1.2.3:4444)
> 	at org.jgroups.protocols.UDP.start(UDP.java:363)
> 	at org.jgroups.stack.Configurator.startProtocolStack(Configurator.java:75)
> 	at org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:301)
> 	at org.jgroups.JChannel.startStack(JChannel.java:1442)
> 	... 9 more
> Caused by: java.net.SocketException: bad argument for IP_MULTICAST_IF2: No IP addresses bound to interface
> 	at java.net.PlainDatagramSocketImpl.join(Ljava.net.InetAddress;Ljava.net.NetworkInterface;)V(Native Method)
> 	at java.net.PlainDatagramSocketImpl.joinGroup(PlainDatagramSocketImpl.java:196)
> 	at java.net.MulticastSocket.joinGroup(MulticastSocket.java:357)
> 	at org.jgroups.protocols.UDP.bindToInterfaces(UDP.java:525)
> 	at org.jgroups.protocols.UDP.createSockets(UDP.java:470)
> 	at org.jgroups.protocols.UDP.start(UDP.java:359)
> 	... 12 more
> Simply taking every NIC whose status is "network cable unplugged" and disabling that interface allows JGroups to start.
> This may be a feature request and not a bug report, but this seems unnecessarily strict.  With the setting of  receive_on_all_interfaces="true", if at least one interface comes up, then that should be enough for JGroups to function.
> In the application where I encountered this, I have both receive_on_all_interfaces="true" and send_on_all_interfaces="false".  The reason is that if send_on_all_interfaces is true but JGroups fails to send a message, there is no notification of this failure.  (That is, if the message cannot be sent on any interface at all.)  By sending on one interface but receiving on all, I appear to get the best of all worlds, where a cluster of machines with multiple NICs should be able to communicate with one another no matter what the binding order of the NICs and no matter what order Java presents the interfaces.
> Except that receive_on_all_interfaces="true" requires EVERY NIC that is not disabled to be perfectly functioning or you cannot do anything at all.  Which means unplug one NIC cable and now the application cannot function at all, despite the fact that there's another network on which the clustered machines are all available.
> Preferred behavior:  If  ** at least one ** interface successfully opens a socket when receive_on_all_interfaces="true", then succeed.  If  ** all ** interfaces fail as shown above, then throw the exception shown above.
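For illustration, the preferred behavior quoted above could look roughly like the sketch below.  It is not the JGroups implementation or a proposed patch: LenientJoin, Joiner, joinAtLeastOne and joinOnAllInterfaces are hypothetical names, and the only real API assumed is MulticastSocket.joinGroup(SocketAddress, NetworkInterface) from the JDK.  A failure on one interface is logged and skipped; an exception is thrown only if every interface fails.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Illustrative only: all names in this class are hypothetical, not JGroups code.
class LenientJoin {

    /** One join attempt against a single target (for example, one NIC). */
    interface Joiner<T> {
        void join(T target) throws IOException;
    }

    /**
     * Tries every target and returns those that joined successfully.
     * Throws only when no target could be joined at all.
     */
    static <T> List<T> joinAtLeastOne(List<T> targets, Joiner<T> joiner) {
        List<T> joined = new ArrayList<>();
        IOException lastFailure = null;
        for (T target : targets) {
            try {
                joiner.join(target);
                joined.add(target);
            } catch (IOException e) {
                // e.g. "No IP addresses bound to interface" on an unplugged NIC
                System.err.println("skipping " + target + ": " + e.getMessage());
                lastFailure = e;
            }
        }
        if (joined.isEmpty()) {
            throw new IllegalStateException("could not join on any interface", lastFailure);
        }
        return joined;
    }

    /** Applies the policy to a real multicast socket and group address. */
    static List<NetworkInterface> joinOnAllInterfaces(MulticastSocket sock,
                                                      InetSocketAddress group)
            throws SocketException {
        Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
        List<NetworkInterface> nics = new ArrayList<>();
        if (all != null) {
            nics.addAll(Collections.list(all));
        }
        return joinAtLeastOne(nics, nic -> sock.joinGroup(group, nic));
    }
}
```

Keeping the "at least one must succeed" policy separate from the socket call makes it possible to exercise the policy with fake interfaces, without unplugging any cables.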
