[
https://jira.jboss.org/jira/browse/JGRP-680?page=com.atlassian.jira.plugi...
]
Edward Kuns commented on JGRP-680:
----------------------------------
Unfortunately, I cannot use the receive_interfaces property without requiring
configuration that is both undesirable and not entirely stable. Much of the point of
using JGroups is for this application to auto-discover other cluster members on the
subnet, which means avoiding any need to configure which IP addresses or network
devices to listen on.
In some environments, this application has been deployed on multi-NIC machines.
I don't want to outright ban deployment on multi-NIC machines.
However, if two dual-NIC servers have different NIC binding orders (which can occur for
many reasons) and you don't tell JGroups to listen on all interfaces, you risk having
cluster members not find one another, as they are talking on different subnets. This is
because JGroups defaults to listening or sending only on the very first non-loopback
interface as enumerated by Java. The order of interfaces in this list is not reliably
controllable on a Windows server with more than one NIC. Disabling and re-enabling NICs
seems to be able to change the order in which NICs are enumerated by Java. Does this
change assignment of eth0, eth1, on Windows? I'm not certain. But it definitely
changes which NIC JGroups will bind to if it is left to the default.
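To see the order in question on a given machine, something like the following can be run. It is only a minimal sketch using the standard java.net.NetworkInterface API; the class and method names (InterfaceOrder, nonLoopbackInOrder) are made up for illustration, and it does not reproduce the selection logic inside JGroups itself.

```java
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of JGroups: shows the order in which
// Java enumerates the non-loopback interfaces on this machine.
class InterfaceOrder {

    /** Returns one entry per non-loopback interface, in enumeration order. */
    static List<String> nonLoopbackInOrder() throws SocketException {
        List<String> found = new ArrayList<>();
        Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
        if (all == null) {
            // Some Java versions return null when no interfaces exist.
            return found;
        }
        for (NetworkInterface nic : Collections.list(all)) {
            if (nic.isLoopback()) {
                continue;
            }
            // Count the addresses bound to this interface; zero is possible.
            int addresses = Collections.list(nic.getInetAddresses()).size();
            found.add(nic.getName() + " (" + addresses + " address(es))");
        }
        return found;
    }

    public static void main(String[] args) throws SocketException {
        List<String> nics = nonLoopbackInOrder();
        for (int i = 0; i < nics.size(); i++) {
            System.out.println(i + ": " + nics.get(i));
        }
    }
}
```

Running this before and after disabling and re-enabling a NIC is one way to check whether the first entry changes.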
Thus my current compromise: "listen on all interfaces but only send on the first
interface found in the binding order."
I strongly want to avoid asking people to configure specific network interfaces by
name or IP, and this shouldn't be necessary, IMO. This is in part because
where multiple servers are clustered together, they will usually have identical
configurations.
receive_on_all_interfaces requires every NIC to be configured
-------------------------------------------------------------
Key: JGRP-680
URL:
https://jira.jboss.org/jira/browse/JGRP-680
Project: JGroups
Issue Type: Bug
Affects Versions: 2.6.1
Environment: Windows Server 2003
Reporter: Edward Kuns
Assignee: Vladimir Blagojevic
Fix For: 2.7
In UDP with receive_on_all_interfaces="true", if any interface is not
configured -- say, the network cable is unplugged -- then JGroups will refuse to start by
throwing an exception (IP addresses changed to protect the innocent):
Caused by: org.jgroups.ChannelException: failed to start protocol stack
at org.jgroups.JChannel.startStack(JChannel.java:1445)
at org.jgroups.JChannel.connect(JChannel.java:356)
at org.jgroups.blocks.NotificationBus.start(NotificationBus.java:126)
at the application making use of JGroups
Caused by: java.lang.Exception: problem creating sockets (bind_addr=xxxxxxxx/11.22.33.44, mcast_addr=224.1.2.3:4444)
at org.jgroups.protocols.UDP.start(UDP.java:363)
at org.jgroups.stack.Configurator.startProtocolStack(Configurator.java:75)
at org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:301)
at org.jgroups.JChannel.startStack(JChannel.java:1442)
... 9 more
Caused by: java.net.SocketException: bad argument for IP_MULTICAST_IF2: No IP addresses bound to interface
at java.net.PlainDatagramSocketImpl.join(Ljava.net.InetAddress;Ljava.net.NetworkInterface;)V(Native Method)
at java.net.PlainDatagramSocketImpl.joinGroup(PlainDatagramSocketImpl.java:196)
at java.net.MulticastSocket.joinGroup(MulticastSocket.java:357)
at org.jgroups.protocols.UDP.bindToInterfaces(UDP.java:525)
at org.jgroups.protocols.UDP.createSockets(UDP.java:470)
at org.jgroups.protocols.UDP.start(UDP.java:359)
... 12 more
Simply taking every NIC whose status is "network cable unplugged" and disabling
that interface allows JGroups to start.
This may be a feature request and not a bug report, but this seems unnecessarily strict.
With the setting of receive_on_all_interfaces="true", if at least one interface
comes up, then that should be enough for JGroups to function.
In the application where I encountered this, I have both
receive_on_all_interfaces="true" and
send_on_all_interfaces="false". The reason is that if send_on_all_interfaces is
true but JGroups fails to send a message, there is no notification of this
failure. (That is, if the message cannot be sent on any interface at all.) By sending on
one interface but receiving on all, I appear to get the best of both worlds: a cluster
of machines with multiple NICs should be able to communicate with one another no matter
what the binding order of the NICs and no matter in what order Java presents the interfaces.
Except that receive_on_all_interfaces="true" requires EVERY NIC that is not
disabled to be perfectly functioning, or you cannot do anything at all. Unplug
one NIC cable and the application cannot function, despite the fact that
there's another network on which the clustered machines are all available.
Preferred behavior: If ** at least one ** interface successfully opens a socket when
receive_on_all_interfaces="true", then succeed. If ** all ** interfaces fail
as shown above, then throw the exception shown above.
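As a rough illustration of the preferred behavior, here is a minimal sketch written against the standard java.net.MulticastSocket API. It is not the JGroups code from UDP.bindToInterfaces; the class and method names (LenientJoin, joinWherePossible) are hypothetical and exist only for this example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JGroups: joins a multicast group on
// every interface that accepts the join, and fails only if none does.
class LenientJoin {

    /**
     * Tries to join the group on each candidate interface.
     * Returns the interfaces on which the join succeeded.
     * Throws IOException only if the join failed on all of them.
     */
    static List<NetworkInterface> joinWherePossible(MulticastSocket sock,
                                                    InetSocketAddress group,
                                                    List<NetworkInterface> candidates)
            throws IOException {
        List<NetworkInterface> joined = new ArrayList<>();
        IOException last = null;
        for (NetworkInterface nic : candidates) {
            try {
                sock.joinGroup(group, nic);
                joined.add(nic);
            } catch (IOException e) {
                // For example "No IP addresses bound to interface":
                // remember the failure and move on to the next interface.
                last = e;
            }
        }
        if (joined.isEmpty()) {
            throw new IOException("could not join " + group + " on any interface", last);
        }
        return joined;
    }
}
```

A caller would pass the socket, the group address (224.1.2.3:4444 in the trace above), and the list of interfaces to try. Logging each skipped interface would be a sensible addition so that a partially working setup does not go unnoticed.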