[mod_cluster-issues] [JBoss JIRA] (MODCLUSTER-487) Default AdvertiseBindAddress value should not be NULL (UDP Multicast on Linux systems with more NICs)

Bogdan Sikora (JIRA) issues at jboss.org
Thu May 26 05:00:00 EDT 2016


    [ https://issues.jboss.org/browse/MODCLUSTER-487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243028#comment-13243028 ] 

Bogdan Sikora commented on MODCLUSTER-487:
------------------------------------------

There are warnings in the EAP log when using a link-local (ffx2) multicast address:
{noformat}
[2016-05-24 08:55:34,518 WARN  [org.jboss.modcluster] (ServerService Thread Pool -- 64) MODCLUSTER000031: Could not bind multicast socket to /ff02:0:0:0:0:0:0:a (IPv6 address): Invalid argument; make sure your multicast address is of the same type as the IP stack (IPv4 or IPv6). Multicast socket will not be bound to an address, but this may lead to cross talking (see http://www.jboss.org/community/docs/DOC-9469 for details).
2016-05-24 08:55:41,790 WARN  [org.jgroups.protocols.UDP] (MSC service thread 1-8) could not bind to /ff02:0:0:0:0:0:0:15 (IPv6 address); make sure your mcast_addr is of the same type as the preferred IP stack (IPv4 or IPv6) by checking the value of the system properties java.net.preferIPv4Stack and java.net.preferIPv6Addresses.]
{noformat}

After changing the mod_cluster multicast address to site-local (ffx5), the mod_cluster warning disappeared.
The JGroups warning remains, since JGroups is still using an ff02 address:
{noformat}
04:36:51,745 INFO  [org.jboss.modcluster] (ServerService Thread Pool -- 64) MODCLUSTER000001: Initializing mod_cluster version 1.3.2.Final-redhat-1
04:36:51,815 INFO  [org.jboss.modcluster] (ServerService Thread Pool -- 64) MODCLUSTER000032: Listening to proxy advertisements on /ff05:0:0:0:0:0:0:a:23364
04:36:52,328 INFO  [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-2) WFLYJCA0001: Bound data source [java:jboss/datasources/ExampleDS]
04:36:52,767 INFO  [org.jboss.as.server.deployment.scanner] (MSC service thread 1-1) WFLYDS0013: Started FileSystemDeploymentService for directory /mnt/hudson_workspace/mod_cluster-eap7/jboss-eap-7.0/standalone/deployments
04:36:52,806 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-8) WFLYSRV0027: Starting deployment of "clusterbench.war" (runtime-name: "clusterbench.war")
04:36:53,565 INFO  [org.jboss.ws.common.management] (MSC service thread 1-5) JBWS022052: Starting JBossWS 5.1.3.SP1-redhat-1 (Apache CXF 3.1.4.redhat-1) 
04:36:53,654 WARN  [org.jboss.metadata.parser.jbossweb.JBossWebMetaDataParser] (MSC service thread 1-6) <replication-trigger/> is no longer supported and will be ignored
04:36:55,584 WARN  [org.jgroups.protocols.UDP] (MSC service thread 1-8) could not bind to /ff02:0:0:0:0:0:0:14 (IPv6 address); make sure your mcast_addr is of the same type as the preferred IP stack (IPv4 or IPv6) by checking the value of the system properties java.net.preferIPv4Stack and java.net.preferIPv6Addresses.
{noformat}
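
For reference, the "x2" versus "x5" distinction above is the scope field of the IPv6 multicast address (RFC 4291): the low four bits of the second byte are 2 for link-local and 5 for site-local. Below is a minimal sketch in plain POSIX C that classifies an address this way. The function name mcast_scope is made up for illustration and is not part of mod_cluster or JGroups.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (illustration only): return a label describing the
 * multicast scope of an IPv6 address given in text form.
 */
const char *mcast_scope(const char *text)
{
    struct in6_addr addr;

    /* inet_pton returns 1 only when text is a valid IPv6 address. */
    if (inet_pton(AF_INET6, text, &addr) != 1)
        return "not-ipv6";
    /* Multicast addresses start with the byte 0xff. */
    if (!IN6_IS_ADDR_MULTICAST(&addr))
        return "not-multicast";
    /* Scope 2, e.g. ff02::a */
    if (IN6_IS_ADDR_MC_LINKLOCAL(&addr))
        return "link-local";
    /* Scope 5, e.g. ff05::a */
    if (IN6_IS_ADDR_MC_SITELOCAL(&addr))
        return "site-local";
    return "other";
}
```

With the addresses from the logs above, mcast_scope("ff02::a") yields "link-local" and mcast_scope("ff05::a") yields "site-local".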




> Default AdvertiseBindAddress value should not be NULL (UDP Multicast on Linux systems with more NICs)
> -----------------------------------------------------------------------------------------------------
>
>                 Key: MODCLUSTER-487
>                 URL: https://issues.jboss.org/browse/MODCLUSTER-487
>             Project: mod_cluster
>          Issue Type: Bug
>          Components: Native (httpd modules)
>    Affects Versions: 1.2.11.Final, 1.3.1.Final
>         Environment: Linux, multiple NICs environment 
>            Reporter: Michal Karm Babacek
>            Assignee: Michal Karm Babacek
>            Priority: Critical
>         Attachments: advertise-linux3_x86_64.zip, advertise-windows_x86.zip, Advertize.class
>
>
> Credit where it's due: the issue was first spotted by [~rhatlapa].
> h3. Problem
> It appears that trying to send to all interfaces with {{NULL}} or {{"0.0.0.0"}} -- the default {{bindaddr}} when no {{AdvertiseBindAddress}} is set -- in the following statement actually picks the first non-loopback interface and sends to it.
> {code}
>     if ((rv = apr_sockaddr_info_get(&ma_listen_sa, bindaddr,
>                                     ma_mgroup_sa->family, bindport,
>                                     APR_UNSPEC, pool)) != APR_SUCCESS) {
>         ap_log_error(APLOG_MARK, APLOG_ERR, rv, s,
>                      "mod_advertise: ma_group_join apr_sockaddr_info_get(%s:%d) failed", bindaddr, bindport);
> {code}
> The result is that no datagrams appear on the other interfaces. Surprisingly, this is not deterministic: after dozens or hundreds of messages, a datagram eventually reaches another interface.
> h3. Impact
> Picture this simple scenario: There are two interfaces, e.g. 
> {noformat}
> enp1s0 10.16.88.187
> enp2s0 172.18.0.1
> {noformat}
> listed in this exact order with {{ip addr show}}.
> One has an EAP 7 (Wildfly 10) instance with mod_cluster bound to {{172.18.0.1}} IP address, which implies {{enp2s0}} interface.
> Furthermore, one has an Apache HTTP Server instance with mod_cluster bound to {{172.18.0.1}} IP address, i.e. MCMP VirtualHost and main VirtualHost all Listen on this IP address.
> Result: Without advertising, using an explicit {{proxy-list}}, all is well. MCMP works, requests work, balancing works.
> On the other hand, relying on advertisement, it could take EAP 7 (Wildfly 10) *minutes* to register with the balancer.
> The reason is that the vast majority of UDP Multicast datagrams arrive at {{enp1s0}}, where EAP 7 (Wildfly 10) doesn't see them.
> h3. Reproducer
> Lemme demonstrate with a recently refactored [advertise.c|https://github.com/Karm/mod_cluster/tree/advertise-native-test/test/native/advertise] utility for sending datagrams and the well known [Advertize.java|https://raw.githubusercontent.com/modcluster/mod_cluster/master/test/java/Advertize.java] utility for receiving them.
> For your convenience, here are binaries built from the aforementioned sources:
> * Advertize java utility: [^Advertize.class]
> * advertise native utility (Linux3 x86_64): [^advertise-linux3_x86_64.zip]
> * advertise native utility (Windows x86): [^advertise-windows_x86.zip]
> h3. Demonstration on Linux
> h4. System
> {noformat}
> [mbabacek@perf09 ~]$ ip addr show
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>        valid_lft forever preferred_lft forever
> 2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
>     link/ether 00:18:8b:7a:46:04 brd ff:ff:ff:ff:ff:ff
>     inet 10.16.88.187/21 brd 10.16.95.255 scope global enp1s0
>        valid_lft forever preferred_lft forever
>     inet 10.16.93.253/21 brd 10.16.95.255 scope global secondary enp1s0
>        valid_lft forever preferred_lft forever
> 3: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
>     link/ether 00:18:8b:7a:46:05 brd ff:ff:ff:ff:ff:ff
>     inet 172.17.72.254/19 brd 172.17.95.255 scope global enp2s0
>        valid_lft forever preferred_lft forever
> 4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN 
>     link/ether 02:42:07:ab:74:f9 brd ff:ff:ff:ff:ff:ff
>     inet 172.18.0.1/16 scope global docker0
>        valid_lft forever preferred_lft forever
> {noformat}
> h4. Java
> {noformat}
> [mbabacek@perf09 ~]$ java -version
> openjdk version "1.8.0_71"
> OpenJDK Runtime Environment (build 1.8.0_71-b15)
> OpenJDK 64-Bit Server VM (build 25.71-b15, mixed mode)
> {noformat}
> h4. Advertise SENT
> {noformat}
> [mbabacek@perf09 ~]$ date;./advertise -a 224.0.1.102 -p 33364
> Mon Mar 21 12:39:51 EDT 2016
> UDP Multicast address to send datagrams to. Value: 224.0.1.102
> UDP Multicast port. Value: 33364
> IP address of the NIC to bound to. Value: NULL
> apr_socket_bind on 0.0.0.0:0
> apr_mcast_join on 0.0.0.0:0
> apr_socket_sendto to 224.0.1.102:33364
> {noformat}
> h4. Advertize RECEIVED
> YES (/)
> {noformat}
> [mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364
> Linux like OS
> ready waiting...
> received: Advertize !!! Mon, 21 Mar 2016 16:39:51 GMT
> received from /10.16.88.187:38907
> {noformat}
> YES (/)
> {noformat}
> [mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 10.16.88.187
> Linux like OS
> ready waiting...
> received: Advertize !!! Mon, 21 Mar 2016 16:39:51 GMT
> received from /10.16.88.187:38907
> {noformat}
> NO (x)
> {noformat}
> [mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 172.17.72.254
> Linux like OS
> ready waiting...
> {noformat}
> YES (/)
> {noformat}
> [mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 0.0.0.0
> Linux like OS
> ready waiting...
> received: Advertize !!! Mon, 21 Mar 2016 16:39:51 GMT
> received from /10.16.88.187:38907
> {noformat}
> And now let's take a look at {{172.17.72.254}}, i.e. {{enp2s0}}
> h4. Advertise SENT
> {noformat}
> [mbabacek@perf09 ~]$ date;./advertise -a 224.0.1.102 -p 33364 -n 172.17.72.254
> Mon Mar 21 12:42:57 EDT 2016
> UDP Multicast address to send datagrams to. Value: 224.0.1.102
> UDP Multicast port. Value: 33364
> IP address of the NIC to bound to. Value: 172.17.72.254
> apr_socket_bind on 172.17.72.254:0
> apr_mcast_join on 172.17.72.254:0
> apr_socket_sendto to 224.0.1.102:33364
> {noformat}
> h4. Advertize RECEIVED
> NO (x)
> {noformat}
> [mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364
> Linux like OS
> ready waiting...
> {noformat}
> NO (x)
> {noformat}
> [mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 10.16.88.187
> Linux like OS
> ready waiting...
> {noformat}
> YES (/)
> {noformat}
> [mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 172.17.72.254
> Linux like OS
> ready waiting...
> received: Advertize !!! Mon, 21 Mar 2016 16:42:57 GMT
> received from /172.17.72.254:35452
> {noformat}
> NO (x)
> {noformat}
> [mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 0.0.0.0
> Linux like OS
> ready waiting...
> {noformat}
> h3. Demonstration on Windows
> One could note that the problem doesn't exist on Windows. All interfaces receive advertising.
> h4. Advertise SENT
> {noformat}
> C:\Users\karm\advertise-build
> λ advertise.exe -a 224.0.1.102 -p 33364
> UDP Multicast address to send datagrams to. Value: 224.0.1.102
> UDP Multicast port. Value: 33364
> IP address of the NIC to bound to. Value: NULL
> apr_socket_bind on 0.0.0.0:0
> apr_mcast_join on 0.0.0.0:0
> apr_socket_sendto to 224.0.1.102:33364
> {noformat}
> h4. Advertize RECEIVED
> YES (/)
> {noformat}
> C:\Users\karm\WORKSPACE
> λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364
> ready waiting...
> received: Advertize !!! Mon, 21 Mar 2016 18:07:50 GMT
> received from /192.168.122.52:61805
> {noformat}
> YES (/)
> {noformat}
> C:\Users\karm\WORKSPACE
> λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364 192.168.122.52
> ready waiting...
> received: Advertize !!! Mon, 21 Mar 2016 18:07:50 GMT
> received from /192.168.122.52:61805
> {noformat}
> YES (/)
> {noformat}
> C:\Users\karm\WORKSPACE
> λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364 192.168.122.199
> ready waiting...
> received: Advertize !!! Mon, 21 Mar 2016 18:07:50 GMT
> received from /192.168.122.52:61805
> {noformat}
> h4. Advertise SENT
> {noformat}
> C:\Users\karm\advertise-build
> λ advertise.exe -a 224.0.1.102 -p 33364 -n 192.168.122.199
> UDP Multicast address to send datagrams to. Value: 224.0.1.102
> UDP Multicast port. Value: 33364
> IP address of the NIC to bound to. Value: 192.168.122.199
> apr_socket_bind on 192.168.122.199:0
> apr_mcast_join on 192.168.122.199:0
> apr_socket_sendto to 224.0.1.102:33364
> {noformat}
> h4. Advertize RECEIVED
> YES (/)
> {noformat}
> C:\Users\karm\WORKSPACE
> λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364
> ready waiting...
> received: Advertize !!! Mon, 21 Mar 2016 18:09:55 GMT
> received from /192.168.122.199:52781
> {noformat}
> YES (/)
> {noformat}
> C:\Users\karm\WORKSPACE
> λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364 192.168.122.52
> ready waiting...
> received: Advertize !!! Mon, 21 Mar 2016 18:09:55 GMT
> received from /192.168.122.199:52781
> {noformat}
> YES (/)
> {noformat}
> C:\Users\karm\WORKSPACE
> λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364 192.168.122.199
> ready waiting...
> received: Advertize !!! Mon, 21 Mar 2016 18:09:55 GMT
> received from /192.168.122.199:52781
> {noformat}
> h3. Suggestion
> Ideas? :) [~jfclere], [~rhusar] 
> I suggest defaulting {{bindaddr}} ({{AdvertiseBindAddress}}) to the main_server's address or to that of the MCMP-enabled vhost instead of {{NULL}}. I'll post a PR for evaluation.
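
To illustrate the suggestion quoted above, here is a minimal sketch in plain POSIX C, not the APR calls that mod_advertise actually uses. It shows (a) falling back to a concrete address when no bind address is configured and (b) pinning outgoing IPv4 multicast to the interface that owns that address via IP_MULTICAST_IF. Both function names are hypothetical, and treating "0.0.0.0" the same as NULL is an assumption based on the issue description. This is not the proposed patch.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: choose the address to bind to. When nothing usable
 * is configured (NULL, empty, or the wildcard "0.0.0.0"), use the fallback,
 * e.g. the address the main server or the MCMP vhost listens on.
 */
const char *effective_bind_addr(const char *configured, const char *fallback)
{
    if (configured == NULL || configured[0] == '\0' ||
        strcmp(configured, "0.0.0.0") == 0)
        return fallback;
    return configured;
}

/*
 * Hypothetical helper: make outgoing IPv4 multicast on sock leave through
 * the interface that owns the given IP address.
 * Returns 0 on success and -1 on failure.
 */
int set_mcast_out_if(int sock, const char *ip)
{
    struct in_addr ifaddr;

    /* Reject anything that is not a valid dotted-quad IPv4 address. */
    if (inet_pton(AF_INET, ip, &ifaddr) != 1)
        return -1;
    return setsockopt(sock, IPPROTO_IP, IP_MULTICAST_IF,
                      &ifaddr, sizeof(ifaddr));
}
```

For example, effective_bind_addr(NULL, "172.17.72.254") returns "172.17.72.254", which could then be passed to set_mcast_out_if() on the sending socket before the first datagram goes out.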



--
This message was sent by Atlassian JIRA
(v6.4.11#64026)


