[jboss-jira] [JBoss JIRA] (JGRP-2296) DNS_PING is dropping port values with SRV based service discovery
Eric Thompson (JIRA)
issues at jboss.org
Thu Sep 20 22:59:00 EDT 2018
[ https://issues.jboss.org/browse/JGRP-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Thompson updated JGRP-2296:
--------------------------------
Steps to Reproduce:
1. Set up a jboss/keycloak HA cluster in AWS ECS with Service Discovery and dynamic port mapping, using the JGroups config below
2. Set logging to DEBUG
3. You will see that the cluster never forms and that the default port shown below is used instead, which does not work.
was:
1. Set up a jboss/keycloak HA cluster using the Jgroups config above in AWS ECS with Service discovery and dynamic port mapping
2. Set logging to DEBUG
3. You will see that the cluster never forms and the above port defaults are used (but they don'g work)
> DNS_PING is dropping port values with SRV based service discovery
> -----------------------------------------------------------------
>
> Key: JGRP-2296
> URL: https://issues.jboss.org/browse/JGRP-2296
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.0.11
> Environment: JGroups version 4.0.11.Final
> Used in Keycloak 4.4.0
> Deployed as a JBoss-based Docker container from jboss/keycloak into AWS ECS
> Reporter: Eric Thompson
> Assignee: Bela Ban
> Priority: Blocker
>
> When using DNS_PING in JGroups 4.0.11 with SRV records, the port from each SRV record is dropped (set to zero) and the default transport port (7600) is used instead.
> I am using this JGroups config:
> {code}
> <subsystem xmlns="urn:jboss:domain:jgroups:6.0">
>     <channels default="ee">
>         <channel name="ee" stack="tcp" cluster="ejb"/>
>     </channels>
>     <stacks>
>         <stack name="tcp">
>             <transport type="TCP" socket-binding="jgroups-tcp">
>                 <property name="external_addr">${env.EXTERNAL_ADDR}</property>
>             </transport>
>             <protocol type="dns.DNS_PING">
>                 <property name="dns_query">
>                     jgroups.${env.DNS_NAME}.svc.cluster.local
>                 </property>
>                 <property name="dns_record_type">
>                     SRV
>                 </property>
>             </protocol>
>             <protocol type="MERGE3"/>
>             <protocol type="FD_SOCK"/>
>             <protocol type="FD_ALL"/>
>             <protocol type="VERIFY_SUSPECT"/>
>             <protocol type="pbcast.NAKACK2"/>
>             <protocol type="UNICAST3"/>
>             <protocol type="pbcast.STABLE"/>
>             <protocol type="pbcast.GMS"/>
>             <protocol type="MFC"/>
>             <protocol type="FRAG3"/>
>         </stack>
>     </stacks>
> </subsystem>
> {code}
> I have these service discovery DNS entries:
> {code}
> $ dig jgroups.dev.auth.sonatype.com.svc.cluster.local SRV
> ; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.68.rc1.58.amzn1 <<>> jgroups.dev.auth.sonatype.com.svc.cluster.local SRV
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16690
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0
> ;; QUESTION SECTION:
> ;jgroups.dev.auth.sonatype.com.svc.cluster.local. IN SRV
> ;; ANSWER SECTION:
> jgroups.dev.auth.sonatype.com.svc.cluster.local. 10 IN SRV 1 1 32921 9ec82e3f-3a0e-4e30-b785-17879c63cd7d.jgroups.dev.auth.sonatype.com.svc.cluster.local.
> jgroups.dev.auth.sonatype.com.svc.cluster.local. 10 IN SRV 1 1 32923 60b5a820-9678-4bd2-84c6-00061a52bde0.jgroups.dev.auth.sonatype.com.svc.cluster.local.
> jgroups.dev.auth.sonatype.com.svc.cluster.local. 10 IN SRV 1 1 32915 9d9d78d0-8919-4b91-9df8-2e4e65afedae.jgroups.dev.auth.sonatype.com.svc.cluster.local.
> jgroups.dev.auth.sonatype.com.svc.cluster.local. 10 IN SRV 1 1 32917 161f3d66-f1e3-46f4-a44f-ebda925a25c6.jgroups.dev.auth.sonatype.com.svc.cluster.local.
> ;; Query time: 2 msec
> ;; SERVER: 10.42.3.2#53(10.42.3.2)
> ;; WHEN: Fri Sep 21 01:45:44 2018
> ;; MSG SIZE rcvd: 481
> {code}
> But this is what I get in the logs when running Keycloak as a standalone cluster:
> {code}
> 17:45:10,121 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Performing initial discovery
> 17:45:10,154 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Entries collected from DNS: [10.42.3.56:0, 10.42.3.56:0, 10.42.3.44:0, 10.42.3.44:0]
> 17:45:10,155 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.56:0). Replacing with default Transport port: 7600
> 17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.56:0). Replacing with default Transport port: 7600
> 17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.44:0). Replacing with default Transport port: 7600
> 17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.44:0). Replacing with default Transport port: 7600
> 17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Performing discovery of the following hosts [10.42.3.56:7600, 10.42.3.44:7600, e200a617bf7a]
> 17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) e200a617bf7a: sending discovery request to 10.42.3.56:7600
> 17:45:10,160 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) e200a617bf7a: sending discovery request to 10.42.3.44:7600
> 17:45:10,160 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-10,ejb,e200a617bf7a) Received discovery from: e200a617bf7a, IP: 10.42.3.44:7600
> 17:45:10,161 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) e200a617bf7a: sending discovery request to e200a617bf7a
> 17:45:10,162 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-11,ejb,e200a617bf7a) Received discovery from: e200a617bf7a, IP: 10.42.3.44:7600
> {code}
> As you can see, it is resolving the DNS addresses but discarding the ports.
> To be clear, in this example 32923 is the port (e.g.
> 1 1 32923 60b5a820-9678-4bd2-84c6-00061a52bde0.jgroups.dev.auth.sonatype.com.svc.cluster.local).
> These are dynamic host ports mapped to container port 7600, so that more than one Keycloak container can run on each instance.
> {code}
> $ docker ps
> CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
> f67e39f8f403 datadog/agent:latest-jmx "/init" 8 hours ago Up 8 hours (healthy) 8125/udp, 8126/tcp ecs-auth-service-dev-26-datadog-agent-a2b7f783ddd0ba9cf601
> bbb12f0c43a5 233747045000.dkr.ecr.us-east-2.amazonaws.com/ops/keycloak:latest "/opt/jboss/tools/do…" 8 hours ago Up 8 hours 0.0.0.0:32923->7600/tcp, 0.0.0.0:32922->8080/tcp ecs-auth-service-dev-26-keycloak-f4bd8f8dca9fd4cd4f00
> 932cad7c4fb9 datadog/agent:latest-jmx "/init" 8 hours ago Up 8 hours (healthy) 8125/udp, 8126/tcp ecs-auth-service-dev-26-datadog-agent-baa38a98ccaddea6f501
> e200a617bf7a 233747045000.dkr.ecr.us-east-2.amazonaws.com/ops/keycloak:latest "/opt/jboss/tools/do…" 8 hours ago Up 8 hours 0.0.0.0:32921->7600/tcp, 0.0.0.0:32920->8080/tcp ecs-auth-service-dev-26-keycloak-e6f398e6cc8db5b5f101
> 73bc0b863c73 amazon/amazon-ecs-agent:latest "/agent" 2 days ago Up 2 days ecs-agent
> {code}
> This seems like it might be where ports are getting lost:
> https://github.com/belaban/JGroups/blob/07060c3ba6e52ad4aad3ac799c2bc95ffd2fe7ff/src/org/jgroups/protocols/dns/DefaultDNSResolver.java#L84
> Let me know if I am missing any details. This is a major blocker for development.
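
For reference, the handling one would expect for the SRV answers quoted above can be sketched in Java. This is only a minimal illustration, not the JGroups implementation: `SrvRecordSketch`, `parse`, and `DEFAULT_PORT` are hypothetical names, and the sketch assumes the record data arrives as the four whitespace-separated fields that dig prints (priority, weight, port, target).

```java
import java.net.InetSocketAddress;

// Hypothetical helper for illustration only; not part of JGroups.
final class SrvRecordSketch {

    // Assumed fallback, matching the default transport port seen in the logs.
    static final int DEFAULT_PORT = 7600;

    /**
     * Parses SRV record data of the form "priority weight port target"
     * and keeps the port instead of discarding it.
     */
    static InetSocketAddress parse(String rdata) {
        String[] fields = rdata.trim().split("\\s+");
        if (fields.length != 4) {
            throw new IllegalArgumentException("Expected 4 SRV fields, got: " + rdata);
        }
        int port = Integer.parseInt(fields[2]);   // third field is the port
        String target = fields[3];
        if (target.endsWith(".")) {
            // Drop the trailing dot of the fully qualified name.
            target = target.substring(0, target.length() - 1);
        }
        if (port == 0) {
            // Fall back only when the record itself carries no port.
            port = DEFAULT_PORT;
        }
        // Unresolved on purpose: no DNS lookup happens in this sketch.
        return InetSocketAddress.createUnresolved(target, port);
    }

    public static void main(String[] args) {
        InetSocketAddress member = parse(
            "1 1 32921 9ec82e3f-3a0e-4e30-b785-17879c63cd7d"
            + ".jgroups.dev.auth.sonatype.com.svc.cluster.local.");
        System.out.println(member.getHostString() + ":" + member.getPort());
    }
}
```

Applied to the first answer in the dig output, this keeps port 32921 rather than 0, so the 7600 fallback would only apply to a record that really has no port.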
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)