[jboss-jira] [JBoss JIRA] (JGRP-2296) DNS_PING is dropping port values with SRV based service discovery
Eric Thompson (JIRA)
issues at jboss.org
Mon Sep 24 08:50:00 EDT 2018
[ https://issues.jboss.org/browse/JGRP-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Thompson updated JGRP-2296:
--------------------------------
Description:
When using DNS_PING in JGroups 4.0.11 with SRV records, the port from the SRV record is dropped (set to zero) and the default port (7600) is used instead.
I am using this JGroups config:
{code}
<subsystem xmlns="urn:jboss:domain:jgroups:6.0">
    <channels default="ee">
        <channel name="ee" stack="tcp" cluster="ejb"/>
    </channels>
    <stacks>
        <stack name="tcp">
            <transport type="TCP" socket-binding="jgroups-tcp">
                <property name="external_addr">${env.EXTERNAL_ADDR}</property>
            </transport>
            <protocol type="dns.DNS_PING">
                <property name="dns_query">jgroups.${env.DNS_NAME}.svc.cluster.local</property>
                <property name="dns_record_type">SRV</property>
            </protocol>
            <protocol type="MERGE3"/>
            <protocol type="FD_SOCK"/>
            <protocol type="FD_ALL"/>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2"/>
            <protocol type="UNICAST3"/>
            <protocol type="pbcast.STABLE"/>
            <protocol type="pbcast.GMS"/>
            <protocol type="MFC"/>
            <protocol type="FRAG3"/>
        </stack>
    </stacks>
</subsystem>
{code}
I have these service discovery DNS entries:
{code}
$ dig jgroups.dev.auth.example.com.svc.cluster.local SRV
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.68.rc1.58.amzn1 <<>> jgroups.dev.auth.example.com.svc.cluster.local SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16690
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;jgroups.dev.auth.example.com.svc.cluster.local. IN SRV
;; ANSWER SECTION:
jgroups.dev.auth.example.com.svc.cluster.local. 10 IN SRV 1 1 32921 9ec82e3f-3a0e-4e30-b785-17879c63cd7d.jgroups.dev.auth.example.com.svc.cluster.local.
jgroups.dev.auth.example.com.svc.cluster.local. 10 IN SRV 1 1 32923 60b5a820-9678-4bd2-84c6-00061a52bde0.jgroups.dev.auth.example.com.svc.cluster.local.
jgroups.dev.auth.example.com.svc.cluster.local. 10 IN SRV 1 1 32915 9d9d78d0-8919-4b91-9df8-2e4e65afedae.jgroups.dev.auth.example.com.svc.cluster.local.
jgroups.dev.auth.example.com.svc.cluster.local. 10 IN SRV 1 1 32917 161f3d66-f1e3-46f4-a44f-ebda925a25c6.jgroups.dev.auth.example.com.svc.cluster.local.
;; Query time: 2 msec
;; SERVER: 10.42.3.2#53(10.42.3.2)
;; WHEN: Fri Sep 21 01:45:44 2018
;; MSG SIZE rcvd: 481
{code}
But I get this in the logs when running Keycloak in a standalone cluster:
{code}
17:45:10,121 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Performing initial discovery
17:45:10,154 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Entries collected from DNS: [10.42.3.56:0, 10.42.3.56:0, 10.42.3.44:0, 10.42.3.44:0]
17:45:10,155 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.56:0). Replacing with default Transport port: 7600
17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.56:0). Replacing with default Transport port: 7600
17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.44:0). Replacing with default Transport port: 7600
17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.44:0). Replacing with default Transport port: 7600
17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Performing discovery of the following hosts [10.42.3.56:7600, 10.42.3.44:7600, e200a617bf7a]
17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) e200a617bf7a: sending discovery request to 10.42.3.56:7600
17:45:10,160 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) e200a617bf7a: sending discovery request to 10.42.3.44:7600
17:45:10,160 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-10,ejb,e200a617bf7a) Received discovery from: e200a617bf7a, IP: 10.42.3.44:7600
17:45:10,161 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) e200a617bf7a: sending discovery request to e200a617bf7a
17:45:10,162 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-11,ejb,e200a617bf7a) Received discovery from: e200a617bf7a, IP: 10.42.3.44:7600
{code}
As you can see, it is resolving the DNS addresses but discarding the ports.
To be clear, in this example 32923 is the port (e.g.
1 1 32923 60b5a820-9678-4bd2-84c6-00061a52bde0.jgroups.dev.auth.example.com.svc.cluster.local).
These are dynamic host ports mapped to container port 7600 so that more Keycloak containers can run on each instance.
{code}
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f67e39f8f403 datadog/agent:latest-jmx "/init" 8 hours ago Up 8 hours (healthy) 8125/udp, 8126/tcp ecs-auth-service-dev-26-datadog-agent-a2b7f783ddd0ba9cf601
bbb12f0c43a5 233747045000.dkr.ecr.us-east-2.amazonaws.com/ops/keycloak:latest "/opt/jboss/tools/do…" 8 hours ago Up 8 hours 0.0.0.0:32923->7600/tcp, 0.0.0.0:32922->8080/tcp ecs-auth-service-dev-26-keycloak-f4bd8f8dca9fd4cd4f00
932cad7c4fb9 datadog/agent:latest-jmx "/init" 8 hours ago Up 8 hours (healthy) 8125/udp, 8126/tcp ecs-auth-service-dev-26-datadog-agent-baa38a98ccaddea6f501
e200a617bf7a 233747045000.dkr.ecr.us-east-2.amazonaws.com/ops/keycloak:latest "/opt/jboss/tools/do…" 8 hours ago Up 8 hours 0.0.0.0:32921->7600/tcp, 0.0.0.0:32920->8080/tcp ecs-auth-service-dev-26-keycloak-e6f398e6cc8db5b5f101
73bc0b863c73 amazon/amazon-ecs-agent:latest "/agent" 2 days ago Up 2 days ecs-agent
{code}
This seems like it might be where ports are getting lost:
https://github.com/belaban/JGroups/blob/07060c3ba6e52ad4aad3ac799c2bc95ffd2fe7ff/src/org/jgroups/protocols/dns/DefaultDNSResolver.java#L84
I don't see the port number being extracted from the SRV entry and appended to the IP returned from resolveAEntries.
Let me know if I am missing any details. This is a major blocker for development.
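For illustration only, here is a minimal Java sketch of the step that appears to be missing: reading the port out of the SRV record data so it can be paired with the address resolved for the target. SrvRecordSketch, HostPort, and parse are hypothetical names, not JGroups APIs, and the sketch assumes the record data arrives as a single "priority weight port target" string, as in the dig output above. It is not a proposed patch.

```java
/** Hypothetical helper (not part of JGroups): keeps the SRV port with its target. */
final class SrvRecordSketch {

    /** Target host and port taken from one SRV record. */
    static final class HostPort {
        final String host;
        final int port;

        HostPort(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return host + ":" + port;
        }
    }

    /**
     * Parses SRV record data in the assumed "priority weight port target" form,
     * e.g. "1 1 32923 host.example.".
     */
    static HostPort parse(String srvData) {
        String[] fields = srvData.trim().split("\\s+");
        if (fields.length != 4) {
            throw new IllegalArgumentException("Unexpected SRV data: " + srvData);
        }
        // The third field is the port; this is the value that ends up as 0 in the logs.
        int port = Integer.parseInt(fields[2]);
        String target = fields[3];
        // Drop the trailing dot of a fully qualified name.
        if (target.endsWith(".")) {
            target = target.substring(0, target.length() - 1);
        }
        return new HostPort(target, port);
    }

    public static void main(String[] args) {
        HostPort hp = parse("1 1 32923 60b5a820-9678-4bd2-84c6-00061a52bde0"
                + ".jgroups.dev.auth.example.com.svc.cluster.local.");
        // Prints the target name followed by ":32923".
        System.out.println(hp);
    }
}
```

With something along these lines, the resolver could look up the A record for hp.host and keep hp.port, instead of returning the address with port 0.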
was:
Using DNS_PING in Jgroups 4.0.11 and SRV records the port from the SRV record is being dropped (set to zero) and the default is used instead (7600).
I am using this Jgroups config:
{code}
<subsystem xmlns="urn:jboss:domain:jgroups:6.0">
<channels default="ee">
<channel name="ee" stack="tcp" cluster="ejb"/>
</channels>
<stacks>
<stack name="tcp">
<transport type="TCP" socket-binding="jgroups-tcp">
<property name="external_addr">${env.EXTERNAL_ADDR}</property>
</transport>
<protocol type="dns.DNS_PING">
<property name="dns_query">
jgroups.${env.DNS_NAME}.svc.cluster.local
</property>
<property name="dns_record_type">
SRV
</property>
</protocol>
<protocol type="MERGE3"/>
<protocol type="FD_SOCK"/>
<protocol type="FD_ALL"/>
<protocol type="VERIFY_SUSPECT"/>
<protocol type="pbcast.NAKACK2"/>
<protocol type="UNICAST3"/>
<protocol type="pbcast.STABLE"/>
<protocol type="pbcast.GMS"/>
<protocol type="MFC"/>
<protocol type="FRAG3"/>
</stack>
</stacks>
</subsystem>
{code}
I have these service discovery DNS entries
{code}
$ dig jgroups.dev.auth.sonatype.com.svc.cluster.local SRV
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.68.rc1.58.amzn1 <<>> jgroups.dev.auth.sonatype.com.svc.cluster.local SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16690
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;jgroups.dev.auth.sonatype.com.svc.cluster.local. IN SRV
;; ANSWER SECTION:
jgroups.dev.auth.sonatype.com.svc.cluster.local. 10 IN SRV 1 1 32921 9ec82e3f-3a0e-4e30-b785-17879c63cd7d.jgroups.dev.auth.sonatype.com.svc.cluster.local.
jgroups.dev.auth.sonatype.com.svc.cluster.local. 10 IN SRV 1 1 32923 60b5a820-9678-4bd2-84c6-00061a52bde0.jgroups.dev.auth.sonatype.com.svc.cluster.local.
jgroups.dev.auth.sonatype.com.svc.cluster.local. 10 IN SRV 1 1 32915 9d9d78d0-8919-4b91-9df8-2e4e65afedae.jgroups.dev.auth.sonatype.com.svc.cluster.local.
jgroups.dev.auth.sonatype.com.svc.cluster.local. 10 IN SRV 1 1 32917 161f3d66-f1e3-46f4-a44f-ebda925a25c6.jgroups.dev.auth.sonatype.com.svc.cluster.local.
;; Query time: 2 msec
;; SERVER: 10.42.3.2#53(10.42.3.2)
;; WHEN: Fri Sep 21 01:45:44 2018
;; MSG SIZE rcvd: 481
{code}
But I get this in the logs when running Keycloak in standalone cluster:
{code}
17:45:10,121 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Performing initial discovery
17:45:10,154 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Entries collected from DNS: [10.42.3.56:0, 10.42.3.56:0, 10.42.3.44:0, 10.42.3.44:0]
17:45:10,155 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.56:0). Replacing with default Transport port: 7600
17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.56:0). Replacing with default Transport port: 7600
17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.44:0). Replacing with default Transport port: 7600
17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Discovered IP Address with port 0 (10.42.3.44:0). Replacing with default Transport port: 7600
17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) Performing discovery of the following hosts [10.42.3.56:7600, 10.42.3.44:7600, e200a617bf7a]
17:45:10,159 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) e200a617bf7a: sending discovery request to 10.42.3.56:7600
17:45:10,160 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) e200a617bf7a: sending discovery request to 10.42.3.44:7600
17:45:10,160 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-10,ejb,e200a617bf7a) Received discovery from: e200a617bf7a, IP: 10.42.3.44:7600
17:45:10,161 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-3,null,null) e200a617bf7a: sending discovery request to e200a617bf7a
17:45:10,162 DEBUG [org.jgroups.protocols.dns.DNS_PING] (thread-11,ejb,e200a617bf7a) Received discovery from: e200a617bf7a, IP: 10.42.3.44:7600
{code}
As you can see it is resolving the DNS addresses, but discarding the ports.
To be clear, in this example 32923 ids the port (eg:
1 1 32923 60b5a820-9678-4bd2-84c6-00061a52bde0.jgroups.dev.auth.sonatype.com.svc.cluster.local).
These are dynamic ports mapped to port 7600 in order to put more Keycloak containers on each instance.
{code}
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f67e39f8f403 datadog/agent:latest-jmx "/init" 8 hours ago Up 8 hours (healthy) 8125/udp, 8126/tcp ecs-auth-service-dev-26-datadog-agent-a2b7f783ddd0ba9cf601
bbb12f0c43a5 233747045000.dkr.ecr.us-east-2.amazonaws.com/ops/keycloak:latest "/opt/jboss/tools/do…" 8 hours ago Up 8 hours 0.0.0.0:32923->7600/tcp, 0.0.0.0:32922->8080/tcp ecs-auth-service-dev-26-keycloak-f4bd8f8dca9fd4cd4f00
932cad7c4fb9 datadog/agent:latest-jmx "/init" 8 hours ago Up 8 hours (healthy) 8125/udp, 8126/tcp ecs-auth-service-dev-26-datadog-agent-baa38a98ccaddea6f501
e200a617bf7a 233747045000.dkr.ecr.us-east-2.amazonaws.com/ops/keycloak:latest "/opt/jboss/tools/do…" 8 hours ago Up 8 hours 0.0.0.0:32921->7600/tcp, 0.0.0.0:32920->8080/tcp ecs-auth-service-dev-26-keycloak-e6f398e6cc8db5b5f101
73bc0b863c73 amazon/amazon-ecs-agent:latest "/agent" 2 days ago Up 2 days ecs-agent
{code}
This seems like it might be where ports are getting lost:
https://github.com/belaban/JGroups/blob/07060c3ba6e52ad4aad3ac799c2bc95ffd2fe7ff/src/org/jgroups/protocols/dns/DefaultDNSResolver.java#L84
I don't see the port number being extracted from the SRV entry and appended to the IP returned from resolveAEntries.
Let me know if I am missing any details. This is a major blocker for development.
> DNS_PING is dropping port values with SRV based service discovery
> -----------------------------------------------------------------
>
> Key: JGRP-2296
> URL: https://issues.jboss.org/browse/JGRP-2296
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.0.11
> Environment: JGroups version 4.0.11.Final
> Used in Keycloak 4.4.0
> Deployed as Jboss based Docker container from jboss/keycloak into AWS ECS
> Reporter: Eric Thompson
> Assignee: Bela Ban
> Priority: Blocker
> Fix For: 4.0.15
>
>
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)