Bela Ban closed JGRP-2363.
--------------------------
Resolution: Out of Date
DNS Ping cannot lookup SRV record for service port
--------------------------------------------------
Key: JGRP-2363
URL: https://issues.redhat.com/browse/JGRP-2363
Project: JGroups
Issue Type: Bug
Affects Versions: 4.0.20
Reporter: Howard Gao
Assignee: Bela Ban
Priority: Major
Attachments: App2.java
I have a problem getting the service port through the DNS lookup in DNS_PING.
In my OpenShift environment, the JNDI DNS lookup does not seem to retrieve the
correct SRV record from the OpenShift DNS server. Ref:
https://github.com/jboss-openshift/openshift-ping/blob/1.2.1.Final/dns/sr...
For example, here is the ping service:
apiVersion: v1
kind: Service
metadata:
  annotations:
    description: The JGroups ping port for clustering.
    service.alpha.kubernetes.io/tolerate-unready-endpoints: 'true'
  labels:
    application: application0
    template: amq-broker-73-persistence-clustered
    xpaas: 1.4.16
  name: application0-ping
spec:
  clusterIP: None
  publishNotReadyAddresses: true
  ports:
  - port: 8888
    protocol: TCP
    name: jgroup-port
    targetPort: 8888
  selector:
    deploymentConfig: application0-amq
After deploying the service, I deployed an application pod
with the JGroups DNS_PING protocol loaded. The relevant
part of the JGroups XML configuration looks like this:
<config> ... <openshift.DNS_PING timeout="3000"
serviceName="application0-ping" /> ... </config>
After my application pod reached the running state, I checked the log
and found a warning message from DNS_PING:
2019-07-22 04:16:59,600 INFO [org.openshift.ping.common.Utils] 3 attempt(s) with a 1000ms
sleep to execute [GetServicePort] failed. Last failure was
[java.lang.NullPointerException: null]
2019-07-22 04:16:59,601 WARNING [org.jgroups.protocols.openshift.DNS_PING] No DNS SRV
record found for service [application0-ping]
After some debugging, it turned out that the DNS lookup for the name
"_tcp.application0-ping" returned null.
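For reference, the kind of lookup involved can be sketched roughly as follows. This is a minimal illustration written for this report, not the actual openshift-ping code: the class name SrvLookupSketch and both method names are made up, and it only assumes the JDK's built-in JNDI DNS provider (com.sun.jndi.dns.DnsContextFactory) with the default "dns:" provider URL.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Hypothetical sketch; not the code used by DNS_PING itself.
public class SrvLookupSketch {

    /** Extracts the port (third field) from SRV record data such as "10 100 8888 host.". */
    static int parsePort(String srvData) {
        String[] fields = srvData.trim().split("\\s+");
        if (fields.length < 4) {
            throw new IllegalArgumentException("Not SRV record data: " + srvData);
        }
        return Integer.parseInt(fields[2]);
    }

    /**
     * Queries the SRV records for the given name through the JNDI DNS provider.
     * Returns -1 if the answer carries no SRV attribute; a name the server does
     * not know may instead surface as a NamingException.
     */
    static int lookupServicePort(String name) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.dns.DnsContextFactory");
        env.put(Context.PROVIDER_URL, "dns:"); // use the platform's configured name servers
        DirContext ctx = new InitialDirContext(env);
        try {
            Attributes attrs = ctx.getAttributes(name, new String[] {"SRV"});
            Attribute srv = attrs.get("SRV");
            if (srv == null || srv.size() == 0) {
                return -1;
            }
            // Only the first record is inspected in this sketch.
            return parsePort(srv.get(0).toString());
        } finally {
            ctx.close();
        }
    }

    public static void main(String[] args) throws NamingException {
        String name = args.length > 0 ? args[0] : "_tcp.application0-ping";
        System.out.println(name + " -> port " + lookupServicePort(name));
    }
}
```

Running it inside the pod with the short name and then with the fully qualified name as the argument would allow the two cases to be compared directly.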
However, if I log into the application pod and run nslookup, it returns the correct
record:
sh-5.0# nslookup -type=srv _tcp.application0-ping
Server: 10.74.177.77
Address: 10.74.177.77#53
_tcp.application0-ping.default.svc.cluster.local service = 10 100 8888
44c84e52.application0-ping.default.svc.cluster.local.
The record shows the fully qualified name, which is
_tcp.application0-ping.default.svc.cluster.local
If I then pass the fully qualified name to the application, it can query the SRV
record successfully.
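A possible workaround can be sketched under one unconfirmed assumption: that the short name fails because the resolver search domains (the "search" line of /etc/resolv.conf) are not applied to it, so the caller has to append them itself. The class name SearchDomainSketch, its methods, and the sample resolv.conf contents below are all hypothetical and were not taken from my pod.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the resolv.conf text used in main is an assumed example.
public class SearchDomainSketch {

    /** Returns the domains from the last "search" line of resolv.conf-style text. */
    static List<String> searchDomains(String resolvConf) {
        List<String> domains = new ArrayList<>();
        for (String line : resolvConf.split("\\R")) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length > 1 && tokens[0].equals("search")) {
                domains.clear(); // a later search line replaces an earlier one
                for (int i = 1; i < tokens.length; i++) {
                    domains.add(tokens[i]);
                }
            }
        }
        return domains;
    }

    /** Builds the names to try: the short name under each search domain, then as given. */
    static List<String> candidates(String shortName, List<String> domains) {
        List<String> names = new ArrayList<>();
        for (String domain : domains) {
            names.add(shortName + "." + domain);
        }
        names.add(shortName);
        return names;
    }

    public static void main(String[] args) {
        // Assumed contents, for illustration only.
        String resolvConf = "nameserver 10.74.177.77\n"
                + "search default.svc.cluster.local svc.cluster.local cluster.local\n";
        for (String name : candidates("_tcp.application0-ping", searchDomains(resolvConf))) {
            System.out.println(name);
        }
    }
}
```

With the assumed search line, the first candidate is _tcp.application0-ping.default.svc.cluster.local, which matches the name that nslookup returned. Each candidate could then be tried in turn until one yields an SRV record.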
I have no idea why my application can't query the record using the short name
(i.e. _tcp.application0-ping). Could it be a configuration issue with DNS_PING?
My OpenShift environment details are:
oc v3.11.117
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO
and the Java version used in the pod:
sh-5.0# java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
OpenJDK 64-Bit Server VM (build 25.212-b04, mixed mode)
and the base OS is Fedora 30.