[keycloak-user] Keycloak in HA mode on Kubernetes fails with "invalid_code" when requesting tokens

Jody H j9dy1g at gmail.com
Thu Apr 25 05:58:15 EDT 2019


Hi,
the Helm chart has not switched to KUBE_PING yet; it uses DNS_PING with a
headless service in Kubernetes.
A snippet from the statefulset.yaml:
- name: JGROUPS_DISCOVERY_PROTOCOL
  value: "dns.DNS_PING"
- name: JGROUPS_DISCOVERY_PROPERTIES
  value: "dns_query={{ template "keycloak.fullname" . }}-headless.{{ .Release.Namespace }}.svc.{{ .Values.clusterDomain }}"
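
DNS_PING relies on the headless service resolving to the pod IPs of all
cluster members, so discovery can be sanity-checked from outside (a quick
sketch; "iam-keycloak-headless" is my guess at the rendered service name and
<namespace> is a placeholder, both need adapting to your release):

kubectl get endpoints iam-keycloak-headless -n <namespace>
# expect one address per pod, i.e. all three pod IPs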


In the logs (these are from node keycloak-0), it looks like the nodes join
the cluster successfully; the view [iam-keycloak-2|8] (3) lists all three
members:
15:23:52,017 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000078: Starting JGroups channel ejb
15:23:52,017 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000078: Starting JGroups channel ejb
15:23:52,020 INFO  [org.infinispan.CLUSTER] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [iam-keycloak-2|8] (3) [iam-keycloak-2, iam-keycloak-1, iam-keycloak-0]
15:23:52,020 INFO  [org.infinispan.CLUSTER] (MSC service thread 1-2) ISPN000094: Received new cluster view for channel ejb: [iam-keycloak-2|8] (3) [iam-keycloak-2, iam-keycloak-1, iam-keycloak-0]
15:23:52,028 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000079: Channel ejb local address is iam-keycloak-0, physical addresses are [10.xxx.x.149:7600]
15:23:52,028 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000079: Channel ejb local address is iam-keycloak-0, physical addresses are [10.xxx.x.149:7600]
15:23:52,084 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000078: Starting JGroups channel ejb
15:23:52,084 INFO  [org.infinispan.CLUSTER] (MSC service thread 1-2) ISPN000094: Received new cluster view for channel ejb: [iam-keycloak-2|8] (3) [iam-keycloak-2, iam-keycloak-1, iam-keycloak-0]
15:23:52,085 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000079: Channel ejb local address is iam-keycloak-0, physical addresses are [10.xxx.x.149:7600]
15:23:52,090 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000078: Starting JGroups channel ejb
15:23:52,091 INFO  [org.infinispan.CLUSTER] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [iam-keycloak-2|8] (3) [iam-keycloak-2, iam-keycloak-1, iam-keycloak-0]
15:23:52,092 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000079: Channel ejb local address is iam-keycloak-0, physical addresses are [10.xxx.x.149:7600]
15:23:52,118 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000078: Starting JGroups channel ejb
15:23:52,118 INFO  [org.infinispan.CLUSTER] (MSC service thread 1-2) ISPN000094: Received new cluster view for channel ejb: [iam-keycloak-2|8] (3) [iam-keycloak-2, iam-keycloak-1, iam-keycloak-0]
15:23:52,119 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000079: Channel ejb local address is iam-keycloak-0, physical addresses are [10.xxx.x.149:7600]
15:23:52,277 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 57) WFLYCLINF0002: Started authorization cache from keycloak container
15:23:52,277 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 56) WFLYCLINF0002: Started realms cache from keycloak container
15:23:52,277 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 61) WFLYCLINF0002: Started users cache from keycloak container
15:23:52,277 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 62) WFLYCLINF0002: Started keys cache from keycloak container
15:23:52,587 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 63) WFLYCLINF0002: Started clientSessions cache from keycloak container
15:23:52,588 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 53) WFLYCLINF0002: Started offlineSessions cache from keycloak container
15:23:52,593 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 52) WFLYCLINF0002: Started work cache from keycloak container
15:23:52,593 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 58) WFLYCLINF0002: Started sessions cache from keycloak container
15:23:52,593 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 59) WFLYCLINF0002: Started authenticationSessions cache from keycloak container
15:23:52,593 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 55) WFLYCLINF0002: Started actionTokens cache from keycloak container
15:23:52,593 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 60) WFLYCLINF0002: Started offlineClientSessions cache from keycloak container
15:23:52,593 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 54) WFLYCLINF0002: Started loginFailures cache from keycloak container
15:23:52,593 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 64) WFLYCLINF0002: Started client-mappings cache from ejb container
15:23:53,596 INFO  [org.keycloak.services] (ServerService Thread Pool -- 64) KC-SERVICES0001: Loading config from standalone.xml or domain.xml
15:23:53,926 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 64) WFLYCLINF0002: Started realmRevisions cache from keycloak container
15:23:53,930 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 64) WFLYCLINF0002: Started userRevisions cache from keycloak container
15:23:53,934 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 64) WFLYCLINF0002: Started authorizationRevisions cache from keycloak container

I can see that the Helm chart also sets these clustering values when in HA
mode (
https://github.com/helm/charts/blob/master/stable/keycloak/values.yaml#L148-L149
-->
https://github.com/helm/charts/blob/master/stable/keycloak/scripts/ha.cli ):
/subsystem=infinispan/cache-container=keycloak/distributed-cache=sessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=authenticationSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=offlineSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=clientSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=offlineClientSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=loginFailures:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})

/subsystem=jgroups/channel=ee:write-attribute(name=stack, value=tcp)
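
As a side note, whether those write-attribute calls actually took effect can
be checked against a running pod with jboss-cli (a sketch, assuming the
/opt/jboss/keycloak install path used by the official Docker image):

kubectl exec -it iam-keycloak-0 -- /opt/jboss/keycloak/bin/jboss-cli.sh --connect \
  --command='/subsystem=infinispan/cache-container=keycloak/distributed-cache=sessions:read-attribute(name=owners)'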

I'm not sure about the default value of "2" in the ${env.CACHE_OWNERS:2}
expression (it resolves to the CACHE_OWNERS environment variable, falling
back to 2 when the variable is unset). CACHE_OWNERS is not set in the default
values.yaml, so I am effectively running with owners=2, i.e. each cache entry
is kept on two of the three nodes.
Because I have a 3-instance cluster, wouldn't I be better off with a value
of "3" in all of those configurations?

Any further ideas? How did you fix the issue in your configuration?
Thanks


On Wed, Apr 24, 2019 at 19:44, 张立强 <liqiang at fit2cloud.com> wrote:

> Hi,
>
> I ran into exactly the same issue a while ago, though I wasn't using
> KUBE_PING.
> I suspect your pods are not clustered properly; to verify this, check the
> logs, which will show you all the members in the cluster.
>
> BTW, this repo might give you some clues:
> https://github.com/zhangliqiang/keycloak-cluster-setup-and-configuration
>
> Jody H <j9dy1g at gmail.com> wrote on Thursday, April 25, 2019 at 1:07 AM:
>
>> Hi,
>>
>> we are having trouble generating tokens with the authorization code flow
>> in our Keycloak 5.0.0 cluster.
>> Some information about the cluster:
>> 1) We have a cluster with 3 instances in Kubernetes, deployed by the
>> Keycloak Helm Chart (
>> https://github.com/helm/charts/tree/master/stable/keycloak)
>> 2) I can see some Infinispan activity in the logs when the cluster starts
>> up. I have checked that the shell script executed on startup contains the
>> "-c standalone-ha.xml" switch, but I cannot find any mention of the string
>> "standalone-ha.xml" in the log output.
>> 3) Our cluster is load-balanced by HAProxy
>> 4) The webservice we want to access is secured by Keycloak Gatekeeper (
>> https://github.com/keycloak/keycloak-gatekeeper)
>>
>> When using a browser to log in to Keycloak-secured websites (i.e. websites
>> that use the Keycloak cluster to perform the OIDC authorization code flow
>> and authenticate our users), we have not seen any problems so far. The
>> Keycloak Gatekeeper "proxy" redirects to Keycloak when no login cookie is
>> present, trades in the code for ID, access, and refresh tokens, and passes
>> the access_token to the reverse-proxied website after a successful login.
>>
>> To test our APIs, we would like to use Postman.
>> However, when using Postman with its built-in OAuth 2.0 authentication, we
>> see a problem that is reproducible on 4 laptops in the same LAN as the
>> Keycloak cluster. Postman can request access tokens via the authorization
>> code flow in its GUI. In Postman's "Get New Access Token" window, we use
>> these settings:
>> 1. callback url: the same redirect_uri that is pointing to the Keycloak
>> gatekeeper callback endpoint (/oauth/callback endpoint)
>> 2. auth url:
>> https://keycloak.domain/auth/realms/our-realm/protocol/openid-connect/auth
>> 3. access token url:
>>
>> https://keycloak.domain/auth/realms/our-realm/protocol/openid-connect/token
>> 4. client-id: client-id from Keycloak
>> 5. client-secret: client-secret from Keycloak
>> 6. scope: openid
>> 7. Client Authentication: "Send as Basic Auth header"
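>>
>> For reference, the same exchange can be reproduced outside Postman with
>> curl (a sketch; CODE stands for a fresh, single-use authorization code
>> copied from the callback redirect, and the redirect_uri is a placeholder):
>>
>> curl -u 'client-id:client-secret' \
>>   -d 'grant_type=authorization_code' \
>>   -d 'code=CODE' \
>>   -d 'redirect_uri=https://app.domain/oauth/callback' \
>>   https://keycloak.domain/auth/realms/our-realm/protocol/openid-connect/token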
>>
>> When clicking the "Request Token" button in Postman, we receive the error
>> "invalid_code" in roughly 9 out of 10 tries. Basically, if we spam the
>> button, it sometimes works, but most of the time it does not. From another
>> laptop, which is connected via VPN and thus has higher latency, the
>> requests work just fine.
>>
>> I am thinking about the following:
>> Is it possible that the initial request is sent to keycloak-0, returned to
>> the client (Postman), and then immediately sent back to the load-balancer
>> URL to trade in the code for tokens... and then hits another instance due
>> to load balancing, for example keycloak-1, which has no information about
>> the authentication process that was initiated on keycloak-0? The
>> invalid_code error is returned after just 4 milliseconds, which is rather
>> fast. Maybe the cluster is not synchronizing properly in time? Any idea on
>> how to fix this?
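>>
>> If load balancing is indeed the culprit, session affinity on the
>> AUTH_SESSION_ID cookie that Keycloak sets during the login flow might be
>> worth a try. A minimal HAProxy sketch, with placeholder pod addresses:
>>
>> backend keycloak
>>     balance roundrobin
>>     # prefix mode: HAProxy prepends the chosen server id to the existing
>>     # cookie and strips it again before forwarding to the backend
>>     cookie AUTH_SESSION_ID prefix
>>     server keycloak-0 10.0.0.1:8080 check cookie keycloak-0
>>     server keycloak-1 10.0.0.2:8080 check cookie keycloak-1
>>     server keycloak-2 10.0.0.3:8080 check cookie keycloak-2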
>>
>> Thanks
>> Jody
>> _______________________________________________
>> keycloak-user mailing list
>> keycloak-user at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/keycloak-user
>>
> --
> 张立强
>
> R&D Lead, FIT2CLOUD
>
> http://fit2cloud.com | Mobile +86-18701062478
>
> Room 715, Tower A, Fortune Plaza, No. 7 Dongsanhuan Middle Road, Chaoyang
> District, Beijing
>
> China's leading cloud management platform and service provider, helping
> enterprises manage, use, and control the cloud
>

