[keycloak-user] Cross-DC Replication not working for `sessions` cache

Hayden Fuss hfuss at bandwidth.com
Tue Aug 21 09:12:27 EDT 2018


Hey guys,

Thank you for the updates! We'll stick to Infinispan 8.2.8 so that there are
no surprises.

We upgraded JGroups 3 and added KUBE_PING to Infinispan 8.2.8, as well as
to Keycloak, and so we've gotten cross-DC working with two Keycloaks and
two ISPNs in each DC.

In our first round of HA testing, Keycloak's OIDC endpoints have been
fairly resilient when unable to connect to LDAP, MariaDB, and the whole ISPN
cluster (we just destroy the OpenShift Services and wait 5 minutes while
testing the endpoints). However, we've noticed that if we delete an ISPN pod
forcefully, we'll experience some timeouts with the
/token?username&password grant as the *new* pod comes up.

We believe it's due to our liveness/readiness probes being too optimistic,
since ISPN 8.2.8 does not have a health check like ISPN 9.X. I've been
unable to find a prescribed way of health checking ISPN 8.2.8.

For now I'm waiting for the 9990 socket to open as the liveness probe, and
reusing the is_running.sh from ISPN 9.X for the readiness probe (attached).
With these, ISPN pods are considered "Ready" to receive traffic from the
OpenShift Service much sooner than they were when we used the probes that
came with ISPN 9.X. Aside from setting the delay on the probes to be longer,
do either of you know a more accurate way to health check ISPN 8.2.8?
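
To illustrate the liveness check described above (waiting for the 9990
socket to open), here is a minimal sketch in Python. It is not the attached
is_running.sh, and the function names are hypothetical. Note that an open
port only shows that the server process is listening, not that the node has
joined the cluster, which may be why such a probe is too optimistic.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_for_port(host: str, port: int, deadline_s: float,
                  interval_s: float = 0.5) -> bool:
    """Poll until the port opens or deadline_s seconds have elapsed."""
    end = time.monotonic() + deadline_s
    while True:
        if port_is_open(host, port):
            return True
        if time.monotonic() >= end:
            return False
        time.sleep(interval_s)
```

A probe script could call port_is_open("127.0.0.1", 9990) and exit with 0 or
1 accordingly. The host and the mapping to exit codes are assumptions about
how the probe would be wired up.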

Thanks again for the time and info. We greatly appreciate it, as it's been
very helpful!

Best,
Hayden

On Tue, Aug 21, 2018 at 5:26 AM Marek Posolda <mposolda at redhat.com> wrote:

> On 11/08/18 14:26, Sebastian Laskawiec wrote:
>
>
>
> On Fri, Aug 10, 2018, 21:59 Hayden Fuss <hfuss at bandwidth.com> wrote:
>
>> Hello Sebastian and Marek,
>>
>> Thank you very much for suggestions. We had confirmed replication across
>> the ISPN clusters was working with the CLI, so we tried attaching the
>> remote debugger but didn't find anything useful to tell us why Keycloak
>> couldn't remotely store the sessions in the ISPN cluster.
>>
>
> Thanks for letting us know.
>
>
>> Based on what Marek described, we decided to downgrade our ISPN cluster
>> to 8.2.8 rather than use 9.3.1 and incorporate the demo code. It was our
>> understanding that demo code would provide an SPI that enabled the ISPN
>> cluster for persistent user storage (but not realms, clients, keys) which
>> is not desirable for us as of now.
>>
>
> Hmmm that's pretty interesting. For the Summit demo we used a fresh master
> build. So ISPN 9.x should work without any problems. Perhaps Marek can shed
> some light on this issue.
>
> The current Keycloak master supports cross-dc integration with infinispan
> server 8.2.8.Final and JDG 7.1. That's what we are testing and what is
> officially described as recommended infinispan-server version in our
> documentation:
> https://www.keycloak.org/docs/latest/server_installation/index.html#crossdc-mode
>
> In the recent PR to upgrade Keycloak to WildFly 13, there will be the
> upgrade to JDG 7.2 and infinispan server to 9.2.4.Final (this is the same
> as the infinispan version in WildFly 13).
>
> The summit demo used the infinispan server 9.3 AFAIR, but this required
> some updates in the Keycloak code, which was done by overriding default
> userSessions to the "updated-infinispan" provider. The code of this
> updated-infinispan is in the rh-sso project sources:
>
> https://github.com/rhdemo/rh-sso/blob/master/standalone-openshift-cfg/configuration/standalone-openshift-jdg.xml#L676-L681
>
> Even with this overridden provider, I've tested just the Keycloak parts,
> which were needed for the demo itself. I did not try to run our cross-dc
> automated tests. So no guarantee that everything works as expected.
>
> In other words, if you have a choice for the infinispan-server version and
> you don't need infinispan-server 9.X, it's recommended to stay with the
> infinispan-server 8.2.8.
>
> Marek
>
>
> BTW, do you have a demo pushed into some repo, so that we could check it
> out?
>
>
>> Downgrading to 8.2.8 (had to create our own image
>> https://github.com/brix4dayz/infinispan/tree/8.2.x) fixed our sessions
>> replication issue, the only thing is KUBE_PING/DNS_PING isn't available
>> with the JGroups version that comes with 8.2.8. Based on what I'm seeing
>> from this PR https://github.com/jboss-dockerfiles/keycloak/pull/96/files
>> it's possible to add a newer version of JGroups to Keycloak so I'll attempt
>> to do that for ISPN so we can have local clustering for ISPN and Keycloak
>> in OpenShift.
>>
>
> Kube ping has basically two versions, 1.x which requires JGroups 4 and
> 0.9.x, which works with JGroups 3 and 4. Let me know if you hit any
> problems incorporating kube ping into your project. I might be able to help
> you.
>
>
>> If there's a better way to go about the JGroups version issue let us
>> know. Thanks again!
>>
>
> TBH I'm really interested why keycloak doesn't store sessions in ISPN. In
> my opinion, we should find out how to fix this problem and stay with ISPN
> 9. I would recommend downgrading ISPN as the last resort approach.
>
>
>> Best,
>> Hayd
>>
>> On Thu, Aug 9, 2018 at 3:27 AM Marek Posolda <mposolda at redhat.com> wrote:
>>
>>> Hi,
>>>
>>> I didn't check everything, but one thing I noted is, that in your
>>> keycloak-standalone-ha.xml, you don't have "alternative" providers
>>> configured.
>>>
>>> For Keycloak to work with the infinispan 9.2.X server or newer, it was
>>> needed to configure providers like this:
>>>
>>> https://github.com/rhdemo/rh-sso/blob/master/standalone-openshift-cfg/configuration/standalone-openshift-jdg.xml#L676-L681
>>> .
>>>
>>> There is also a need to add the userStorage to your realm, which can be
>>> done through admin console or by importing the realm. See:
>>> https://github.com/rhdemo/rh-sso/blob/master/realm-summit.json#L1051
>>>
>>> Marek
>>>
>>>
>>> On 08/08/18 15:07, Sebastian Laskawiec wrote:
>>> > On Tue, Aug 7, 2018 at 3:28 PM Hayden Fuss <hfuss at bandwidth.com> wrote:
>>> >
>>> >> Hello,
>>> >>
>>> >> We are attempting to run Keycloak on two OpenShift clusters using remote
>>> >> ISPNs and a single MariaDB instance. We're hacking together the Keycloak on
>>> >> OpenShift blog post, the JDG-as-a-service demo from Summit, the RH SSO demo
>>> >> from Summit, and following the Keycloak/RH SSO basic setup guide to Cross-DC
>>> >> replication. The hope is to do an initial evaluation of Keycloak's
>>> >> availability.
>>> >>
>>> >> We were able to create a new user on master (site1), disable the user on
>>> >> master2 (site2), and see the user was disabled on master. So ISPN
>>> >> replication seems to be working because the work cache was replicated to
>>> >> invalidate the local caches. However, the sessions cache does not seem to
>>> >> be replicated because when logged in as the same user on the two different
>>> >> Keycloaks (in Incognito mode) there is only one active session shown on
>>> >> both UIs and the timestamp/IP/etc is different for the listed session.
>>> >>
>>> > So at this point the Infinispan cluster within a single DC works correctly
>>> > [1] (the one that is formed by KUBE_PING). The Cross-DC cluster (also known
>>> > as the Global Cluster) also works correctly [2]. Users cache replicates
>>> > fine but sessions don't.
>>> >
>>> > If I understood everything correctly, there might be two issues there.
>>> >
>>> > The first one is Infinispan misconfiguration (I briefly looked through the
>>> > configuration and cannot spot any mistake, but there might be some typo or
>>> > anything like that). That one is easy to verify: just put an entry on
>>> > one node (e.g. using REST [3]) and see if it's available on the other one
>>> > (again, using REST for example [4]).
>>> >
>>> > If this test works fine, you can check if Keycloak forwards traffic to the
>>> > Infinispan cluster. The easiest way is to set a breakpoint somewhere in
>>> > org.keycloak.models.sessions.infinispan.changes.sessions.LastSessionRefreshChecker#shouldSaveClientSessionToRemoteCache
>>> > and
>>> > org.keycloak.models.sessions.infinispan.changes.sessions.LastSessionRefreshChecker#shouldSaveUserSessionToRemoteCache.
>>> >
>>> > [1] can be verified by calling `oc logs infinispan-app | grep view`
>>> > [2] can be verified by calling `oc logs infinispan-app | grep "x-site"`
>>> > [3] curl -d test ISPN_IP:8080/rest/sessions/test
>>> > [4] curl ISPN_IP2:8080/rest/sessions/test
>>> >
>>> >
>>> >> We are using the latest, stable Keycloak image, version 4.1.0.Final, and
>>> >> the latest, stable Infinispan image to act as our data grid, version
>>> >> 9.3.1.Final, which we know differs from the 8.2.8 version Keycloak uses for
>>> >> its local caches.
>>> >>
>>> >> We were trying one Keycloak node and two ISPN nodes in each cluster, but
>>> >> for simplicity we've attached logs where we only ran one Keycloak and one
>>> >> ISPN in each cluster.
>>> >> We were connecting to the two different Keycloaks via two different
>>> >> OpenShift Routes without a load balancer to fake sticky sessions for now.
>>> >> Keycloak connects to ISPN via a "HotRod" Service. ISPN connects to other
>>> >> nodes within the same cluster via KUBE_PING, and discovers the other
>>> >> cluster via TCPPING hitting a particular OpenShift app node from that
>>> >> cluster that exposes the "discovery" Service with a NodePort. The Keycloaks
>>> >> share the single MariaDB through a NodePort Service in one of the clusters
>>> >> as well.
>>> >>
>>> >> The logs didn't seem to contain any of the messages in the troubleshooting
>>> >> guide. We had trouble using JMX to check the ISPNs because they were
>>> >> running in containers. We've been using the CLI tool and the Infinispan
>>> >> management console to try to troubleshoot, but any key we pulled from the
>>> >> logs that we thought was a session ID was not in the caches, and we could
>>> >> not find a way to simply list all keys in the caches.
>>> >>
>>> >> Below is a viewable link to a zip containing logs from the scenario
>>> >> described in the second paragraph, and our config files.
>>> >>
>>> >>
>>> >>
>>> >> https://drive.google.com/open?id=0B_OCdNCEtoCYOU12T3dEUFplS193VFNFbEFYclB4Tm5WR0o4
>>> >>
>>> >> Thanks for your time and help!
>>> >>
>>> >> Best,
>>> >> Hayden
>>> >> _______________________________________________
>>> >> keycloak-user mailing list
>>> >> keycloak-user at lists.jboss.org
>>> >> https://lists.jboss.org/mailman/listinfo/keycloak-user
>>> >>
>>>
>>>
>>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: is_running.sh
Type: text/x-sh
Size: 218 bytes
Desc: not available
Url : http://lists.jboss.org/pipermail/keycloak-user/attachments/20180821/f4dcecc7/attachment-0001.bin 

