[keycloak-dev] Clustering configuration

Sebastian Laskawiec slaskawi at redhat.com
Wed Sep 12 04:00:05 EDT 2018


Hey guys,

During our weekly sync meeting, Stian asked me to look into different
options for clustering in the Keycloak server. This topic has become quite
hot in the context of our Docker image (see the proposed community
contributions [1][2][3]). Since we are based on WF 13, which uses JGroups
4.0.11 and has KUBE_PING in its modules, we have a couple of options for
how to do it.

Before discussing different implementations, let me quickly go through the
requirements:
- We need a configuration stack that works for both on-prem and cloud
deployments, with OpenShift as our primary target.
- The configuration should be automatic where possible, e.g. if we detect
that Keycloak is running in a container, we should pick the proper
discovery protocol.
- There needs to be a way to override the discovery protocol manually.

With those requirements in mind, we have a couple of implementation options
on the table:
1. Add more stacks to the configuration, e.g. openshift, azure or gcp, and
select one with the standard `-Djboss.default.jgroups.stack=<stack>`
configuration switch (see the sketch after this list).
2. Provide more standalone-*.xml configuration files, e.g.
standalone-ha.xml (for on-prem) or standalone-cloud.xml.
3. Add protocols dynamically using the CLI, similar to what we did for the
Data Grid Cache Service [4].
4. Use the MULTI_PING protocol [5][6], which combines multiple discovery
protocols on the same stack. This would include MPING (for multicasting),
KUBE_PING (if we can access the Kubernetes API) and DNS_PING (if the Pods
are governed by a Service).
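
Just to make option #1 more concrete, here is a rough sketch of what it
could look like in standalone-ha.xml. The `openshift` stack name is made up
for illustration, and the property expression on the channel is an
assumption (if the shipped XML hard-codes the stack, the expression would
need to be added):

    <!-- jgroups subsystem in standalone-ha.xml (sketch, details elided) -->
    <channels default="ee">
        <!-- resolve the stack from -Djboss.default.jgroups.stack,
             defaulting to udp -->
        <channel name="ee" stack="${jboss.default.jgroups.stack:udp}"/>
    </channels>
    <stacks>
        <!-- the existing udp and tcp stacks stay as they are -->
        <stack name="openshift">
            <transport type="TCP" socket-binding="jgroups-tcp"/>
            <!-- discovery through the Kubernetes API (KUBE_PING is already
                 in the WF 13 modules) -->
            <protocol type="kubernetes.KUBE_PING"/>
            <!-- the rest of the protocols as in the regular tcp stack -->
        </stack>
    </stacks>

Starting the server with `bin/standalone.sh -c standalone-ha.xml
-Djboss.default.jgroups.stack=openshift` would then pick the new stack.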

Options #1 and #2 are somewhat similar to what we did for Infinispan [7].
They work quite well, but the configuration grows quickly and most of the
protocols (apart from discovery) get duplicated. On the other hand, having
separate configuration pieces for each use case is very flexible. Bearing
in mind that AWS cuts idle TCP connections, using FD_SOCK there might lead
to false suspicions, whereas on GCP, for instance, FD_SOCK works quite
nicely.

The CLI option (#3) is also very flexible and should probably be
implemented only in our Docker image. This follows the convention we
already started with separate CLI files for different databases [8].

Option #4 is brand new (implemented in JGroups 4.0.8; we have 4.0.11 as you
probably recall). It has been specifically designed for this kind of use
case, where we want to gather discovery data from multiple places. Going
this way, we should end up with just two stacks in the standalone-ha.xml
file - UDP and TCP (see the sketch below).
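
Here is a rough sketch of how the single TCP stack could look with
MULTI_PING. Again, this is just an assumption based on the manual [5] - the
protocol ordering and the dns_query value below are illustrative and would
need to be verified:

    <stack name="tcp">
        <transport type="TCP" socket-binding="jgroups-tcp"/>
        <!-- MULTI_PING runs the discovery protocols listed after it and
             merges their responses [5] -->
        <protocol type="MULTI_PING"/>
        <!-- multicast discovery for on-prem deployments -->
        <protocol type="MPING" socket-binding="jgroups-mping"/>
        <!-- only useful when the Kubernetes API is reachable -->
        <protocol type="kubernetes.KUBE_PING"/>
        <!-- only useful when the Pods sit behind a (headless) Service;
             the query below is a made-up example -->
        <protocol type="dns.DNS_PING">
            <property name="dns_query">keycloak.myproject.svc.cluster.local</property>
        </protocol>
        <!-- FD_SOCK, pbcast.NAKACK2, pbcast.GMS and friends unchanged -->
    </stack>

Each discovery protocol simply contributes nothing when its environment is
not available, so the same stack should cover on-prem multicast, the
Kubernetes API and DNS-based setups at once.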

I honestly have to say that my heart goes to option #4. However, as far as
I know, it hasn't been battle-tested and we might get some surprises. The
other options are not as elegant as option #4, but they are already used in
other projects. They are much safer, but they will add some maintenance
burden on our shoulders.

What would you suggest, guys? What do you think about all this? @Rado,
@Paul, @Tristan - do you have any plans regarding this piece in WildFly or
Infinispan?

Thanks,
Sebastian

[1] https://github.com/jboss-dockerfiles/keycloak/pull/96
[2] https://github.com/jboss-dockerfiles/keycloak/pull/100
[3] https://github.com/jboss-dockerfiles/keycloak/pull/116
[4] https://github.com/jboss-container-images/datagrid-7-image/blob/datagrid-services-dev/modules/os-datagrid-online-services-configuration/src/main/bash/profiles/caching-service.cli#L37
[5] http://www.jgroups.org/manual4/index.html#_multi_ping
[6] https://issues.jboss.org/browse/JGRP-2224
[7] https://github.com/infinispan/infinispan/tree/master/server/integration/jgroups/src/main/resources/subsystem-templates
[8] https://github.com/jboss-dockerfiles/keycloak/tree/master/server/tools/cli/databases

