On Wed, 12 Sep 2018 at 10:09, Sebastian Laskawiec <slaskawi(a)redhat.com>
wrote:
Hey guys,
During our weekly sync meeting, Stian asked me to look into different
options for clustering in the Keycloak server. This topic has become quite
hot in the context of our Docker image (see the proposed community
contributions [1][2][3]). Since we are based on WF 13, which uses JGroups
4.0.11 and has KUBE_PING in its modules, we have a couple of options for how
to do it.
Before discussing different implementations, let me quickly go through the
requirements:
- We need a configuration stack that works for on-prem and cloud
deployments with OpenShift as our primary target.
OpenShift is not our primary target in community. We should have support
for standalone Docker, Kubernetes and OpenShift. For GCP and EC2 we should
have something that works, but will probably not be able to test this
ourselves.
- The configuration should be automatic (where possible), e.g. if we
discover that Keycloak is running in a container, we should use the proper
discovery protocol.
- There needs to be a way to override the discovery protocol manually.
With those requirements in mind, we have a couple of implementation options
on the table:
1. Add more stacks to the configuration, e.g. openshift, azure or gcp. Then
we use the standard `-Djboss.default.jgroups.stack=<stack>` configuration
switch.
2. Provide more standalone-*.xml configuration files, e.g.
standalone-ha.xml (for on-prem) or standalone-cloud.xml.
3. Add protocols dynamically using CLI. A similar approach to what we did
for the Data Grid Cache Service [4].
4. Use MULTI_PING protocols [5][6], with multiple discovery protocols on
the same stack. This will include MPING (for multicasting), KUBE_PING (if
we can access Kubernetes API), DNS_PING (if Pods are governed by a
Service).
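To make option #4 a bit more concrete, a combined stack might look roughly
like the sketch below. This is plain JGroups XML rather than the WildFly
subsystem syntax, and it is only an illustration: the DNS query value is a
made-up example, and the exact placement of MULTI_PING relative to the
individual discovery protocols should be verified against the JGroups 4.0.x
docs [5][6] before relying on it.

```xml
<config xmlns="urn:org:jgroups">
    <TCP bind_port="7600"/>

    <!-- Multiple discovery protocols on one stack; MULTI_PING fans the
         discovery request out to all of them, so whichever ones apply in
         the current environment contribute responses. Whether MULTI_PING
         goes above or below the discovery protocols is my assumption. -->
    <MPING/>                                <!-- multicast, on-prem -->
    <kubernetes.KUBE_PING/>                 <!-- Kubernetes API access -->
    <dns.DNS_PING
        dns_query="keycloak-headless.example.svc.cluster.local"/>
                                            <!-- Pods governed by a Service;
                                                 query name is hypothetical -->
    <MULTI_PING/>

    <MERGE3/>
    <FD_SOCK/>
    <FD_ALL/>
    <VERIFY_SUSPECT/>
    <pbcast.NAKACK2/>
    <UNICAST3/>
    <pbcast.STABLE/>
    <pbcast.GMS/>
    <MFC/>
    <FRAG2/>
</config>
```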
All config should be done through CLI commands. Did you look at how we do
this for DBs? The standalone-*.xml file is modified at runtime to switch
between selected DBs.
To keep things consistent, your options are:
* A single CLI script that configures something that works in all scenarios
* A single CLI script that configures alternative stacks that are enabled
depending on some environment variables
* The same approach as done for DBs, where we have a different directory for
each DB with a DB-specific CLI script. That script is then executed at
startup time to re-configure standalone.xml
Anything else will be a maintenance headache.
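For the record, the second option could be sketched as a jboss-cli script
along these lines. Everything here is illustrative: the environment variable
name, the stack name and the protocol chosen are made up for the example,
and the details would need to be checked against the WildFly 13 jgroups
subsystem resource model.

```
embed-server --server-config=standalone-ha.xml

# Hypothetical: swap the discovery protocol on the "tcp" stack when the
# JGROUPS_DISCOVERY_PROTOCOL env variable (default MPING) asks for DNS_PING
if (result == "dns.DNS_PING") of :resolve-expression(expression=${env.JGROUPS_DISCOVERY_PROTOCOL:MPING})
    /subsystem=jgroups/stack=tcp/protocol=MPING:remove
    /subsystem=jgroups/stack=tcp/protocol=dns.DNS_PING:add(add-index=0)
end-if

stop-embedded-server
```

This mirrors the per-DB CLI approach: the script is applied once at startup
to rewrite standalone-ha.xml before the server boots.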
Options #1 and #2 are somewhat similar to what we did for Infinispan [7]. It
works quite well, but the configuration grows quickly and most of the
protocols (apart from discovery) are duplicated. On the other hand, having
separate configuration pieces for each use case is very flexible. Bearing in
mind that AWS cuts TCP connections, using FD_SOCK might lead to false
suspicions; on GCP, for instance, FD_SOCK works quite nicely. The CLI option
(#3) is also very flexible and should probably be implemented only in our
Docker image. This somewhat follows the convention we already started with
different CLI files for different DBs [8]. Option #4 is brand new
(implemented in JGroups 4.0.8; we have 4.0.11 as you probably recall). It
has been specifically designed for this kind of use case, where we want to
gather discovery data from multiple places. With this approach, we should
end up with just two stacks in the standalone-ha.xml file: UDP and TCP.
I honestly have to say that my heart goes out to option #4. However, as far
as I know it hasn't been battle-tested and we might get some surprises. The
other options are not as elegant as option #4, but they are already used in
other projects. They are much safer options, though they will place some
maintenance burden on our shoulders.
I do like the idea of option #4, but I'm worried about it not working and
causing strange behaviour.