In our deployments, we need to be able to stop and restart individual nodes in our
standalone-ha cluster at any time without losing user sessions. We deploy to AWS Elastic
Beanstalk: we join the "new" nodes to the existing JGroups cluster of "old" nodes, switch
CNAMEs, then shut down the "old" nodes.

The default suggested model of using <distributed-cache> for the sessions and
clientSessions caches fails in this scenario, because sessions stored only in memory on
the "old" nodes are lost. We have therefore switched to <replicated-cache> for most of
the caches. Does this make sense? Are most people who use Keycloak running more than 10
nodes per cluster, such that a replicated cache doesn't make sense for performance
reasons? If not, can the example standalone-ha.xml be adjusted to use replicated-cache?

When items are placed into the caches, they do not appear to be put() with a lifespan or
expiry time. As a result, we have run into out-of-memory issues because the caches grow
without bound. We've added expiration times in the XML config, sized to the longest
session timeouts across all realms. This is not ideal: it would be better for each put()
to carry the currently active session timeouts of the realm from which the object
originates.

We cannot evict items from the sessions and clientSessions caches, as they are not
persisted to disk. The offline variants are persisted to disk, so evicting those seems
OK. Should the offline caches then be an invalidation-cache rather than a
replicated-cache?
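
To make the per-entry expiry idea concrete, here is a rough sketch of how the lifespan and max-idle for a single session entry might be derived from its realm's timeouts. RealmTimeouts and SessionExpiry are hypothetical names made up for illustration and are not Keycloak classes. The only real API referenced is the Infinispan put() overload that takes a lifespan and a max-idle time, and it appears only in a comment.

```java
/** Hypothetical holder for a realm's session timeouts, in seconds. */
final class RealmTimeouts {
    final long ssoIdleSeconds;
    final long ssoMaxLifespanSeconds;

    RealmTimeouts(long ssoIdleSeconds, long ssoMaxLifespanSeconds) {
        this.ssoIdleSeconds = ssoIdleSeconds;
        this.ssoMaxLifespanSeconds = ssoMaxLifespanSeconds;
    }
}

/** Hypothetical helper: derives per-entry expiry values from the realm. */
final class SessionExpiry {
    /** Seconds left until the session reaches the realm's max lifespan (never negative). */
    static long remainingLifespanSeconds(RealmTimeouts realm, long startedEpochSec, long nowEpochSec) {
        long expiresAt = startedEpochSec + realm.ssoMaxLifespanSeconds;
        return Math.max(0L, expiresAt - nowEpochSec);
    }

    /** The idle timeout never needs to exceed what is left of the lifespan. */
    static long maxIdleSeconds(RealmTimeouts realm, long remainingLifespanSeconds) {
        return Math.min(realm.ssoIdleSeconds, remainingLifespanSeconds);
    }
}

// With an Infinispan cache, the two values could then be passed per entry, e.g.:
//   cache.put(id, session, lifespan, TimeUnit.SECONDS, maxIdle, TimeUnit.SECONDS);
// A session with 0 seconds remaining has already expired and should not be stored.
```

With something like this, the XML expiration settings would only act as a safety net rather than the primary bound on cache size.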
XML snippet:
<subsystem xmlns="urn:jboss:domain:infinispan:4.0">
    <cache-container name="keycloak" jndi-name="infinispan/Keycloak" statistics-enabled="true">
        <replicated-cache name="sessions" mode="SYNC" statistics-enabled="true">
            <!-- NOT persisted to the database, so cannot evict within their valid lifetime -->
            <!-- max-idle = refresh_token:1 day -->
            <expiration max-idle="86700000" interval="300000"/>
        </replicated-cache>
        <replicated-cache name="clientSessions" mode="SYNC" statistics-enabled="true">
            <!-- NOT persisted to the database, so cannot evict within their valid lifetime -->
            <!-- max-idle = refresh_token:1 day -->
            <expiration max-idle="86700000" interval="300000"/>
        </replicated-cache>
        <replicated-cache name="authenticationSessions" mode="SYNC" statistics-enabled="true">
            <!-- NOT persisted to the database, so cannot evict within their valid lifetime -->
            <!-- max-idle = login timeout:1 day -->
            <expiration max-idle="86700000" interval="300000"/>
        </replicated-cache>
        <replicated-cache name="offlineSessions" mode="SYNC" statistics-enabled="true">
            <!-- max-idle = refresh_token:30 days -->
            <eviction max-entries="10000" strategy="LRU"/>
            <expiration max-idle="2592000000" interval="300000"/>
        </replicated-cache>
        <replicated-cache name="offlineClientSessions" mode="SYNC" statistics-enabled="true">
            <eviction max-entries="10000" strategy="LRU"/>
            <expiration max-idle="2592000000" interval="300000"/>
        </replicated-cache>
        <!-- snippet ends here; any other caches are not shown -->
    </cache-container>
</subsystem>
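
For reference, the millisecond values in the snippet can be reproduced with a small calculation. This is only a sketch of the "longest timeout across all realms" sizing described earlier. CacheExpiryMillis and its slack parameter are hypothetical. Note that 86700000 is one day (86400000 ms) plus 300000 ms, which matches the configured interval, while 2592000000 is exactly 30 days. Whether that extra five minutes is deliberate padding is an assumption.

```java
import java.time.Duration;
import java.util.List;

/** Hypothetical helper for sizing the XML max-idle attribute. */
final class CacheExpiryMillis {
    /**
     * Returns the longest timeout among all realms plus an optional slack,
     * in milliseconds, suitable for an <expiration max-idle="..."/> value.
     */
    static long maxIdleMillis(List<Duration> realmTimeouts, Duration slack) {
        Duration longest = Duration.ZERO;
        for (Duration d : realmTimeouts) {
            if (d.compareTo(longest) > 0) {
                longest = d;
            }
        }
        return longest.plus(slack).toMillis();
    }
}
```

For example, realms with timeouts of 8 hours and 1 day, with 5 minutes of slack, give 86700000; a single 30-day offline timeout with no slack gives 2592000000.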
Thanks,
Matt
--
Matt Domsch
Executive Director & Senior Distinguished Engineer
Quest | Engineering
Matt.Domsch@quest.com
Mobile 512.981.6486