We've recently setup Keycloak 7 in a clustered environment using multicast.
(standalone-ha)
Everything works great most of the time. But randomly, upon restarting Keycloak on one
server, keycloak will not be able to find the other instance.
We've called the servers keycloak01 and keycloak02. When the problem occurs, the
flow goes something like this:
Keycloak01 running
Keycloak02 running
Restart keycloak02
Keycloak2 reports: no members discovered after 3003 ms: creating cluster as first member
Nothing in keycloak01 logs suggests it knows about keycloak02 anymore
At this point keycloak02 can never talk to keycloak01 until keycloak01 is restarted.
After restarting keycloak01, everything is fine again.
Here are the keycloak01 logs when keycloak02 is restarting
<keycloak02 stopping>
2019-09-11 15:09:22,008 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t3)
[Context=loginFailures] ISPN100008: Updating cache members list [keycloak01], topology id
16
2019-09-11 15:09:22,008 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t14)
[Context=actionTokens] ISPN100008: Updating cache members list [keycloak01], topology id
16
2019-09-11 15:09:22,010 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t12)
[Context=offlineClientSessions] ISPN100008: Updating cache members list [keycloak01],
topology id 16
2019-09-11 15:09:22,010 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t11)
[Context=clientSessions] ISPN100008: Updating cache members list [keycloak01], topology id
16
2019-09-11 15:09:22,011 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t13)
[Context=work] ISPN100008: Updating cache members list [keycloak01], topology id 16
2019-09-11 15:09:22,011 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t10)
[Context=authenticationSessions] ISPN100008: Updating cache members list [keycloak01],
topology id 16
2019-09-11 15:09:22,011 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t15)
[Context=offlineSessions] ISPN100008: Updating cache members list [keycloak01], topology
id 16
2019-09-11 15:09:22,012 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t9)
[Context=sessions] ISPN100008: Updating cache members list [keycloak01], topology id 16
2019-09-11 15:09:22,012 INFO [org.infinispan.CLUSTER] (remote-thread--p8-t3)
[Context=client-mappings] ISPN100008: Updating cache members list [keycloak01], topology
id 16
2019-09-11 15:09:22,056 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|6] (1) [keycloak01]
2019-09-11 15:09:22,057 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN100001: Node keycloak02 left the cluster
2019-09-11 15:09:22,057 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|6] (1) [keycloak01]
2019-09-11 15:09:22,057 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN100001: Node keycloak02 left the cluster
2019-09-11 15:09:22,059 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|6] (1) [keycloak01]
2019-09-11 15:09:22,059 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN100001: Node keycloak02 left the cluster
2019-09-11 15:09:22,059 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|6] (1) [keycloak01]
2019-09-11 15:09:22,059 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN100001: Node keycloak02 left the cluster
2019-09-11 15:09:22,060 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|6] (1) [keycloak01]
2019-09-11 15:09:22,060 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN100001: Node keycloak02 left the cluster
<keycloak02 starting>
2019-09-11 15:09:26,679 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|7] (2) [keycloak01,
keycloak02]
2019-09-11 15:09:26,679 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN100000: Node keycloak02 joined the cluster
2019-09-11 15:09:26,679 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|7] (2) [keycloak01,
keycloak02]
2019-09-11 15:09:26,680 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN100000: Node keycloak02 joined the cluster
2019-09-11 15:09:26,680 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|7] (2) [keycloak01,
keycloak02]
2019-09-11 15:09:26,680 INFO [org.infinispan.CLUSTER] (thread-14,ejb,keycloak01)
ISPN100000: Node keycloak02 joined the cluster
On a problematic restart keycloak01 logs are as follows:
<keycloak02 stopping>
2019-09-11 15:09:43,915 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t14)
[Context=offlineClientSessions] ISPN100008: Updating cache members list [keycloak01],
topology id 21
2019-09-11 15:09:43,915 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t15)
[Context=work] ISPN100008: Updating cache members list [keycloak01], topology id 21
2019-09-11 15:09:43,915 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t9)
[Context=loginFailures] ISPN100008: Updating cache members list [keycloak01], topology id
21
2019-09-11 15:09:43,915 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t10)
[Context=offlineSessions] ISPN100008: Updating cache members list [keycloak01], topology
id 21
2019-09-11 15:09:43,916 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t13)
[Context=authenticationSessions] ISPN100008: Updating cache members list [keycloak01],
topology id 21
2019-09-11 15:09:43,916 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t11)
[Context=sessions] ISPN100008: Updating cache members list [keycloak01], topology id 21
2019-09-11 15:09:43,916 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t10)
[Context=actionTokens] ISPN100008: Updating cache members list [keycloak01], topology id
21
2019-09-11 15:09:43,916 INFO [org.infinispan.CLUSTER] (remote-thread--p10-t15)
[Context=clientSessions] ISPN100008: Updating cache members list [keycloak01], topology id
21
2019-09-11 15:10:03,902 INFO [org.infinispan.CLUSTER] (remote-thread--p8-t3)
[Context=client-mappings] ISPN100008: Updating cache members list [keycloak01], topology
id 21
2019-09-11 15:10:03,912 INFO [org.infinispan.CLUSTER] (thread-23,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|8] (1) [keycloak01]
2019-09-11 15:10:03,913 INFO [org.infinispan.CLUSTER] (thread-23,ejb,keycloak01)
ISPN100001: Node keycloak02 left the cluster
2019-09-11 15:10:03,913 INFO [org.infinispan.CLUSTER] (thread-23,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|8] (1) [keycloak01]
2019-09-11 15:10:03,914 INFO [org.infinispan.CLUSTER] (thread-23,ejb,keycloak01)
ISPN100001: Node keycloak02 left the cluster
2019-09-11 15:10:03,915 INFO [org.infinispan.CLUSTER] (thread-23,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|8] (1) [keycloak01]
2019-09-11 15:10:03,915 INFO [org.infinispan.CLUSTER] (thread-23,ejb,keycloak01)
ISPN100001: Node keycloak02 left the cluster
2019-09-11 15:10:03,916 INFO [org.infinispan.CLUSTER] (thread-23,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|8] (1) [keycloak01]
2019-09-11 15:10:03,916 INFO [org.infinispan.CLUSTER] (thread-23,ejb,keycloak01)
ISPN100001: Node keycloak02 left the cluster
2019-09-11 15:10:03,916 INFO [org.infinispan.CLUSTER] (thread-23,ejb,keycloak01)
ISPN000094: Received new cluster view for channel ejb: [keycloak01|8] (1) [keycloak01]
2019-09-11 15:10:03,916 INFO [org.infinispan.CLUSTER] (thread-23,ejb,keycloak01)
ISPN100001: Node keycloak02 left the cluster
<keycloak02 starting>
... no more logs will show up and keycloak02 reports no members
Notice the time between [Context=clientSessions] and [Context=client-mappings] is 20
seconds on this problematic restart vs the successful one.
Anyone have any ideas?