[keycloak-user] Not able to setup Keycloak to fully replicate user sessions in cluster

Marek Posolda mposolda at redhat.com
Thu Jun 15 03:15:12 EDT 2017


Yes, it looks like an issue with the clustered communication in your 
environment. See the Infinispan/JGroups docs and google the "split brain" 
error message you have below. It may help to reconfigure your cluster to 
use a TCP-based JGroups channel instead of multicast, but I am not 100% 
sure...
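
For example, something along these lines in the jgroups subsystem of 
standalone-ha.xml (a rough sketch, not tested in your environment: the 
hostnames and port are assumptions based on your node names, and the 
protocol list should be copied from the tcp stack already present in 
your standalone-ha.xml rather than from this):

<subsystem xmlns="urn:jboss:domain:jgroups:4.0">
    <channels default="ee">
        <!-- switch the channel from the default udp stack to tcp -->
        <channel name="ee" stack="tcp"/>
    </channels>
    <stacks>
        <stack name="tcp">
            <transport type="TCP" socket-binding="jgroups-tcp"/>
            <!-- replace multicast discovery (MPING) with a static host list -->
            <protocol type="TCPPING">
                <property name="initial_hosts">nodeagent15[7600],nodeagent16[7600]</property>
                <property name="port_range">0</property>
            </protocol>
            <protocol type="MERGE3"/>
            <protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
            <protocol type="FD"/>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2"/>
            <protocol type="UNICAST3"/>
            <protocol type="pbcast.STABLE"/>
            <protocol type="pbcast.GMS"/>
            <protocol type="MFC"/>
            <protocol type="FRAG2"/>
        </stack>
    </stacks>
</subsystem>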

Marek


On 14/06/17 06:49, Jyoti Kumar Singh wrote:
> Hi Team,
>
> Is there any recommendation for me to look into?
>
> On Jun 10, 2017 10:47 AM, "Jyoti Kumar Singh" <jyoti.tech90 at gmail.com>
> wrote:
>
>> Hi Stian,
>>
>> Thanks for the reply. I am using the configuration below, taken from the
>> 3.1.0 standalone-ha.xml. I only added owners="2" in the
>> "infinispan/Keycloak" cache container to get two cluster-wide replicas of
>> each cache entry.
>>
>> #standalone-ha.xml:- attached
>>
>> Also, I am using DC/OS as the container platform, which includes Marathon
>> as a load balancer (LB) and two container runtimes (Docker and Mesos), for
>> the deployment in the cloud.
>>
>> I can see the logs below rolling in Node#2 (nodeagent16) once
>> Node#1 (nodeagent15) goes down. But when I bring Node#1 up again, requests
>> are transferred from the LB to Node#1 again, and I do not see any
>> session-cache-related logs rolling in Node#1; hence the user's session is
>> not recognized by Node#1 and he is asked to log in again.
>>
>> Currently I am not sure whether multicasting is broken or the discovery
>> protocol has an issue. Your inputs will help me understand the issue
>> better.
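>>
>> To narrow it down, I plan to test raw multicast between the two nodes
>> with the JGroups multicast test programs, roughly like this (the
>> address/port below are the usual WildFly defaults and may need to be
>> changed to match the jgroups-udp socket binding in my standalone-ha.xml):
>>
>> # on nodeagent15: listen for multicast packets from the group
>> java -cp jgroups-<version>.jar org.jgroups.tests.McastReceiverTest \
>>      -mcast_addr 230.0.0.4 -port 45688
>> # on nodeagent16: send lines typed on stdin to the same group
>> java -cp jgroups-<version>.jar org.jgroups.tests.McastSenderTest \
>>      -mcast_addr 230.0.0.4 -port 45688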
>>
>> #Logs:-
>>
>> 2017-06-10 04:41:56,330 WARN  [org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy]
>> (transport-thread--p16-t3) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000314:
>> Cache authorization lost at least half of the stable members, possible
>> split brain causing data inconsistency. Current members are [nodeagent16],
>> lost members are [nodeagent15], stable members are [nodeagent15,
>> nodeagent16]
>> 2017-06-10 04:41:56,332 WARN  [org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy]
>> (transport-thread--p16-t3) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000314:
>> Cache sessions lost at least half of the stable members, possible split
>> brain causing data inconsistency. Current members are [nodeagent16], lost
>> members are [nodeagent15], stable members are [nodeagent15, nodeagent16]
>> 2017-06-10 04:41:56,333 WARN  [org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy]
>> (transport-thread--p16-t3) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000314:
>> Cache work lost at least half of the stable members, possible split brain
>> causing data inconsistency. Current members are [nodeagent16], lost members
>> are [nodeagent15], stable members are [nodeagent16, nodeagent15]
>> 2017-06-10 04:41:56,334 WARN  [org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy]
>> (transport-thread--p16-t3) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000314:
>> Cache offlineSessions lost at least half of the stable members, possible
>> split brain causing data inconsistency. Current members are [nodeagent16],
>> lost members are [nodeagent15], stable members are [nodeagent15,
>> nodeagent16]
>> 2017-06-10 04:41:56,336 WARN  [org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy]
>> (transport-thread--p16-t3) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000314:
>> Cache loginFailures lost at least half of the stable members, possible
>> split brain causing data inconsistency. Current members are [nodeagent16],
>> lost members are [nodeagent15], stable members are [nodeagent15,
>> nodeagent16]
>> 2017-06-10 04:41:56,509 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport]
>> (Incoming-2,ee,nodeagent16) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000094:
>> Received new cluster view for channel web: [nodeagent16|10] (1)
>> [nodeagent16]
>> 2017-06-10 04:41:56,512 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport]
>> (Incoming-2,ee,nodeagent16) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000094:
>> Received new cluster view for channel ejb: [nodeagent16|10] (1)
>> [nodeagent16]
>> 2017-06-10 04:41:56,513 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport]
>> (Incoming-2,ee,nodeagent16) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000094:
>> Received new cluster view for channel hibernate: [nodeagent16|10] (1)
>> [nodeagent16]
>>
>>
>>
>> On Fri, Jun 9, 2017 at 10:28 AM, Stian Thorgersen <sthorger at redhat.com>
>> wrote:
>>
>>> Your configuration is not correct and seems to be from an older version
>>> of Keycloak. Please take a look at the default standalone-ha.xml from 3.1
>>> for the correct cache configs.
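>>>
>>> For reference, the keycloak cache-container in the 3.1 default looks
>>> roughly like this (a from-memory sketch, so copy the real file from a
>>> fresh 3.1 distribution rather than trusting this; owners="2" added per
>>> your goal of a full replica on each of your two nodes):
>>>
>>> <cache-container name="keycloak" jndi-name="infinispan/Keycloak">
>>>     <transport lock-timeout="60000"/>
>>>     <!-- realm/user metadata: cached locally, invalidated across the
>>>          cluster via the replicated work cache -->
>>>     <local-cache name="realms"/>
>>>     <local-cache name="users"/>
>>>     <local-cache name="keys"/>
>>>     <!-- session-type caches are distributed; owners sets the number of
>>>          replicas, so owners="2" gives a copy on both of your nodes -->
>>>     <distributed-cache name="sessions" mode="SYNC" owners="2"/>
>>>     <distributed-cache name="offlineSessions" mode="SYNC" owners="2"/>
>>>     <distributed-cache name="loginFailures" mode="SYNC" owners="2"/>
>>>     <distributed-cache name="authorization" mode="SYNC" owners="2"/>
>>>     <replicated-cache name="work" mode="SYNC"/>
>>> </cache-container>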
>>>
>>> You also need to get cluster communication working properly. Make sure
>>> the nodes see each other: when you start a new node, something should
>>> appear in the logs of the other nodes. In a cloud environment this can be
>>> tricky (you haven't said which one you are using), as multicast usually
>>> doesn't work and you need to use a different discovery protocol.
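>>>
>>> For example, on platforms without multicast people often swap the
>>> discovery protocol in the tcp stack for JDBC_PING, which registers nodes
>>> in a shared database (a hedged sketch; the datasource JNDI name below is
>>> the default Keycloak one and may differ in your setup):
>>>
>>> <stack name="tcp">
>>>     <transport type="TCP" socket-binding="jgroups-tcp"/>
>>>     <!-- each node writes its address into a table in the shared DB
>>>          and reads the other nodes' addresses from it on startup -->
>>>     <protocol type="JDBC_PING">
>>>         <property name="datasource_jndi_name">java:jboss/datasources/KeycloakDS</property>
>>>     </protocol>
>>>     <!-- ...rest of the protocols unchanged from the default tcp stack... -->
>>> </stack>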
>>>
>>> On 7 June 2017 at 16:17, Jyoti Kumar Singh <jyoti.tech90 at gmail.com>
>>> wrote:
>>>
>>>> Hi Team,
>>>>
>>>> We are setting up keycloak:3.1.0.Final in cluster mode for HA with full
>>>> user session replication in a cloud system, i.e. when one node goes down
>>>> the user stays logged in on the other node.
>>>>
>>>> I have set up the cluster using standalone-ha.xml, with the Infinispan
>>>> cache configured as below:
>>>>
>>>> <cache-container name="keycloak" jndi-name="infinispan/Keycloak">
>>>>     <transport lock-timeout="60000"/>
>>>>     <invalidation-cache name="realms" mode="SYNC"/>
>>>>     <invalidation-cache name="users" mode="SYNC"/>
>>>>     <distributed-cache name="sessions" mode="SYNC" owners="2"/>
>>>>     <distributed-cache name="loginFailures" mode="SYNC" owners="2"/>
>>>> </cache-container>
>>>>
>>>> Everything works fine except for the use case below:
>>>>
>>>> 1. Node 1 and Node 2 are both up and the user logs in - the user
>>>> session is generated by Node 1
>>>> 2. Node 1 is stopped and the user session is replicated to Node 2 -
>>>> the user is still able to use the Keycloak console
>>>> 3. Node 1 is up again and the request is transferred from the LB to
>>>> Node 1 - the user is asked to log in again because the session cache
>>>> is not replicated to Node 1 immediately once it is up
>>>>
>>>> I saw one option, adding *start="EAGER"* to the cache-container, to
>>>> fix this, but it looks like it is no longer supported with the latest
>>>> version of WildFly. Do we have any other way to fix this issue?
>>>>
>>>>
>>>> --
>>>>
>>>> *With Regards, Jyoti Kumar Singh*
>>>> _______________________________________________
>>>> keycloak-user mailing list
>>>> keycloak-user at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/keycloak-user
>>>>
>>>
>>
>> --
>>
>> *With Regards, Jyoti Kumar Singh*
>>
> _______________________________________________
> keycloak-user mailing list
> keycloak-user at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/keycloak-user



