[keycloak-user] Not able to set up Keycloak to fully replicate user sessions in a cluster

Jyoti Kumar Singh jyoti.tech90 at gmail.com
Thu Jun 15 03:44:57 EDT 2017


Hi Marek,

Thanks for the reply. I will try the options you suggested and will get
back to you if I have any questions.


On Thu, Jun 15, 2017 at 12:45 PM, Marek Posolda <mposolda at redhat.com> wrote:

> Yes, it looks like an issue with the clustered communication in your
> environment. See the Infinispan/JGroups docs and google the "split brain"
> error message you have below. I guess it may help to reconfigure your
> cluster to use a TCP-based JGroups channel instead of multicast, but I'm
> not 100% sure...
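>
> For illustration only (this assumes the default WildFly 10 jgroups
> subsystem that ships with Keycloak 3.1; the schema version, socket-binding
> names and the host list below are placeholders to adapt), the change would
> roughly be to point the "ee" channel at the existing "tcp" stack and swap
> the multicast discovery protocol for TCPPING with a static member list:
>
> <channels default="ee">
>     <channel name="ee" stack="tcp"/>
> </channels>
> <stacks>
>     <stack name="tcp">
>         <transport type="TCP" socket-binding="jgroups-tcp"/>
>         <!-- static list of cluster members instead of multicast discovery -->
>         <protocol type="TCPPING">
>             <property name="initial_hosts">nodeagent15[7600],nodeagent16[7600]</property>
>             <property name="port_range">0</property>
>         </protocol>
>         <!-- keep the remaining protocols (MERGE3, FD_SOCK, ...) from the default tcp stack -->
>     </stack>
> </stacks>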
>
> Marek
>
>
>
> On 14/06/17 06:49, Jyoti Kumar Singh wrote:
>
>> Hi Team,
>>
>> Is there any recommendation for me to look into?
>>
>> On Jun 10, 2017 10:47 AM, "Jyoti Kumar Singh" <jyoti.tech90 at gmail.com>
>> wrote:
>>
>> Hi Stian,
>>>
>>> Thanks for the reply. I am using the configuration below from the
>>> 3.1.0 standalone-ha.xml. I just added owners="2" in "infinispan/Keycloak"
>>> so that each cache entry has cluster-wide replicas.
>>>
>>> #standalone-ha.xml:- attached
>>>
>>> I am also using DC/OS as the container platform, which includes Marathon
>>> as a load balancer (LB) and two container runtimes (Docker and Mesos),
>>> for the deployment in the cloud.
>>>
>>> I can see the logs below rolling in Node#2 (nodeagent16) once
>>> Node#1 (nodeagent15) goes down. But when I bring Node#1 up again, the
>>> request is transferred from the LB back to Node#1, and I do not see any
>>> session-cache-related logs rolling in Node#1, so the user's session is
>>> not recognized by Node#1 and the user is asked to log in again.
>>>
>>> At the moment I am not sure whether multicasting is not working or
>>> whether the discovery protocol has some issue. Your input will help me
>>> understand the problem better.
>>>
>>> #Logs:-
>>>
>>> 2017-06-10 04:41:56,330 WARN  [org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy] (transport-thread--p16-t3) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000314: Cache authorization lost at least half of the stable members, possible split brain causing data inconsistency. Current members are [nodeagent16], lost members are [nodeagent15], stable members are [nodeagent15, nodeagent16]
>>> 2017-06-10 04:41:56,332 WARN  [org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy] (transport-thread--p16-t3) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000314: Cache sessions lost at least half of the stable members, possible split brain causing data inconsistency. Current members are [nodeagent16], lost members are [nodeagent15], stable members are [nodeagent15, nodeagent16]
>>> 2017-06-10 04:41:56,333 WARN  [org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy] (transport-thread--p16-t3) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000314: Cache work lost at least half of the stable members, possible split brain causing data inconsistency. Current members are [nodeagent16], lost members are [nodeagent15], stable members are [nodeagent16, nodeagent15]
>>> 2017-06-10 04:41:56,334 WARN  [org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy] (transport-thread--p16-t3) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000314: Cache offlineSessions lost at least half of the stable members, possible split brain causing data inconsistency. Current members are [nodeagent16], lost members are [nodeagent15], stable members are [nodeagent15, nodeagent16]
>>> 2017-06-10 04:41:56,336 WARN  [org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy] (transport-thread--p16-t3) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000314: Cache loginFailures lost at least half of the stable members, possible split brain causing data inconsistency. Current members are [nodeagent16], lost members are [nodeagent15], stable members are [nodeagent15, nodeagent16]
>>> 2017-06-10 04:41:56,509 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,ee,nodeagent16) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000094: Received new cluster view for channel web: [nodeagent16|10] (1) [nodeagent16]
>>> 2017-06-10 04:41:56,512 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,ee,nodeagent16) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000094: Received new cluster view for channel ejb: [nodeagent16|10] (1) [nodeagent16]
>>> 2017-06-10 04:41:56,513 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,ee,nodeagent16) [nodeagent16] KEYCLOAK 3.1.0-0.1 ISPN000094: Received new cluster view for channel hibernate: [nodeagent16|10] (1) [nodeagent16]
>>>
>>>
>>>
>>> On Fri, Jun 9, 2017 at 10:28 AM, Stian Thorgersen <sthorger at redhat.com>
>>> wrote:
>>>
>>> Your configuration is not correct and seems to be from an older version
>>>> of Keycloak. Please take a look at the default standalone-ha.xml from
>>>> 3.1 for the correct cache configs.
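>>>>
>>>> From memory (the file shipped with the 3.1.0 distribution is the
>>>> authoritative reference), the keycloak cache-container there also
>>>> defines offlineSessions, authorization, work and keys caches that are
>>>> missing from your snippet; with owners="2" applied to the distributed
>>>> caches, as you intend, it would look roughly like:
>>>>
>>>> <cache-container name="keycloak" jndi-name="infinispan/Keycloak">
>>>>     <transport lock-timeout="60000"/>
>>>>     <invalidation-cache name="realms" mode="SYNC"/>
>>>>     <invalidation-cache name="users" mode="SYNC"/>
>>>>     <distributed-cache name="sessions" mode="SYNC" owners="2"/>
>>>>     <distributed-cache name="offlineSessions" mode="SYNC" owners="2"/>
>>>>     <distributed-cache name="loginFailures" mode="SYNC" owners="2"/>
>>>>     <!-- plus the authorization, work and keys caches exactly as defined in the shipped file -->
>>>> </cache-container>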
>>>>
>>>> You also need to get cluster communication working properly. Make sure
>>>> the nodes see each other. When you start a new node, something should
>>>> appear in the logs of the other nodes. In a cloud environment this can
>>>> be tricky (you haven't said which one you use) as multicasting usually
>>>> doesn't work and you need to use a different discovery protocol.
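>>>>
>>>> As a sketch only (the datasource JNDI name below is just the Keycloak
>>>> default and the surrounding stack is abbreviated), a database-backed
>>>> discovery protocol such as JDBC_PING is one common way to avoid
>>>> multicast in cloud environments: nodes register themselves in a shared
>>>> table and discover each other through it:
>>>>
>>>> <stack name="tcp">
>>>>     <transport type="TCP" socket-binding="jgroups-tcp"/>
>>>>     <!-- members register in a shared database table instead of relying on multicast -->
>>>>     <protocol type="JDBC_PING">
>>>>         <property name="datasource_jndi_name">java:jboss/datasources/KeycloakDS</property>
>>>>     </protocol>
>>>>     <!-- remaining protocols as in the default tcp stack -->
>>>> </stack>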
>>>>
>>>> On 7 June 2017 at 16:17, Jyoti Kumar Singh <jyoti.tech90 at gmail.com>
>>>> wrote:
>>>>
>>>> Hi Team,
>>>>>
>>>>> We are setting up keycloak:3.1.0.Final in cluster mode for HA with
>>>>> full user session replication in a cloud system, i.e. when one node
>>>>> goes down the user stays logged in on the other node.
>>>>>
>>>>> I have set up the cluster using standalone-ha.xml with the Infinispan
>>>>> caches configured as shown below:
>>>>>
>>>>> <cache-container name="keycloak" jndi-name="infinispan/Keycloak">
>>>>>     <transport lock-timeout="60000"/>
>>>>>     <invalidation-cache name="realms" mode="SYNC"/>
>>>>>     <invalidation-cache name="users" mode="SYNC"/>
>>>>>     <distributed-cache name="sessions" mode="SYNC" owners="2"/>
>>>>>     <distributed-cache name="loginFailures" mode="SYNC" owners="2"/>
>>>>> </cache-container>
>>>>>
>>>>> Everything works fine except for the use case below:
>>>>>
>>>>> 1. Node 1 and Node 2 are both up and the user logs in - the user
>>>>> session is created by Node 1
>>>>> 2. Node 1 is now stopped and the user session is replicated to Node 2 -
>>>>> the user is still able to use the Keycloak console
>>>>> 3. Node 1 is up again and the request is transferred from the LB to
>>>>> Node 1 - the user is asked to log in again because the session cache is
>>>>> not replicated to Node 1 immediately once it is up
>>>>>
>>>>> I saw one option to add *start="EAGER"* in the cache-container to fix
>>>>> this, but it looks like it is no longer supported in the latest version
>>>>> of WildFly. Do we have any other way to fix this issue?
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> *With Regards, Jyoti Kumar Singh*
>>>>> _______________________________________________
>>>>> keycloak-user mailing list
>>>>> keycloak-user at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/keycloak-user
>>>>>
>>>>>
>>>>
>>> --
>>>
>>> *With Regards, Jyoti Kumar Singh*
>>>
>>> _______________________________________________
>> keycloak-user mailing list
>> keycloak-user at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/keycloak-user
>>
>
>
>


-- 

*With Regards, Jyoti Kumar Singh*

