[keycloak-user] Does Keycloak need sticky session at the load balancer?

Rafael Weingärtner rafaelweingartner at gmail.com
Thu Aug 30 07:23:25 EDT 2018


Thanks!

you guys helped me a lot!

On Thu, Aug 30, 2018 at 8:17 AM, Bela Ban <bban at redhat.com> wrote:

>
>
> On 30/08/18 13:02, Rafael Weingärtner wrote:
>
>> Awesome, thanks for the help, Sebastian. I have a question regarding
>> these "owners" numbers. What happens if I set this number to (let's say) 10
>> and I only spin up 7 nodes? Is it a valid deployment? And, will everything
>> work just fine? Or, would I start to get errors?
>>
>
> If numOwners is bigger than the number of members in the cluster, you
> essentially end up with full replication, where every data item is
> replicated to all members.
>
> IIRC, Infinispan even checks for this condition and automatically switches
> to multicasting rather than unicasting as long as the condition holds.
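In cache-configuration terms, the scenario from the question would look like the sketch below (the owners value of 10 and the 7 running nodes are the hypothetical numbers from the example above; the element syntax matches the standalone-ha.xml snippets quoted later in this thread):

```xml
<!-- Hypothetical sketch: owners may exceed the current cluster size.
     With only 7 members running, each entry is simply stored on all 7,
     i.e. the cache behaves like a fully replicated one. -->
<distributed-cache name="sessions" mode="SYNC" owners="10"/>
```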
>
>
> On Thu, Aug 30, 2018 at 5:02 AM, Sebastian Laskawiec <slaskawi at redhat.com> wrote:
>>
>>     On Wed, Aug 29, 2018 at 3:27 PM Rafael Weingärtner
>>     <rafaelweingartner at gmail.com> wrote:
>>
>>         I think I will need a little bit of your wisdom again.
>>
>>         I am now seeing the cluster between my Keycloak replicas to be
>>         created:
>>
>>             13:03:03,800 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000079: Channel ejb local address is keycloak01, physical addresses are [192.168.1.58:55200]
>>
>>             13:03:03,801 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [keycloak02|1] (2) [keycloak02, keycloak01]
>>
>>
>>         The problem is that when I shutdown one of them, a logged user
>>         will receive the following message:
>>
>>             An internal server error has occurred
>>
>>         Then, in the log files I see the following:
>>
>>             13:18:04,149 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (default task-24) ISPN000136: Error executing command GetKeyValueCommand, writing keys []: org.infinispan.util.concurrent.TimeoutException: Replication timeout
>>                      at org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$1(JGroupsTransport.java:639)
>>             13:18:15,262 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (expiration-thread--p22-t1) ISPN000136: Error executing command RemoveExpiredCommand, writing keys [468d1940-7293-4824-9e86-4aece6cd6744]: org.infinispan.util.concurrent.TimeoutException: Replication timeout for keycloak02
>>
>>
>>     I see you just killed the node (e.g. kill -9 <pid>), so that it
>>     exited without saying "goodbye". In that case the JGroups FD_*
>>     protocols on the other node need to do their work and discover the
>>     failure. If you have any commands in flight, they might fail. I
>>     highly encourage you to use a larger cluster (with an odd number of
>>     nodes if possible). Having only two nodes can be a bit dangerous:
>>     imagine a partition split; after the split heals, which node is
>>     right? Hard to tell...
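The speed at which the surviving node discovers such a failure is governed by the FD_* protocols in the JGroups stack of standalone-ha.xml. A hedged sketch of that fragment (Wildfly jgroups-subsystem syntax; the timeout values are illustrative assumptions, not recommendations):

```xml
<!-- Illustrative failure-detection fragment of a JGroups stack.
     Lower timeouts detect killed members faster, at the cost of
     more false suspicions under load. -->
<protocol type="FD_ALL">
    <property name="timeout">10000</property>
    <property name="interval">2000</property>
</protocol>
<protocol type="VERIFY_SUSPECT">
    <property name="timeout">1500</property>
</protocol>
```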
>>
>>
>>         I would say that this is expected as the node is down. However,
>>         it should not be a problem for the whole system.
>>
>>         My replication settings are the following:
>>
>>             <distributed-cache name="sessions" mode="SYNC" owners="2"/>
>>             <distributed-cache name="authenticationSessions" mode="SYNC"
>>             owners="2"/>
>>             <distributed-cache name="offlineSessions" mode="SYNC"
>>             owners="2"/>
>>             <distributed-cache name="clientSessions" mode="SYNC"
>>             owners="2"/>
>>             <distributed-cache name="offlineClientSessions" mode="SYNC"
>>             owners="2"/>
>>             <distributed-cache name="loginFailures" mode="SYNC"
>> owners="2"/>
>>
>>
>>         Do I need to change something else?
>>
>>     Here's exactly the same problem. With owners=2 and 2 nodes, this is
>>     essentially a replicated cache (despite some differences in logic).
>>     I'd advise using at least 3 nodes (or, even better, 5).
>>
>>
>>         On Wed, Aug 29, 2018 at 9:51 AM, Rafael Weingärtner
>>         <rafaelweingartner at gmail.com> wrote:
>>
>>             Ah, no problem. It was my fault. I forgot to start debugging
>>             from the ground up (connectivity, firewalls, applications,
>>             and so on).
>>
>>             On Wed, Aug 29, 2018 at 9:49 AM, Bela Ban <bban at redhat.com> wrote:
>>
>>                 Excellent! Unfortunately, JGroups cannot detect this...
>>
>>                 On 29/08/18 14:42, Rafael Weingärtner wrote:
>>
>>                     Thanks!
>>                     The problem was caused by firewalld blocking
>>                     Multicast traffic.
>>
>>                     On Fri, Aug 24, 2018 at 7:28 AM, Sebastian Laskawiec
>>                     <slaskawi at redhat.com> wrote:
>>
>>                          Great write-up! Bookmarked!
>>
>>                          On Thu, Aug 23, 2018 at 4:36 PM Bela Ban
>>                          <bban at redhat.com> wrote:
>>
>>                              Have you checked
>>                     https://github.com/belaban/workshop/blob/master/slides/admin.adoc#problem-1-members-don-t-find-each-other ?
>>
>>                              On 23/08/18 13:53, Sebastian Laskawiec wrote:
>>                               > +Bela Ban <bban at redhat.com>
>>                               >
>>                               > As I expected, the cluster doesn't form.
>>                               >
>>                               > I'm not sure where and why those UDP
>>                               > discovery packets are rejected. I just
>>                               > stumbled upon this thread [1], which you may
>>                               > find useful. Maybe Bela will also have an
>>                               > idea what's going on there.
>>                               >
>>                               > If you won't manage to get UDP working, you
>>                               > can always fall back to TCP (and MPING).
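A sketch of such a fallback in standalone-ha.xml (Wildfly jgroups-subsystem syntax; standalone-ha.xml already ships a "tcp" stack, so often only the channel needs switching — treat the exact stack and binding names as assumptions for your version):

```xml
<!-- Sketch: point the channel at the "tcp" stack instead of "udp".
     Note that MPING still uses multicast, but only for discovery;
     all data then flows over TCP. -->
<channels default="ee">
    <channel name="ee" stack="tcp"/>
</channels>
<stacks>
    <stack name="tcp">
        <transport type="TCP" socket-binding="jgroups-tcp"/>
        <protocol type="MPING"/>
        <!-- remaining protocols as in the shipped tcp stack -->
    </stack>
</stacks>
```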
>>                               >
>>                               > [1]
>>                     https://serverfault.com/questions/211482/tools-to-test-multicast-routing
>>                               >
>>                               > On Thu, Aug 23, 2018 at 1:26 PM Rafael
>>                     Weingärtner
>>                               > <rafaelweingartner at gmail.com> wrote:
>>                               >
>>                               >     Thanks for the reply Sebastian!
>>                               >
>>                               >
>>                               >         Note, that IP Multicasting is
>>                               >         disabled in many data centers (I
>>                               >         have never found out why they do
>>                               >         it, but I've seen it many, many
>>                               >         times). So make sure your cluster
>>                               >         forms correctly (just grep logs
>>                               >         and look for "view").
>>                               >
>>                               >
>>                               >     I thought about that. Then, I used
>>                               >     tcpdump, and I can see the multicast
>>                               >     packets from both Keycloak replicas.
>>                               >     However, it seems that these packets
>>                               >     are being ignored.
>>                               >
>>                               >         root at Keycloak01:/# tcpdump -i eth0 port 7600 or port 55200 or port 45700 or port 45688 or port 23364 or port 4712 or port 4713
>>                               >         tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>>                               >         listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
>>                               >         11:13:36.540080 IP keycloak02.local.55200 > 230.0.0.4.45688: UDP, length 83
>>                               >         11:13:41.288449 IP keycloak02.local.55200 > 230.0.0.4.45688: UDP, length 83
>>                               >         11:13:46.342606 IP keycloak02.local.55200 > 230.0.0.4.45688: UDP, length 83
>>                               >
>>                               >
>>                               >         root at keycloak02:/# tcpdump -i eth0 port 7600 or port 55200 or port 45700 or port 45688 or port 23364 or port 4712 or port 4713
>>                               >         tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>>                               >         listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
>>                               >         11:12:14.218317 IP Keycloak01.local.55200 > 230.0.0.4.45688: UDP, length 83
>>                               >         11:12:23.146798 IP Keycloak01.local.55200 > 230.0.0.4.45688: UDP, length 83
>>                               >         11:12:27.201888 IP Keycloak01.local.55200 > 230.0.0.4.45688: UDP, length 83
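For reference, the 230.0.0.4:45688 traffic captured above corresponds to the jgroups-udp socket binding in standalone-ha.xml. A sketch of the relevant bindings (the attribute values are the usual Wildfly defaults and match the capture, but may differ per installation):

```xml
<!-- Sketch of the multicast socket bindings behind the captured traffic.
     jgroups-udp (source port 55200, multicast 230.0.0.4:45688) carries
     the packets seen in the tcpdump output above. -->
<socket-binding name="jgroups-udp" interface="private" port="55200"
                multicast-address="${jboss.default.multicast.address:230.0.0.4}"
                multicast-port="45688"/>
<socket-binding name="jgroups-mping" interface="private" port="0"
                multicast-address="${jboss.default.multicast.address:230.0.0.4}"
                multicast-port="45700"/>
```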
>>                               >
>>                               >
>>                               >
>>                               >     Here go the log entries. I filtered by
>>                               >     “view”. This is from Keycloak01.
>>                               >
>>                               >         11:16:57,896 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-4) ISPN000094: Received new cluster view for channel ejb: [keycloak01|0] (1) [keycloak01]
>>                               >         11:16:57,896 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000094: Received new cluster view for channel ejb: [keycloak01|0] (1) [keycloak01]
>>                               >         11:16:57,897 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [keycloak01|0] (1) [keycloak01]
>>                               >         11:16:57,898 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-3) ISPN000094: Received new cluster view for channel ejb: [keycloak01|0] (1) [keycloak01]
>>                               >         11:16:57,962 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [keycloak01|0] (1) [keycloak01]
>>                               >
>>                               >
>>                               >     I expected there to be only one view. I
>>                               >     mean, I first started Keycloak01, and
>>                               >     only then Keycloak02. Next, we have the
>>                               >     logs from Keycloak02.
>>                               >
>>                               >         11:17:34,950 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-3) ISPN000094: Received new cluster view for channel ejb: [keycloak02|0] (1) [keycloak02]
>>                               >         11:17:34,952 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-4) ISPN000094: Received new cluster view for channel ejb: [keycloak02|0] (1) [keycloak02]
>>                               >         11:17:34,957 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [keycloak02|0] (1) [keycloak02]
>>                               >         11:17:34,957 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000094: Received new cluster view for channel ejb: [keycloak02|0] (1) [keycloak02]
>>                               >         11:17:35,052 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [keycloak02|0] (1) [keycloak02]
>>                               >
>>                               >
>>                               >     They are similar. It seems that the two
>>                               >     applications are not seeing each other.
>>                               >     At first, I thought that the problem was
>>                               >     caused by the “owners=1” configuration
>>                               >     (the lack of data synchronization
>>                               >     between replicas). I then changed it to
>>                               >     “owners=2”, but still, if I log in on
>>                               >     Keycloak01 and then force my request to
>>                               >     go to Keycloak02, my session is not
>>                               >     there, and I am asked to log in again.
>>                               >
>>                               >     Do you need some other log entries or
>>                               >     configuration files?
>>                               >
>>                               >     Again, thanks for your reply and help!
>>                               >
>>                               >     On Thu, Aug 23, 2018 at 5:24 AM,
>>                     Sebastian Laskawiec
>>                               >     <slaskawi at redhat.com> wrote:
>>                               >
>>                               >
>>                               >
>>                               >         On Wed, Aug 22, 2018 at 10:24 PM
>>                     Rafael Weingärtner
>>                               >         <rafaelweingartner at gmail.com> wrote:
>>                               >
>>                               >             Hello Keycloakers,
>>                               >
>>                               >             I have some doubts regarding
>>                               >             Keycloak and load balancers. I
>>                               >             set up two Keycloak replicas to
>>                               >             provide HA. To start them I am
>>                               >             using “./standalone.sh
>>                               >             --server-config=standalone-ha.xml”.
>>                               >             I am assuming that they will use
>>                               >             multicast to replicate
>>                               >             information between nodes, right?
>>                               >
>>                               >
>>                               >         That is correct. It uses the PING
>>                               >         protocol, which in turn uses IP
>>                               >         Multicasting for discovery.
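In standalone-ha.xml this discovery setup lives in the default "udp" stack of the jgroups subsystem; a sketch (protocol list abbreviated, element names per the Wildfly subsystem schema):

```xml
<!-- Sketch: the default "udp" stack. PING discovers cluster members
     via IP multicast over the jgroups-udp socket binding. -->
<stack name="udp">
    <transport type="UDP" socket-binding="jgroups-udp"/>
    <protocol type="PING"/>
    <!-- merge, failure detection, reliable delivery, etc. omitted -->
</stack>
```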
>>                               >
>>                               >         Note, that IP Multicasting is
>>                               >         disabled in many data centers (I
>>                               >         have never found out why they do
>>                               >         it, but I've seen it many, many
>>                               >         times). So make sure your cluster
>>                               >         forms correctly (just grep logs
>>                               >         and look for "view").
>>                               >
>>                               >             Then, I set up a load balancer
>>                               >             layer using Apache HTTPD and the
>>                               >             AJP connector on port 8009. To
>>                               >             make everything work I needed to
>>                               >             use sticky sessions; otherwise,
>>                               >             the login would never happen. I
>>                               >             am fine with sticky sessions;
>>                               >             however, if I stop the replica
>>                               >             where the user is logged in,
>>                               >             when the user accesses Keycloak
>>                               >             again, he/she is asked to
>>                               >             present the credentials as if
>>                               >             he/she were not logged in on the
>>                               >             other Keycloak replica. Is that
>>                               >             the expected behavior?
>>                               >
>>                               >
>>                               >         My intuition tells me that your
>>                               >         cluster didn't form correctly (as I
>>                               >         mentioned before, grep the logs and
>>                               >         look for the "view" lines generated
>>                               >         by JGroups). Therefore, if you
>>                               >         enable sticky sessions, all your
>>                               >         requests go to the same Keycloak
>>                               >         instance, which has everything in
>>                               >         its local cache. That's why it
>>                               >         works fine.
>>                               >
>>                               >
>>                               >             Is there some troubleshooting
>>                               >             or test that I can perform to
>>                               >             check if replication is being
>>                               >             executed?
>>                               >
>>                               >
>>                               >         Let's start with investigating the
>>                               >         logs. Later on we can check JMX.
>>                               >
>>                               >
>>                               >             --
>>                               >             Rafael Weingärtner
>>                               >
>>  _______________________________________________
>>                               >             keycloak-user mailing list
>>                               > keycloak-user at lists.jboss.org
>>                               >
>>                     https://lists.jboss.org/mailman/listinfo/keycloak-user
>>                               >
>>                               >
>>                               >
>>                               >
>>                               >     --
>>                               >     Rafael Weingärtner
>>                               >
>>
>>                              --
>>                              Bela Ban | http://www.jgroups.org
>>
>>
>>
>>
>>                     --
>>                     Rafael Weingärtner
>>
>>
>>                 --
>>                 Bela Ban | http://www.jgroups.org
>>
>>
>>
>>
>>             --
>>             Rafael Weingärtner
>>
>>
>>
>>
>>         --
>>         Rafael Weingärtner
>>
>>
>>
>>
>> --
>> Rafael Weingärtner
>>
>
> --
> Bela Ban | http://www.jgroups.org
>
>


-- 
Rafael Weingärtner

