On Wed, Aug 29, 2018 at 3:27 PM Rafael Weingärtner <
rafaelweingartner(a)gmail.com> wrote:
I think I will need a little bit of your wisdom again.
I can now see the cluster being formed between my Keycloak replicas:
> 13:03:03,800 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000079: Channel ejb local address is keycloak01, physical addresses are [192.168.1.58:55200]
> 13:03:03,801 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [keycloak02|1] (2) [keycloak02, keycloak01]
>
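For anyone checking the same thing: the ISPN000094 entry is the one to look for, because its member count (2) shows both replicas in a single view. Here is a minimal Python sketch for pulling that count out of a log line. It assumes each log record sits on a single line in the server log file, and the name parse_view is made up for illustration.

```python
import re

# Matches the tail of an ISPN000094 record, for example:
# "... Received new cluster view for channel ejb: [keycloak02|1] (2) [keycloak02, keycloak01]"
VIEW_RE = re.compile(
    r"Received new cluster view for channel (?P<channel>[^:\s]+): "
    r"\[(?P<coordinator>[^|\]]+)\|(?P<view_id>\d+)\] "
    r"\((?P<size>\d+)\) "
    r"\[(?P<members>[^\]]*)\]"
)


def parse_view(line):
    """Return the view described by one log line, or None if it has no view."""
    match = VIEW_RE.search(line)
    if match is None:
        return None
    members = [m.strip() for m in match.group("members").split(",") if m.strip()]
    return {
        "channel": match.group("channel"),
        "coordinator": match.group("coordinator"),
        "view_id": int(match.group("view_id")),
        "size": int(match.group("size")),
        "members": members,
    }
```

A size of 1 on every node means each replica formed its own cluster; a size equal to the number of replicas means they found each other.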
The problem is that when I shut down one of them, a logged-in user
receives the following message:
> An internal server error has occurred
>
Then, in the log files I see the following:
> 13:18:04,149 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (default task-24) ISPN000136: Error executing command GetKeyValueCommand, writing keys []: org.infinispan.util.concurrent.TimeoutException: Replication timeout
>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$1(JGroupsTransport.java:639)
> 13:18:15,262 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (expiration-thread--p22-t1) ISPN000136: Error executing command RemoveExpiredCommand, writing keys [468d1940-7293-4824-9e86-4aece6cd6744]: org.infinispan.util.concurrent.TimeoutException: Replication timeout for keycloak02
>
I see you just killed the node (e.g. kill -9 <pid>), so it exited
without saying "goodbye". In that case the JGroups FD_* protocols on the
other node need to do their work and detect the failure. Any commands in
flight at that moment might fail. I highly encourage you to use a larger
cluster (with an odd number of nodes if possible). Having only two nodes
can be a bit dangerous. Imagine a partition split: after the split heals,
which node is right? Hard to tell...
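To illustrate the two-node concern: under a simple majority rule, a partition may only carry on if it holds more than half of the original members. The sketch below only shows that arithmetic. It is not a description of how JGroups or Infinispan actually resolve a merge, and the function name has_majority is hypothetical.

```python
def has_majority(partition_size, cluster_size):
    """True if a partition holds strictly more than half of the cluster."""
    return 2 * partition_size > cluster_size


def partitions_with_majority(cluster_size, partition_sizes):
    """Return the sizes of the partitions that could claim to be 'right'."""
    if sum(partition_sizes) != cluster_size:
        raise ValueError("partition sizes must add up to the cluster size")
    return [size for size in partition_sizes if has_majority(size, cluster_size)]
```

With 2 nodes split 1/1, neither side has a majority, so there is no obvious winner. With 3 nodes split 2/1, exactly one side does.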
I would say that this is expected as the node is down. However, it should
not be a problem for the whole system.
My replication settings are the following:
> <distributed-cache name="sessions" mode="SYNC" owners="2"/>
> <distributed-cache name="authenticationSessions" mode="SYNC" owners="2"/>
> <distributed-cache name="offlineSessions" mode="SYNC" owners="2"/>
> <distributed-cache name="clientSessions" mode="SYNC" owners="2"/>
> <distributed-cache name="offlineClientSessions" mode="SYNC" owners="2"/>
> <distributed-cache name="loginFailures" mode="SYNC" owners="2"/>
>
Do I need to change something else?
Here it's exactly the same problem. With owners=2 and 2 nodes, this is
essentially a replicated cache (despite some differences in logic).
I'd advise using at least 3 nodes (or, even better, 5).
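A rough way to see this: in a distributed cache, each entry is kept on min(owners, nodes) nodes. The sketch below is a simplified model under that assumption (it ignores rebalancing after a node leaves), and the helper names are hypothetical.

```python
def copies_per_entry(owners, nodes):
    """Number of nodes that hold a copy of each cache entry."""
    if owners < 1 or nodes < 1:
        raise ValueError("owners and nodes must both be at least 1")
    return min(owners, nodes)


def is_effectively_replicated(owners, nodes):
    """True when every node ends up holding every entry."""
    return copies_per_entry(owners, nodes) == nodes


def failures_tolerated(owners, nodes):
    """Simultaneous node losses an entry survives before any rebalancing."""
    return copies_per_entry(owners, nodes) - 1
```

For example, owners=2 on 2 nodes puts every entry on both nodes, while owners=1 leaves each session on a single node, so losing that node loses the session.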
On Wed, Aug 29, 2018 at 9:51 AM, Rafael Weingärtner <
rafaelweingartner(a)gmail.com> wrote:
> Ah, no problem. It was my fault. I forgot to start debugging from the
> ground up (connectivity, firewalls, applications and so on).
>
> On Wed, Aug 29, 2018 at 9:49 AM, Bela Ban <bban(a)redhat.com> wrote:
>
>> Excellent! Unfortunately, JGroups cannot detect this...
>>
>> On 29/08/18 14:42, Rafael Weingärtner wrote:
>>
>>> Thanks!
>>> The problem was caused by firewalld blocking Multicast traffic.
>>>
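Since tcpdump captures packets before the host firewall filters them, seeing the multicast traffic on the wire does not prove that the application receives it. Here is a minimal sketch of a receiver-side check in Python. It assumes the multicast group 230.0.0.4 and port 45688 seen in the tcpdump output further down in this thread. The function names are hypothetical, and it is best run while Keycloak is stopped on that host so the port is free.

```python
import socket


def membership_request(group, interface="0.0.0.0"):
    """Build the 8-byte ip_mreq structure used to join a multicast group."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def wait_for_multicast(group="230.0.0.4", port=45688, timeout=10.0):
    """Wait for one datagram sent to the group.

    Returns (sender_ip, payload_length), or None if nothing arrived in time.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(
            socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group)
        )
        sock.settimeout(timeout)
        try:
            data, sender = sock.recvfrom(65535)
        except socket.timeout:
            return None
        return sender[0], len(data)
    finally:
        sock.close()
```

Calling wait_for_multicast() on one replica while the other is running should return the sender's address. A None result while tcpdump still shows the packets points at something on the host, such as a firewall, dropping them.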
>>> On Fri, Aug 24, 2018 at 7:28 AM, Sebastian Laskawiec <
>>> slaskawi(a)redhat.com> wrote:
>>>
>>> Great write-up! Bookmarked!
>>>
>>>     On Thu, Aug 23, 2018 at 4:36 PM Bela Ban <bban(a)redhat.com> wrote:
>>>
>>>         Have you checked
>>>         https://github.com/belaban/workshop/blob/master/slides/admin.adoc#problem... ?
>>>
>>> On 23/08/18 13:53, Sebastian Laskawiec wrote:
>>>         > +Bela Ban <mailto:bban@redhat.com>
>>> >
>>> > As I expected, the cluster doesn't form.
>>> >
>>>         > I'm not sure where and why those UDP discovery packets are
>>>         > rejected. I just stumbled upon this thread [1], which you may find
>>>         > useful. Maybe Bela will also have an idea of what's going on there.
>>>         >
>>>         > If you can't get UDP working, you can always fall back to
>>>         > TCP (and MPING).
>>> >
>>>         > [1] https://serverfault.com/questions/211482/tools-to-test-multicast-routing
>>> >
>>>         > On Thu, Aug 23, 2018 at 1:26 PM Rafael Weingärtner
>>>         > <rafaelweingartner(a)gmail.com> wrote:
>>> >
>>> > Thanks for the reply Sebastian!
>>> >
>>> >
>>>         >         Note, that IP Multicasting is disabled in many data centers
>>>         >         (I have never found out why they do it, but I've seen it
>>>         >         many, many times). So make sure your cluster forms correctly
>>>         >         (just grep logs and look for "view").
>>> >
>>> >
>>>         >     I thought about that. Then, I used tcpdump, and I can see the
>>>         >     multicast packets from both Keycloak replicas. However, it seems
>>>         >     that these packets are being ignored.
>>> >
>>>         >     root@Keycloak01:/# tcpdump -i eth0 port 7600 or port 55200 or port 45700 or port 45688 or port 23364 or port 4712 or port 4713
>>>         >     tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>>>         >     listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
>>>         >     11:13:36.540080 IP keycloak02.local.55200 > 230.0.0.4.45688: UDP, length 83
>>>         >     11:13:41.288449 IP keycloak02.local.55200 > 230.0.0.4.45688: UDP, length 83
>>>         >     11:13:46.342606 IP keycloak02.local.55200 > 230.0.0.4.45688: UDP, length 83
>>> >
>>> >
>>>         >     root@keycloak02:/# tcpdump -i eth0 port 7600 or port 55200 or port 45700 or port 45688 or port 23364 or port 4712 or port 4713
>>>         >     tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>>>         >     listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
>>>         >     11:12:14.218317 IP Keycloak01.local.55200 > 230.0.0.4.45688: UDP, length 83
>>>         >     11:12:23.146798 IP Keycloak01.local.55200 > 230.0.0.4.45688: UDP, length 83
>>>         >     11:12:27.201888 IP Keycloak01.local.55200 > 230.0.0.4.45688: UDP, length 83
>>> >
>>> >
>>> >
>>>         >     Here are the log entries, filtered by “view”. These are from
>>>         >     Keycloak01.
>>> >
>>>         >     11:16:57,896 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-4) ISPN000094: Received new cluster view for channel ejb: [keycloak01|0] (1) [keycloak01]
>>>         >     11:16:57,896 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000094: Received new cluster view for channel ejb: [keycloak01|0] (1) [keycloak01]
>>>         >     11:16:57,897 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [keycloak01|0] (1) [keycloak01]
>>>         >     11:16:57,898 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-3) ISPN000094: Received new cluster view for channel ejb: [keycloak01|0] (1) [keycloak01]
>>>         >     11:16:57,962 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [keycloak01|0] (1) [keycloak01]
>>> >
>>> >
>>>         >     I expected it to be only one. I mean, I first started Keycloak01,
>>>         >     and only then Keycloak02. Next, we have the logs from Keycloak02.
>>> >
>>>         >     11:17:34,950 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-3) ISPN000094: Received new cluster view for channel ejb: [keycloak02|0] (1) [keycloak02]
>>>         >     11:17:34,952 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-4) ISPN000094: Received new cluster view for channel ejb: [keycloak02|0] (1) [keycloak02]
>>>         >     11:17:34,957 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [keycloak02|0] (1) [keycloak02]
>>>         >     11:17:34,957 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000094: Received new cluster view for channel ejb: [keycloak02|0] (1) [keycloak02]
>>>         >     11:17:35,052 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [keycloak02|0] (1) [keycloak02]
>>> >
>>> >
>>>         >     They are similar. It seems that the two applications are not
>>>         >     seeing each other. At first, I thought that the problem was
>>>         >     caused by the “owners=1” configuration (the lack of data
>>>         >     synchronization between replicas). I then changed it to
>>>         >     “owners=2”, but still, if I log in to Keycloak01 and then force
>>>         >     my request to go to Keycloak02, my session is not there, and I
>>>         >     am asked to log in again.
>>>         >
>>>         >     Do you need any other log entries or configuration files?
>>> >
>>> > Again, thanks for your reply and help!
>>> >
>>>         >     On Thu, Aug 23, 2018 at 5:24 AM, Sebastian Laskawiec
>>>         >     <slaskawi(a)redhat.com> wrote:
>>> >
>>> >
>>> >
>>>         >         On Wed, Aug 22, 2018 at 10:24 PM Rafael Weingärtner
>>>         >         <rafaelweingartner(a)gmail.com> wrote:
>>> >
>>> > Hello Keycloakers,
>>> >
>>>         >             I have some questions about Keycloak and load balancers.
>>>         >             I set up two Keycloak replicas to provide HA. To start
>>>         >             them I am using “./standalone.sh
>>>         >             --server-config=standalone-ha.xml”. I am assuming that
>>>         >             they will use multicast to replicate information between
>>>         >             nodes, right?
>>> >
>>> >
>>>         >         That is correct. It uses the PING protocol, which in turn
>>>         >         uses IP Multicasting for discovery.
>>> >
>>>         >         Note, that IP Multicasting is disabled in many data centers
>>>         >         (I have never found out why they do it, but I've seen it
>>>         >         many, many times). So make sure your cluster forms correctly
>>>         >         (just grep logs and look for "view").
>>> >
>>>         >             Then, I set up a load balancer layer using Apache HTTPD
>>>         >             and the AJP connector via port 8009. To make everything
>>>         >             work I needed to use sticky sessions; otherwise, the
>>>         >             login would never happen. I am fine with sticky
>>>         >             sessions. However, if I stop the replica where the user
>>>         >             is logged in, then when the user accesses Keycloak
>>>         >             again, he/she is asked to present the credentials as if
>>>         >             he/she was not logged in on the other Keycloak replica.
>>>         >             Is that the expected behavior?
>>> >
>>> >
>>>         >         My intuition tells me that your cluster didn't form
>>>         >         correctly (as I mentioned before, grep the logs and look for
>>>         >         "view" generated by JGroups). Therefore, if you enable
>>>         >         sticky sessions, all your requests go to the same Keycloak
>>>         >         instance, which has everything in its local cache. That's
>>>         >         why it works fine.
>>> >
>>> >
>>>         >             Is there some troubleshooting or test that I can perform
>>>         >             to check if replication is being executed?
>>> >
>>> >
>>>         >         Let's start with investigating the logs. Later on we can
>>>         >         check JMX.
>>> >
>>> >
>>> > --
>>> > Rafael Weingärtner
>>> > _______________________________________________
>>> > keycloak-user mailing list
>>>         >             keycloak-user(a)lists.jboss.org
>>>         >             https://lists.jboss.org/mailman/listinfo/keycloak-user
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Rafael Weingärtner
>>> >
>>>
>>>     -- Bela Ban | http://www.jgroups.org
>>>
>>>
>>>
>>>
>>> --
>>> Rafael Weingärtner
>>>
>>
>> --
>> Bela Ban | http://www.jgroups.org
>>
>>
>
>
> --
> Rafael Weingärtner
>
--
Rafael Weingärtner