If you want to build an automated script to detect such situations and try
to fix them, you would need to:
- Identify that the cluster is experiencing problems by scanning the logs.
- Identify which node is the JGroups coordinator. You can do this by
examining JMX: look under jgroups/protocol/ee/GMS for the attribute named coord.
- Kill the coordinator node. The cluster should then elect a new
coordinator and reconcile.
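The log-scanning step above could be sketched roughly as follows. This is a
minimal illustration only: it matches the ISPN000197/ISPN000476 warning quoted
later in this thread and extracts the unresponsive node's name; the coordinator
lookup and node kill would still have to be wired up separately (the JMX path in
the comment is the one mentioned above and should be verified against your
server's management model).

```python
import re

# Pattern for the Infinispan warning seen when the cluster hangs
# (ISPN000197 / ISPN000476, as in the stack trace quoted in this thread).
HANG_PATTERN = re.compile(
    r"ISPN000197: Error updating cluster member list.*?"
    r"Timed out waiting for responses for request \d+ from (\S+)"
)

def find_unresponsive_nodes(log_text):
    """Return the node names that timed out during cluster-member updates."""
    return HANG_PATTERN.findall(log_text)

# Step 2 (coordinator lookup) would read the 'coord' attribute under
# jgroups/protocol/ee/GMS via JMX or the management CLI, as described above;
# that part is environment-specific and not shown here.

sample = ("ISPN000197: Error updating cluster member list: "
          "org.infinispan.util.concurrent.TimeoutException: ISPN000476: "
          "Timed out waiting for responses for request 1 from dcidqdcosagent02")
print(find_unresponsive_nodes(sample))  # ['dcidqdcosagent02']
```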
However, the recommended solution is to upgrade and pick up the proper
fix from JGroups.
On Tue, May 14, 2019 at 4:03 PM <pkboucher801(a)gmail.com> wrote:
In the meantime (before we switch to 5.0+), is there any way to automate
recognition of when the cluster hangs (or is about to hang) as in
https://issues.jboss.org/browse/WFLY-10736?attachmentViewMode=list
And is it reliable to un-hang the cluster by scaling it down to zero
instances, and then scaling back up?
Thanks!
Regards,
Peter
From: *Sebastian Laskawiec* <slaskawi(a)redhat.com>
Date: Fri, Apr 26, 2019 at 6:11 PM
Subject: Re: [keycloak-dev] HA mode with JDBC_PING shows warning in the
logs after migration to 4.8.3 from 3.4.3
To: abhishek raghav <abhi.raghav007(a)gmail.com>
Cc: keycloak-user <keycloak-user(a)lists.jboss.org>, keycloak-dev <
keycloak-dev(a)lists.jboss.org>
There was a bunch of fixes applied to JGroups a while ago, including changes in
JDBC_PING.
Could you please rerun your setup with Keycloak >= 5.0.0? I believe some
of the issues (or maybe even all of them) should be fixed.
On Thu, Apr 25, 2019 at 7:19 PM abhishek raghav <abhi.raghav007(a)gmail.com>
wrote:
Hi
After migrating the Keycloak HA configuration from 3.4.3.Final to
4.8.3.Final, I am seeing warnings on one of the Keycloak nodes
immediately after Keycloak is started with 2 nodes. This occurs every
time the cluster is scaled up, or whenever Infinispan tries to update
the cluster member list.
I am using JDBC_PING for Keycloak clustering.
Below is the stack trace -
2019-04-24 12:20:43,687 WARN
[org.infinispan.topology.ClusterTopologyManagerImpl]
(transport-thread--p18-t2) [dcidqdcosagent08] KEYCLOAK DEV 1.5.RC
ISPN000197: Error updating cluster member list:
org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out
waiting for responses for request 1 from dcidqdcosagent02
    at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
    at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
    at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Suppressed: org.infinispan.util.logging.TraceException
        at org.infinispan.remoting.transport.Transport.invokeRemotely(Transport.java:75)
        at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:525)
        at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:508)
After searching, I did not find anyone reporting this error against
Keycloak, but a similar bug was reported against WildFly 14, where it was
categorized as a blocker. That bug is already fixed in WildFly 15.
https://issues.jboss.org/browse/WFLY-10736?attachmentViewMode=list
Since Keycloak 4.8 is also based on WildFly 14, these warnings could be
caused by that blocker in WildFly 14.
What should I do to get rid of this error? Is this really a problem in
Keycloak 4.8.3.Final? Did anyone notice such an issue while running
Keycloak 4.8.3 in HA mode?
Is there a workaround to fix this?
One more thing we noticed: the JDBC_PING property
"clear_table_on_view_change", which we use in our 3.4.3 setup, is no
longer supported in the 4.8 version, so the JGROUPSPING table fills up
with a lot of stale entries. Is there a workaround to clear the table
after a view change in 4.8 as well?
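For what it's worth, in newer JGroups releases the old
"clear_table_on_view_change" attribute appears to have been replaced by
"remove_all_data_on_view_change" (inherited from FILE_PING). The fragment
below is only a sketch of what the JDBC_PING protocol definition might look
like in the jgroups subsystem; the attribute names and the datasource JNDI
name are assumptions that should be verified against the JGroups version
actually bundled with your Keycloak/WildFly build.

```xml
<!-- Sketch only: verify attribute names against your JGroups version. -->
<protocol type="JDBC_PING">
    <!-- Datasource name is an example, not taken from this thread. -->
    <property name="datasource_jndi_name">java:jboss/datasources/KeycloakDS</property>
    <!-- Replacement for the removed clear_table_on_view_change: -->
    <property name="remove_all_data_on_view_change">true</property>
</protocol>
```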
Thanks
Abhishek
_______________________________________________
keycloak-dev mailing list
keycloak-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/keycloak-dev