[keycloak-user] TimeoutException: Replication timeout

Sebastian Laskawiec slaskawi at redhat.com
Fri Aug 31 07:45:29 EDT 2018


Hey guys,

The replication timeout occurs when Infinispan replicates data to the other
nodes (via so called commands) and the target node doesn't respond within a
certain timeout.

There might be different things that cause it, including:
- Long GC pauses (longer than replication timeout setting)
- Slow network
- High CPU constraints on the target node
- Wrong JGroups settings
- Target node crashes
- and even more

My suggestion is to try the following:
- increase remote timeout [1]
- Turn on GC logging and see if there's any correlation between the
replication errors and GC pauses
- Make sure you're using default JGroups configuration. If anything needs
adjusting, please adjust one thing at a time.

Please let me know how did it go.

Thanks,
Sebastian

[1] here's a configuration example
https://github.com/infinispan/infinispan/blob/master/server/integration/infinispan/src/test/resources/org/jboss/as/clustering/infinispan/subsystem/subsystem-infinispan_9_0.xml#L131

On Fri, Aug 31, 2018 at 12:30 PM Piotr Wojcieszonek <
p.wojcieszonek at bosmanager.com> wrote:

> No, I don’t see any errors beside this mentioned earlier.
> Stack trace in log file, and it’s look that if cluster work for some time
> then it slow down dramatically in all operation that require write some
> data like creating realm.
>
>
> > Wiadomość napisana przez Rafael Weingärtner <rafaelweingartner at gmail.com>
> w dniu 31.08.2018, o godz. 12:21:
> >
> > hmmm...Besides the stacktrace, did you see any other warning or error?
> >
> > On Fri, Aug 31, 2018 at 7:08 AM, Piotr Wojcieszonek <
> p.wojcieszonek at bosmanager.com <mailto:p.wojcieszonek at bosmanager.com>>
> wrote:
> > This is Ubuntu 16.04 without SELinux.
> >
> >
> >> Wiadomość napisana przez Rafael Weingärtner <
> rafaelweingartner at gmail.com <mailto:rafaelweingartner at gmail.com>> w dniu
> 31.08.2018, o godz. 12:06:
> >>
> >> Then, there is only one things left in my mind. Did you configure
> SElinux?
> >>
> >> On Fri, Aug 31, 2018 at 7:04 AM, Piotr Wojcieszonek <
> p.wojcieszonek at bosmanager.com <mailto:p.wojcieszonek at bosmanager.com>>
> wrote:
> >> No, there is no firewall on host and both of them are in the same vlan.
> >>
> >>
> >>> Wiadomość napisana przez Rafael Weingärtner <
> rafaelweingartner at gmail.com <mailto:rafaelweingartner at gmail.com>> w dniu
> 31.08.2018, o godz. 12:00:
> >>>
> >>> Your cluster is being created. Everything seems fine there. I had a
> problem once when the cluster was not being formed due to firewall problems.
> >>> Is your firewall activated?
> >>>
> >>> On Fri, Aug 31, 2018 at 6:56 AM, Piotr Wojcieszonek <
> p.wojcieszonek at bosmanager.com <mailto:p.wojcieszonek at bosmanager.com>>
> wrote:
> >>>
> >>> 2018-08-30 00:07:43,048 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-2) ISPN000094: Received new cluster view for channel ejb:
> [authservice1|0] (1) [authservice1]
> >>> 2018-08-30 00:07:43,049 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-5) ISPN000094: Received new cluster view for channel ejb:
> [authservice1|0] (1) [authservice1]
> >>> 2018-08-30 00:07:43,048 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-3) ISPN000094: Received new cluster view for channel ejb:
> [authservice1|0] (1) [authservice1]
> >>> 2018-08-30 00:07:43,049 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-7) ISPN000094: Received new cluster view for channel ejb:
> [authservice1|0] (1) [authservice1]
> >>> 2018-08-30 00:07:43,048 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-6) ISPN000094: Received new cluster view for channel ejb:
> [authservice1|0] (1) [authservice1]
> >>> 2018-08-30 00:07:45,757 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
> ISPN000094: Received new cluster view for channel ejb: [authservice1|1] (2)
> [authservice1, authservice2]
> >>> 2018-08-30 00:07:45,758 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
> ISPN000094: Received new cluster view for channel ejb: [authservice1|1] (2)
> [authservice1, authservice2]
> >>> 2018-08-30 00:07:45,763 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
> ISPN000094: Received new cluster view for channel ejb: [authservice1|1] (2)
> [authservice1, authservice2]
> >>> 2018-08-30 00:07:45,779 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
> ISPN000094: Received new cluster view for channel ejb: [authservice1|1] (2)
> [authservice1, authservice2]
> >>> 2018-08-30 00:07:45,783 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
> ISPN000094: Received new cluster view for channel ejb: [authservice1|1] (2)
> [authservice1, authservice2]
> >>> 2018-08-30 00:39:50,071 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-4) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 00:39:50,071 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-2) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 00:39:50,071 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-1) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 00:39:50,071 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-6) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 00:39:50,078 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-5) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 19:41:38,553 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
> ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
> [authservice1]
> >>> 2018-08-30 19:41:43,562 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-3) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 19:41:43,563 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-5) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 19:41:43,563 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-7) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 19:41:43,563 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-2) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 19:41:43,563 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-1) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 21:16:23,261 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
> ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
> [authservice1]
> >>> 2018-08-30 21:16:23,264 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
> ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
> [authservice1]
> >>> 2018-08-30 21:16:23,330 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
> ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
> [authservice1]
> >>> 2018-08-30 21:16:23,330 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
> ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
> [authservice1]
> >>> 2018-08-30 21:16:23,344 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
> ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
> [authservice1]
> >>> 2018-08-30 21:17:36,994 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-2) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 21:17:36,995 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-4) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 21:17:36,994 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-1) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 21:17:36,995 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-6) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>> 2018-08-30 21:17:36,995 INFO
> [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
> thread 1-3) ISPN000094: Received new cluster view for channel ejb:
> [authservice2|1] (2) [authservice2, authservice1]
> >>>
> >>>
> >>>> Wiadomość napisana przez Rafael Weingärtner <
> rafaelweingartner at gmail.com <mailto:rafaelweingartner at gmail.com>> w dniu
> 31.08.2018, o godz. 11:52:
> >>>>
> >>>> Can you check if your cluster if being formed? You can grep the word
> "view" in Keycloak's logs.
> >>>>
> >>>> On Fri, Aug 31, 2018 at 3:16 AM, Piotr Wojcieszonek <
> p.wojcieszonek at bosmanager.com <mailto:p.wojcieszonek at bosmanager.com>>
> wrote:
> >>>> Yes I did:
> >>>>                 <distributed-cache name="sessions" mode="SYNC"
> owners="2"/>
> >>>>                 <distributed-cache name="authenticationSessions"
> mode="SYNC" owners="2"/>
> >>>>                 <distributed-cache name="offlineSessions" mode="SYNC"
> owners="2"/>
> >>>>                 <distributed-cache name="clientSessions" mode="SYNC"
> owners="2"/>
> >>>>                 <distributed-cache name="offlineClientSessions"
> mode="SYNC" owners="2"/>
> >>>>                 <distributed-cache name="loginFailures" mode="SYNC"
> owners="2”/>
> >>>>
> >>>> I have two node cluster.
> >>>>
> >>>>
> >>>>
> >>>>> Wiadomość napisana przez Rafael Weingärtner <
> rafaelweingartner at gmail.com <mailto:rafaelweingartner at gmail.com>> w dniu
> 31.08.2018, o godz. 03:01:
> >>>>>
> >>>>> Did you configure the "owners" parameter in the standalone-ha.xml
> with a value bigger than 1?
> >>>>>
> >>>>> On Thu, Aug 30, 2018 at 5:06 PM, Piotr Wojcieszonek <
> p.wojcieszonek at bosmanager.com <mailto:p.wojcieszonek at bosmanager.com>>
> wrote:
> >>>>> Hi,
> >>>>> I have problem with standalone cluster. After cluster start working
> it begin to print that kind of information to log file:
> >>>>>
> >>>>> 2018-08-30 22:04:33,725 WARN
> [org.infinispan.expiration.impl.ClusterExpirationManager]
> (expiration-thread--p22-t1) ISPN000026: Caught exception purging data
> container!: org.infinispan.util.concurrent.TimeoutException: Replication
> timeout for authservice1
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:827)
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$0(JGroupsTransport.java:628)
> >>>>>         at
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> >>>>>         at
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
> >>>>>         at
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> >>>>>         at
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:46)
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:17)
> >>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>>>>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> >>>>>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> >>>>>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> >>>>>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> >>>>>         at java.lang.Thread.run(Thread.java:748)
> >>>>>
> >>>>> 2018-08-30 22:04:39,966 ERROR
> [org.infinispan.interceptors.InvocationContextInterceptor] (Timer-2)
> ISPN000136: Error executing command PutKeyValueCommand, writing keys
> [task::ClearExpiredClientInitialAccessTokens]:
> org.infinispan.util.concurrent.TimeoutException: Replication timeout for
> authservice1
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:827)
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$0(JGroupsTransport.java:628)
> >>>>>         at
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> >>>>>         at
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
> >>>>>         at
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> >>>>>         at
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:46)
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:17)
> >>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>>>>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> >>>>>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> >>>>>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> >>>>>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> >>>>>         at java.lang.Thread.run(Thread.java:748)
> >>>>>
> >>>>> 2018-08-30 22:04:52,605 ERROR
> [org.infinispan.interceptors.InvocationContextInterceptor] (Timer-2)
> ISPN000136: Error executing command PutKeyValueCommand, writing keys
> [task::ClearExpiredClientInitialAccessTokens]:
> org.infinispan.util.concurrent.TimeoutException: Replication timeout for
> authservice1
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:827)
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$0(JGroupsTransport.java:628)
> >>>>>         at
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> >>>>>         at
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
> >>>>>         at
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> >>>>>         at
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:46)
> >>>>>         at
> org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:17)
> >>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>>>>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> >>>>>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> >>>>>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> >>>>>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> >>>>>         at java.lang.Thread.run(Thread.java:748)
> >>>>>
> >>>>> And in time it start to work very slow.
> >>>>> What can by a reason for that?
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> keycloak-user mailing list
> >>>>> keycloak-user at lists.jboss.org <mailto:keycloak-user at lists.jboss.org>
> >>>>> https://lists.jboss.org/mailman/listinfo/keycloak-user <
> https://lists.jboss.org/mailman/listinfo/keycloak-user>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Rafael Weingärtner
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Rafael Weingärtner
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Rafael Weingärtner
> >>
> >>
> >>
> >>
> >> --
> >> Rafael Weingärtner
> >
> >
> >
> >
> > --
> > Rafael Weingärtner
>
> _______________________________________________
> keycloak-user mailing list
> keycloak-user at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/keycloak-user


More information about the keycloak-user mailing list