Hey guys,
The replication timeout occurs when Infinispan replicates data to the other
nodes (via so called commands) and the target node doesn't respond within a
certain timeout.
There might be different things that cause it, including:
- Long GC pauses (longer than replication timeout setting)
- Slow network
- High CPU constraints on the target node
- Wrong JGroups settings
- Target node crashes
- and even more
My suggestion is to try the following:
- increase remote timeout [1]
- Turn on GC logging and see if there's any correlation between the
replication errors and GC pauses
- Make sure you're using default JGroups configuration. If anything needs
adjusting, please adjust one thing at a time.
Please let me know how did it go.
Thanks,
Sebastian
[1] here's a configuration example
On Fri, Aug 31, 2018 at 12:30 PM Piotr Wojcieszonek <
p.wojcieszonek(a)bosmanager.com> wrote:
No, I don’t see any errors beside this mentioned earlier.
Stack trace in log file, and it’s look that if cluster work for some time
then it slow down dramatically in all operation that require write some
data like creating realm.
> Wiadomość napisana przez Rafael Weingärtner <rafaelweingartner(a)gmail.com>
w dniu 31.08.2018, o godz. 12:21:
>
> hmmm...Besides the stacktrace, did you see any other warning or error?
>
> On Fri, Aug 31, 2018 at 7:08 AM, Piotr Wojcieszonek <
p.wojcieszonek(a)bosmanager.com <mailto:p.wojcieszonek@bosmanager.com>>
wrote:
> This is Ubuntu 16.04 without SELinux.
>
>
>> Wiadomość napisana przez Rafael Weingärtner <
rafaelweingartner(a)gmail.com <mailto:rafaelweingartner@gmail.com>> w dniu
31.08.2018, o godz. 12:06:
>>
>> Then, there is only one things left in my mind. Did you configure
SElinux?
>>
>> On Fri, Aug 31, 2018 at 7:04 AM, Piotr Wojcieszonek <
p.wojcieszonek(a)bosmanager.com <mailto:p.wojcieszonek@bosmanager.com>>
wrote:
>> No, there is no firewall on host and both of them are in the same vlan.
>>
>>
>>> Wiadomość napisana przez Rafael Weingärtner <
rafaelweingartner(a)gmail.com <mailto:rafaelweingartner@gmail.com>> w dniu
31.08.2018, o godz. 12:00:
>>>
>>> Your cluster is being created. Everything seems fine there. I had a
problem once when the cluster was not being formed due to firewall problems.
>>> Is your firewall activated?
>>>
>>> On Fri, Aug 31, 2018 at 6:56 AM, Piotr Wojcieszonek <
p.wojcieszonek(a)bosmanager.com <mailto:p.wojcieszonek@bosmanager.com>>
wrote:
>>>
>>> 2018-08-30 00:07:43,048 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-2) ISPN000094: Received new cluster view for channel ejb:
[authservice1|0] (1) [authservice1]
>>> 2018-08-30 00:07:43,049 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-5) ISPN000094: Received new cluster view for channel ejb:
[authservice1|0] (1) [authservice1]
>>> 2018-08-30 00:07:43,048 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-3) ISPN000094: Received new cluster view for channel ejb:
[authservice1|0] (1) [authservice1]
>>> 2018-08-30 00:07:43,049 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-7) ISPN000094: Received new cluster view for channel ejb:
[authservice1|0] (1) [authservice1]
>>> 2018-08-30 00:07:43,048 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-6) ISPN000094: Received new cluster view for channel ejb:
[authservice1|0] (1) [authservice1]
>>> 2018-08-30 00:07:45,757 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
ISPN000094: Received new cluster view for channel ejb: [authservice1|1] (2)
[authservice1, authservice2]
>>> 2018-08-30 00:07:45,758 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
ISPN000094: Received new cluster view for channel ejb: [authservice1|1] (2)
[authservice1, authservice2]
>>> 2018-08-30 00:07:45,763 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
ISPN000094: Received new cluster view for channel ejb: [authservice1|1] (2)
[authservice1, authservice2]
>>> 2018-08-30 00:07:45,779 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
ISPN000094: Received new cluster view for channel ejb: [authservice1|1] (2)
[authservice1, authservice2]
>>> 2018-08-30 00:07:45,783 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
ISPN000094: Received new cluster view for channel ejb: [authservice1|1] (2)
[authservice1, authservice2]
>>> 2018-08-30 00:39:50,071 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-4) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 00:39:50,071 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-2) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 00:39:50,071 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-1) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 00:39:50,071 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-6) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 00:39:50,078 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-5) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 19:41:38,553 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
[authservice1]
>>> 2018-08-30 19:41:43,562 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-3) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 19:41:43,563 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-5) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 19:41:43,563 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-7) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 19:41:43,563 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-2) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 19:41:43,563 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-1) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 21:16:23,261 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
[authservice1]
>>> 2018-08-30 21:16:23,264 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
[authservice1]
>>> 2018-08-30 21:16:23,330 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
[authservice1]
>>> 2018-08-30 21:16:23,330 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
[authservice1]
>>> 2018-08-30 21:16:23,344 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-2)
ISPN000094: Received new cluster view for channel ejb: [authservice1|2] (1)
[authservice1]
>>> 2018-08-30 21:17:36,994 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-2) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 21:17:36,995 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-4) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 21:17:36,994 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-1) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 21:17:36,995 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-6) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>> 2018-08-30 21:17:36,995 INFO
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-3) ISPN000094: Received new cluster view for channel ejb:
[authservice2|1] (2) [authservice2, authservice1]
>>>
>>>
>>>> Wiadomość napisana przez Rafael Weingärtner <
rafaelweingartner(a)gmail.com <mailto:rafaelweingartner@gmail.com>> w dniu
31.08.2018, o godz. 11:52:
>>>>
>>>> Can you check if your cluster if being formed? You can grep the word
"view" in Keycloak's logs.
>>>>
>>>> On Fri, Aug 31, 2018 at 3:16 AM, Piotr Wojcieszonek <
p.wojcieszonek(a)bosmanager.com <mailto:p.wojcieszonek@bosmanager.com>>
wrote:
>>>> Yes I did:
>>>> <distributed-cache name="sessions"
mode="SYNC"
owners="2"/>
>>>> <distributed-cache
name="authenticationSessions"
mode="SYNC" owners="2"/>
>>>> <distributed-cache name="offlineSessions"
mode="SYNC"
owners="2"/>
>>>> <distributed-cache name="clientSessions"
mode="SYNC"
owners="2"/>
>>>> <distributed-cache
name="offlineClientSessions"
mode="SYNC" owners="2"/>
>>>> <distributed-cache name="loginFailures"
mode="SYNC"
owners="2”/>
>>>>
>>>> I have two node cluster.
>>>>
>>>>
>>>>
>>>>> Wiadomość napisana przez Rafael Weingärtner <
rafaelweingartner(a)gmail.com <mailto:rafaelweingartner@gmail.com>> w dniu
31.08.2018, o godz. 03:01:
>>>>>
>>>>> Did you configure the "owners" parameter in the
standalone-ha.xml
with a value bigger than 1?
>>>>>
>>>>> On Thu, Aug 30, 2018 at 5:06 PM, Piotr Wojcieszonek <
p.wojcieszonek(a)bosmanager.com <mailto:p.wojcieszonek@bosmanager.com>>
wrote:
>>>>> Hi,
>>>>> I have problem with standalone cluster. After cluster start working
it begin to print that kind of information to log file:
>>>>>
>>>>> 2018-08-30 22:04:33,725 WARN
[org.infinispan.expiration.impl.ClusterExpirationManager]
(expiration-thread--p22-t1) ISPN000026: Caught exception purging data
container!: org.infinispan.util.concurrent.TimeoutException: Replication
timeout for authservice1
>>>>> at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:827)
>>>>> at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$0(JGroupsTransport.java:628)
>>>>> at
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>>>>> at
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>>>>> at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>>>>> at
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
>>>>> at
org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:46)
>>>>> at
org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:17)
>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>> at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>> at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>> at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>> at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>
>>>>> 2018-08-30 22:04:39,966 ERROR
[org.infinispan.interceptors.InvocationContextInterceptor] (Timer-2)
ISPN000136: Error executing command PutKeyValueCommand, writing keys
[task::ClearExpiredClientInitialAccessTokens]:
org.infinispan.util.concurrent.TimeoutException: Replication timeout for
authservice1
>>>>> at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:827)
>>>>> at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$0(JGroupsTransport.java:628)
>>>>> at
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>>>>> at
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>>>>> at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>>>>> at
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
>>>>> at
org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:46)
>>>>> at
org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:17)
>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>> at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>> at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>> at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>> at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>
>>>>> 2018-08-30 22:04:52,605 ERROR
[org.infinispan.interceptors.InvocationContextInterceptor] (Timer-2)
ISPN000136: Error executing command PutKeyValueCommand, writing keys
[task::ClearExpiredClientInitialAccessTokens]:
org.infinispan.util.concurrent.TimeoutException: Replication timeout for
authservice1
>>>>> at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:827)
>>>>> at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$0(JGroupsTransport.java:628)
>>>>> at
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>>>>> at
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>>>>> at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>>>>> at
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
>>>>> at
org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:46)
>>>>> at
org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:17)
>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>> at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>> at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>> at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>> at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>
>>>>> And in time it start to work very slow.
>>>>> What can by a reason for that?
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> keycloak-user mailing list
>>>>> keycloak-user(a)lists.jboss.org
<mailto:keycloak-user@lists.jboss.org>
>>>>>
https://lists.jboss.org/mailman/listinfo/keycloak-user <
https://lists.jboss.org/mailman/listinfo/keycloak-user>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Rafael Weingärtner
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Rafael Weingärtner
>>>
>>>
>>>
>>>
>>> --
>>> Rafael Weingärtner
>>
>>
>>
>>
>> --
>> Rafael Weingärtner
>
>
>
>
> --
> Rafael Weingärtner
_______________________________________________
keycloak-user mailing list
keycloak-user(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/keycloak-user