[keycloak-user] Standalone HA tokens not immediately shared among nodes

D V dv at glyphy.com
Mon Sep 17 12:28:26 EDT 2018


Hmm ... maybe the lb is pinging the port? I'm running dockercloud/haproxy,
which autodetects open ports. However, I'm excluding port 7600 so that it
doesn't try to route application requests to JGroups ports. The
SocketTimeoutException only happens once at startup, though. I don't see it
later when I start running auth tests.
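
For reference, the port exclusion is done with the EXCLUDE_PORTS setting that
dockercloud/haproxy (if I'm reading its docs right) picks up from the linked
application containers. Roughly like this, with illustrative container/image
names rather than my exact setup:

    # set on each Keycloak container that haproxy links to, so the
    # JGroups port never becomes a routed backend
    docker run -d --name keycloak1 \
        -e EXCLUDE_PORTS=7600 \
        jboss/keycloak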

Thanks for the pointer to the Infinispan statistics query. I ran it for both
nodes and saved the results in the "ispn-cluster-query" subdirectory of the
previously shared folder
https://drive.google.com/drive/folders/1AiyLtTXu2AxEbVBdR-5kfJLxqoYladBn?usp=sharing
:
- "start" prefix is for the output of the query right after starting the
nodes. Node1 starts first, then node2.
- "first-auth" is the initial grant_type=password auth. In this set of
tests it was done on node1.
- "refresh-auth" is the subsequent failing grant_type=refresh_token . It's
successful on node1 and failing on node2.
- "post-node2-auth" is after grant_type=password auth is executed on node2
(which brings the cluster in sync).
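
For reference, the two auths above are plain direct-grant token requests made
against each node directly (not through the lb). Roughly, with placeholder
realm/client/user names, plus client_secret if the client is confidential:

    # initial grant_type=password auth against node1
    curl -s -X POST http://node1:8080/auth/realms/myrealm/protocol/openid-connect/token \
        -d grant_type=password -d client_id=my-client \
        -d username=testuser -d password=testpass

    # subsequent grant_type=refresh_token auth against node2, using the
    # refresh_token returned by the first call
    curl -s -X POST http://node2:8080/auth/realms/myrealm/protocol/openid-connect/token \
        -d grant_type=refresh_token -d client_id=my-client \
        -d refresh_token=$REFRESH_TOKEN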

I couldn't spot any issues in the output with my untrained eyes. I wonder,
should the statistics be pulled from the sessions distributed cache as
well? Is that the one that would be consulted during
grant_type=refresh_token auth?
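
If so, I'd presumably run the same kind of query against it, e.g. (assuming the
default cache names from standalone-ha.xml):

    /subsystem=infinispan/cache-container=keycloak/distributed-cache=sessions:query
    /subsystem=infinispan/cache-container=keycloak/distributed-cache=sessions:read-resource(include-runtime=true)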

Thanks,
DV

On Mon, Sep 17, 2018 at 4:27 AM Sebastian Laskawiec <slaskawi at redhat.com>
wrote:

> So the only thing that looks suspicious is this:
> JGRP000006: failed accepting connection from peer:
> java.net.SocketTimeoutException: Read timed out
>
> It might indicate that some other application tried to connect to Keycloak
> on port 7600 and immediately disconnected. That leads to a question about your
> environment: are you sure you are looking at the right application servers?
> Perhaps some other applications (WildFly, for example, since Keycloak is
> built on WildFly) are trying to join the cluster.
>
> However, if the answer is yes, the next thing to check is the Infinispan
> statistics, over JMX or the JBoss CLI. Here's a sample query you may use:
> /subsystem=infinispan/cache-container=keycloak/replicated-cache=*:query
> Then have a look at the number of entries and the number of entries in the
> cluster.
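>
> For example, to dump the runtime statistics for all caches in the keycloak
> container (just a sketch, the exact runtime attributes vary a bit between
> versions):
>
>     /subsystem=infinispan/cache-container=keycloak/replicated-cache=*:read-resource(include-runtime=true)
>     /subsystem=infinispan/cache-container=keycloak/distributed-cache=*:read-resource(include-runtime=true)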
>
> @Marek Posolda <mposolda at redhat.com>, perhaps this rings a bell for you?
> ISPN seems fine here (at least from the logs and symptoms DV is describing).
>
> On Thu, Sep 13, 2018 at 6:53 PM D V <dv at glyphy.com> wrote:
>
>> Weird indeed. Yes, the logs indicate two nodes. I've uploaded the full
>> start-up logs here:
>> https://drive.google.com/drive/folders/1AiyLtTXu2AxEbVBdR-5kfJLxqoYladBn?usp=sharing
>> . I started node 1, let it settle, then started node 2. You can see that
>> node1 starts with just itself, but later node2 joins the cluster and caches
>> are rebalanced.
>>
>> As for the experiment, I tried waiting for a few minutes after both nodes
>> started in case there's some synchronization delay somewhere, but it didn't
>> change the outcome.
>>
>> Thanks,
>> DV
>>
>> On Wed, Sep 12, 2018 at 3:22 AM Sebastian Laskawiec <slaskawi at redhat.com>
>> wrote:
>>
>>> Hmmm this sounds a bit weird... like there was some delay in the
>>> communication path.
>>>
>>> Could you please look through your logs for lines containing the
>>> "view" keyword? Are there two nodes, as expected? How do the timestamps relate
>>> to your experiment?
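>>>
>>> For example (log path assumes a default standalone distribution; with
>>> Docker you may need to grep the container logs instead):
>>>
>>>     grep -i "view" standalone/log/server.log
>>>     # or: docker logs <keycloak-container> 2>&1 | grep -i "view"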
>>>
>>

