[keycloak-user] Keycloak Domain TCP Clustering of Sessions in AWS not auto-failing over to node

JTK jonesy at sydow.org
Tue Aug 27 17:57:29 EDT 2019


Thanks,

I've read over numerous articles, including the ones you listed, and I've
still been unable to determine why they are not clustering.

When I run this command I get the following output:
/profile=full-ha/subsystem=jgroups/channel=ee:write-attribute(name=stack,value=tcpping)
{
    "outcome" => "success",
    "result" => undefined,
    "server-groups" => {"auth-server-group" => {"host" => {
        "dev-master" => {"dev-master" => {"response" => {"outcome" =>
"success"}}},
        "dev--slave1" => {"dev-slave1" => {"response" => {
            "outcome" => "success",
            "result" => undefined
        }}}
    }}}
}

Is that the normal output for the slave?
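
To double-check that the write actually took effect, I assume a plain
read-attribute against the same address would show it (a sketch, using my
full-ha profile):

/profile=full-ha/subsystem=jgroups/channel=ee:read-attribute(name=stack)

If that comes back as "tcpping" for the profile, I'm assuming the attribute
itself is fine and the problem is elsewhere.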

Here is some other information from the logs:
From the Master Node:
[org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 50)
dev-master: no members discovered after 3004 ms: creating cluster as first
member
This was with the Slave1 node up and running:
[Host Controller] 21:35:53,349 INFO  [org.jboss.as.host.controller] (Host
Controller Service Threads - 2) WFLYHC0148: Connected to master host
controller at remote://10.10.10.77:9999

On Master:
[org.infinispan.factories.GlobalComponentRegistry] (MSC service thread 1-2)
ISPN000128: Infinispan version: Infinispan 'Infinity Minus ONE +2'
9.4.8.Final
[Server:dev-master] 21:36:02,081 INFO
 [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-4) ISPN000078: Starting JGroups channel ee
[Server:dev-master] 21:36:02,086 INFO
 [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-2) ISPN000078: Starting JGroups channel ee
[Server:dev-master] 21:36:02,087 INFO
 [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-1) ISPN000078: Starting JGroups channel ee
[Server:dev-master] 21:36:02,089 INFO  [org.infinispan.CLUSTER] (MSC
service thread 1-4) ISPN000094: Received new cluster view for channel ee:
[dev-master|0] (1) [dev-master]
[Server:dev-master] 21:36:02,090 INFO
 [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-3) ISPN000078: Starting JGroups channel ee
[Server:dev-master] 21:36:02,091 INFO  [org.infinispan.CLUSTER] (MSC
service thread 1-1) ISPN000094: Received new cluster view for channel ee:
[dev-master|0] (1) [dev-master]
[Server:dev-master] 21:36:02,091 INFO  [org.infinispan.CLUSTER] (MSC
service thread 1-2) ISPN000094: Received new cluster view for channel ee:
[dev-master|0] (1) [dev-master]
[Server:dev-master] 21:36:02,091 INFO  [org.infinispan.CLUSTER] (MSC
service thread 1-3) ISPN000094: Received new cluster view for channel ee:
[dev-master|0] (1) [dev-master]
[Server:dev-master] 21:36:02,104 INFO
 [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-2) ISPN000079: Channel ee local address is dev-master, physical
addresses are [10.10.10.77:7600]
[Server:dev-master] 21:36:02,129 INFO
 [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-1) ISPN000079: Channel ee local address is dev-master, physical
addresses are [10.10.10.77:7600]
[Server:dev-master] 21:36:02,149 INFO
 [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-4) ISPN000079: Channel ee local address is dev-master, physical
addresses are [10.10.10.77:7600]
[Server:dev-master] 21:36:02,151 INFO
 [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-3) ISPN000079: Channel ee local address is dev-master, physical
addresses are [10.10.10.77:7600]
[Server:dev-master] 21:36:02,296 INFO
 [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-2) ISPN000078: Starting JGroups channel ee
[Server:dev-master] 21:36:02,297 INFO  [org.infinispan.CLUSTER] (MSC
service thread 1-2) ISPN000094: Received new cluster view for channel ee:
[dev-master|0] (1) [dev-master]
[Server:dev-master] 21:36:02,325 INFO
 [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service
thread 1-2) ISPN000079: Channel ee local address is dev-master, physical
addresses are [10.10.10.77:7600]

This is the relevant configuration from domain.xml on the master:
                <channels default="ee">
                    <channel name="ee" stack="tcp"/>
                    <channel name="ee" stack="tcpping"/>
                </channels>
                <stacks>
                    <stack name="tcp">
                        <transport type="TCP" socket-binding="jgroups-tcp"/>
                        <protocol type="MERGE3"/>
                        <protocol type="FD_SOCK"/>
                        <protocol type="FD_ALL"/>
                        <protocol type="VERIFY_SUSPECT"/>
                        <protocol type="pbcast.NAKACK2"/>
                        <protocol type="UNICAST3"/>
                        <protocol type="pbcast.STABLE"/>
                        <protocol type="pbcast.GMS"/>
                        <protocol type="MFC"/>
                        <protocol type="FRAG3"/>
                    </stack>
                    <stack name="tcpping">
                        <transport type="TCP" socket-binding="jgroups-tcp"/>
                        <protocol type="org.jgroups.protocols.TCPPING">
                            <property name="initial_hosts">
                                ${jboss.cluster.tcp.initial_hosts}
                            </property>
                            <property name="port_range">
                                0
                            </property>
                        </protocol>
                        <protocol type="MERGE3"/>
                        <protocol type="FD_SOCK"/>
                        <protocol type="FD_ALL"/>
                        <protocol type="VERIFY_SUSPECT"/>
                        <protocol type="pbcast.NAKACK2"/>
                        <protocol type="UNICAST3"/>
                        <protocol type="pbcast.STABLE"/>
                        <protocol type="pbcast.GMS"/>
                        <protocol type="MFC"/>
                        <protocol type="FRAG3"/>
                    </stack>

    <server-groups>
        <server-group name="auth-server-group" profile="full-ha">
            <jvm name="default">
                <heap size="64m" max-size="512m"/>
            </jvm>
            <socket-binding-group ref="ha-sockets"/>
            <system-properties>
                <property name="jboss.cluster.tcp.initial_hosts"
value="10.10.10.77[7600],10.10.11.27[7600]"/>
            </system-properties>
        </server-group>
    </server-groups>
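
To rule out the initial_hosts property not reaching the server JVMs at all,
I assume something like this would dump the resolved system properties on a
running server (a sketch; I'm not certain of the exact resource path):

/host=dev-master/server=dev-master/core-service=platform-mbean/type=runtime:read-attribute(name=system-properties)

If jboss.cluster.tcp.initial_hosts shows up there with both addresses, the
property at least made it into the process.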

This is a question for which I find conflicting information in the
Red Hat/JBoss/WildFly documentation. This is the host.xml file from the
Master. I've had both the master and the slave listed here; which one do I
need, or do I need both? The same question applies to configuring the
Slave's host.xml file:

    <servers>
        <server name="dev-slave1" group="auth-server-group"
auto-start="true">
            <socket-bindings port-offset="0"/>
        </server>
    </servers>

The same question applies to host-master.xml on the Master and
host-slave.xml on the Slave. This is from host-master.xml on the Master (my
assumption about what each file should contain is sketched after this
snippet):
    <servers>
        <server name="dev-sentinel-master" group="auth-server-group"
auto-start="true">
            <socket-bindings port-offset="0"/>
        </server>
    </servers>
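
My working assumption, and part of why I'm asking, is that each host's file
should only define the servers that run on that particular host, so roughly
this (a sketch, using the server names as they appear in my logs):

On the Master (host.xml / host-master.xml):
    <servers>
        <server name="dev-master" group="auth-server-group" auto-start="true">
            <socket-bindings port-offset="0"/>
        </server>
    </servers>

On the Slave (host.xml / host-slave.xml):
    <servers>
        <server name="dev-slave1" group="auth-server-group" auto-start="true">
            <socket-bindings port-offset="0"/>
        </server>
    </servers>

Is that right, or should both servers be listed on the domain controller?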

I've included a screenshot showing the WildFly Management Console with both
servers up and green in the auth-server-group of the full-ha profile:
https://i.imgur.com/8g124Ss.png

I know I'm just missing something small, and I'm not getting any errors in
the logs. Is there any way to get more TRACE or DEBUG output related to
clustering?
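
I'm guessing something along these lines would turn up the relevant
categories in domain mode (a sketch against my full-ha profile; I'm not sure
these are the right or only loggers):

/profile=full-ha/subsystem=logging/logger=org.jgroups:add(level=TRACE)
/profile=full-ha/subsystem=logging/logger=org.infinispan:add(level=DEBUG)

If there's a better way to see what the discovery protocol is doing, I'm all
ears.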

Thanks!


On Tue, Aug 27, 2019 at 5:09 AM Sebastian Laskawiec <slaskawi at redhat.com>
wrote:

>
>
> On Mon, Aug 26, 2019 at 5:32 PM JTK <jonesy at sydow.org> wrote:
>
>> I have two nodes set up in a cluster using TCP port 7600 and I see them
>> join the cluster in the logs.
>> On Master: [Host Controller] 15:07:18,293 INFO
>>  [org.jboss.as.domain.controller] (Host Controller Service Threads - 7)
>> WFLYHC0019: Registered remote slave host "dev-slave1", JBoss Keycloak
>> 6.0.1
>> (WildFly 8.0.0.Final)
>> On Slave: [Host Controller] 15:03:12,603 INFO
>>  [org.jboss.as.host.controller] (Host Controller Service Threads - 3)
>> WFLYHC0148: Connected to master host controller at remote://
>> 10.10.10.77:9999
>>
>> In the WildFly admin panel I see the server group: auth-server-group which
>> is ha and then I see both servers in the group and they are both green.
>>
>> I've set owners to 2 on the distributed caches in domain.xml, so they
>> should be sharing session information:
>>                     <distributed-cache name="sessions" owners="2"/>
>>                     <distributed-cache name="authenticationSessions"
>> owners="2"/>
>>                     <distributed-cache name="offlineSessions" owners="2"/>
>>                     <distributed-cache name="clientSessions" owners="2"/>
>>                     <distributed-cache name="offlineClientSessions"
>> owners="2"/>
>>                     <distributed-cache name="loginFailures" owners="2"/>
>>                     <distributed-cache name="actionTokens" owners="2">
>>
>> Here are the logs on the master showing that a new cluster view has been
>> received:
>> 2019-08-26 15:03:19,776 INFO  [org.infinispan.CLUSTER] (MSC service thread
>> 1-1) ISPN000094: Received new cluster view for channel ejb: [dev-master|0]
>> (1) [dev-master]
>> 2019-08-26 15:03:19,779 INFO  [org.infinispan.CLUSTER] (MSC service thread
>> 1-3) ISPN000094: Received new cluster view for channel ejb: [dev-master|0]
>> (1) [dev-master]
>> 2019-08-26 15:03:19,780 INFO  [org.infinispan.CLUSTER] (MSC service thread
>> 1-2) ISPN000094: Received new cluster view for channel ejb: [dev-master|0]
>> (1) [dev-master]
>> 2019-08-26 15:03:19,780 INFO  [org.infinispan.CLUSTER] (MSC service thread
>> 1-4) ISPN000094: Received new cluster view for channel ejb: [dev-master|0]
>> (1) [dev-master]
>> 2019-08-26 15:03:19,875 INFO  [org.infinispan.CLUSTER] (MSC service thread
>> 1-1) ISPN000094: Received new cluster view for channel ejb: [dev-master|0]
>> (1) [dev-master]
>>
>> And on the slave:
>> 2019-08-26 15:07:29,567 INFO  [org.infinispan.CLUSTER] (MSC service thread
>> 1-2) ISPN000094: Received new cluster view for channel ejb: [dev-slave1|0]
>> (1) [dev-slave1]
>> 2019-08-26 15:07:29,572 INFO  [org.infinispan.CLUSTER] (MSC service thread
>> 1-3) ISPN000094: Received new cluster view for channel ejb: [dev-slave1|0]
>> (1) [dev-slave1]
>> 2019-08-26 15:07:29,572 INFO  [org.infinispan.CLUSTER] (MSC service thread
>> 1-4) ISPN000094: Received new cluster view for channel ejb: [dev-slave1|0]
>> (1) [dev-slave1]
>> 2019-08-26 15:07:29,574 INFO  [org.infinispan.CLUSTER] (MSC service thread
>> 1-1) ISPN000094: Received new cluster view for channel ejb: [dev-slave1|0]
>> (1) [dev-slave1]
>> 2019-08-26 15:07:29,635 INFO  [org.infinispan.CLUSTER] (MSC service thread
>> 1-3) ISPN000094: Received new cluster view for channel ejb: [dev-slave1|0]
>> (1) [dev-slave1]
>>
>
> This definitely doesn't look right. The view id (which increases
> monotonically) is 0, which means this is an initial view and none of the
> new members have joined. Clearly, the discovery protocol is not configured
> properly and both nodes are in separate (singleton) clusters.
>
>
>> I believe I read somewhere that I was supposed to see the master and slave
>> together in the logs and not just the master or the slave. Maybe this is
>> my issue, but I don't know how to resolve it.
>>
>> I can't use multi-cast as it's disabled in AWS and almost all cloud
>> providers.
>>
>
> The easiest option is to use TCPPING. However, it requires you to put all
> nodes IPs in its configuration [1]. There are other options as well, e.g.
> S3 Ping [2] and its rewritten (and much better) version - Native S3 Ping
> [3].
>
> You may also be interested in using JDBC_PING. Please have a look at our
> blog posts [4][5].
>
> [1] http://jgroups.org/manual4/index.html#TCPPING_Prot
> [2] http://jgroups.org/manual4/index.html#_s3_ping
> [3] https://github.com/jgroups-extras/native-s3-ping
> [4] https://www.keycloak.org/2019/05/keycloak-cluster-setup.html
> [5] https://www.keycloak.org/2019/08/keycloak-jdbc-ping.html
>
>
>>
>> When I launch the master and let it come up, then launch the slave I can
>> see all the traffic for the session on the master. As soon as I stop the
>> master, the slave is looking for the master, but when I click on the
>> website, it just hangs waiting for a connection and then eventually logs
>> me
>> out, and I end up logging back in, and now I'm on the slave node. The
>> shared sessions are not happening. Is there something else I need to do or
>> set?
>>
>
> It looks like a consequence of the JGroups discovery issue. Please try to
> fix the clustering problem and then see if this one appears again.
>
>
>>
>> I have this setup in my domain.xml configuration as well:
>>         <server-group name="auth-server-group" profile="ha">
>>             <jvm name="default">
>>                 <heap size="64m" max-size="512m"/>
>>             </jvm>
>>             <socket-binding-group ref="ha-sockets"/>
>>             <system-properties>
>>                 <property name="jboss.cluster.tcp.initial_hosts"
>> value="10.10.10.77[7600],10.10.10.27[7600]"/>
>>             </system-properties>
>>         </server-group>
>>
>> In my host.xml on the slave I have this setup to reach back to the master
>> as the domain controller
>>     <domain-controller>
>>         <remote protocol="remote" host="${jboss.domain.master.address}"
>> port="${jboss.domain.master.port:9999}" security-realm="ManagementRealm"/>
>>    </domain-controller>
>>
>> Any help would be appreciated
>>
>

