[jboss-jira] [JBoss JIRA] (WFLY-11682) Clustered SLSB membership anomalies when all cluster members removed

Jörg Bäsner (Jira) issues at jboss.org
Thu Feb 7 11:10:03 EST 2019


     [ https://issues.jboss.org/browse/WFLY-11682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg Bäsner updated WFLY-11682:
-------------------------------
    Description: 
This description will be based on a 3 node cluster. Cluster node 1 and 2 are configured in the {{PROVIDER_URL}}, node 3 is not.

The client has a custom ClusterNodeSelector implementation that is printing the {{connectedNodes}} and the {{availableNodes}} and doing a random balancing.

As long as all nodes are up and running the client is calling EJBs in a balanced way.

When node1 is shut down, the client get the notification below:

{code}...
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-4) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-4) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node1)
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node1)
...
{code}

Then node2 is shut down. Again the client get the information, see:
{code}
...
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node2)
...
{code}

Finally node3 is being shut down. Now the client only get the following information:
{code}
...
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
...
{code}

This mean the _node3_ is not being informed about the fact that the last node of the cluster has been stopped.

>From this point on the client is always getting {{Caused by: java.net.ConnectException: Connection refused}}

Now node1 is started again, resulting in the following output for {{connectedNodes}} and the {{availableNodes}}:
{code}
...
INFO  (ThreadPoolTaskExecutor-1) [com.jboss.examples.ejb.CustomClusterNodeSelector] connectedNodes(1) '[node1]', availableNodes(2) '[node3, node1]'
...
{code}

  was:
This description will be based on a 3 node cluster. Cluster node 1 and 2 are configured in the {{PROVIDER_URL}}, node 3 is not.

The client has a custom ClusterNodeSelector implementation that is printing the {{connectedNodes}} and the {{availableNodes}} and doing a random balancing.

As long as all nodes are up and running the client is calling EJBs in a balanced way.

When node1 is shut down, the client get the notification below:

{code}...
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-4) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-4) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node1)
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node1)
...
{code}

Then node2 is shut down. Again the client get the information, see:
{code}
...
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node2)
...
{code}

Finally node3 is being shut down. Now the client only get the following information:
{code}
...
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
...
{code}

This mean the _node3_ is not being informed about the fact that the last node of the cluster has been stopped.

>From this point on the client is always getting {{Caused by: java.net.ConnectException: Connection refused}}

Now node1 is started again, resulting in the following output for {{connectedNodes}} and the {{availableNodes}}:
{code}
...
INFO  (ThreadPoolTaskExecutor-1) [com.jboss.examples.ejb.LBClusterNodeSelector] connectedNodes(1) '[node1]', availableNodes(2) '[node3, node1]'
...
{code}



> Clustered SLSB membership anomalies when all cluster members removed
> --------------------------------------------------------------------
>
>                 Key: WFLY-11682
>                 URL: https://issues.jboss.org/browse/WFLY-11682
>             Project: WildFly
>          Issue Type: Bug
>          Components: Clustering, EJB
>    Affects Versions: 15.0.1.Final
>         Environment: WildFly running in an n-node cluster with an EJB client sending requests even during the time the cluster is down.
>            Reporter: Jörg Bäsner
>            Assignee: Richard Achmatowicz
>            Priority: Major
>         Attachments: playground.zip
>
>
> This description will be based on a 3 node cluster. Cluster node 1 and 2 are configured in the {{PROVIDER_URL}}, node 3 is not.
> The client has a custom ClusterNodeSelector implementation that is printing the {{connectedNodes}} and the {{availableNodes}} and doing a random balancing.
> As long as all nodes are up and running the client is calling EJBs in a balanced way.
> When node1 is shut down, the client get the notification below:
> {code}...
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-4) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-4) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node1)
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node1)
> ...
> {code}
> Then node2 is shut down. Again the client get the information, see:
> {code}
> ...
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node2)
> ...
> {code}
> Finally node3 is being shut down. Now the client only get the following information:
> {code}
> ...
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> ...
> {code}
> This mean the _node3_ is not being informed about the fact that the last node of the cluster has been stopped.
> From this point on the client is always getting {{Caused by: java.net.ConnectException: Connection refused}}
> Now node1 is started again, resulting in the following output for {{connectedNodes}} and the {{availableNodes}}:
> {code}
> ...
> INFO  (ThreadPoolTaskExecutor-1) [com.jboss.examples.ejb.CustomClusterNodeSelector] connectedNodes(1) '[node1]', availableNodes(2) '[node3, node1]'
> ...
> {code}



--
This message was sent by Atlassian Jira
(v7.12.1#712002)



More information about the jboss-jira mailing list