[JBoss JIRA] (WFLY-11682) Clustered SLSB membership anomalies when all cluster members removed
by Richard Achmatowicz (Jira)
[ https://issues.jboss.org/browse/WFLY-11682?page=com.atlassian.jira.plugin... ]
Richard Achmatowicz edited comment on WFLY-11682 at 2/11/19 11:55 AM:
----------------------------------------------------------------------
Managed to recreate the error with Jorg's reproducer.
It looks as though nodes which are kicked out of the cluster do not get a chance to send updates to the client before being kicked out, so the last node to leave has no opportunity to advise the client of its leaving. This is why the client believes the last node to leave is still alive. In the attached logs, node3 sees the removal of the client mapping entries for node1 and node2, but does not receive notification for its own client mapping entries.
The server side code has changed a lot due to the new EJBClient/Elytron/Remoting implementation, and some EJB client related features which used to be there (in VersionOneChannelProtocolHandler) were not ported over (to AssociationImpl), specifically the EJB client related responses to suspend and resume. These should be added back in.
This is one place where it would be possible to send a notification to a client if the last node was going down (set a flag indicating we are the last node and we are being suspended; if the server is not resumed, use the flag to send a message before the EJBServerChannel connections to the clients are shut down).
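To make the flag idea concrete, here is a minimal sketch. It is not WildFly code: the class name, the ClientChannel interface and the three callbacks (onSuspend, onResume, onChannelsClosing) are hypothetical stand-ins for wherever AssociationImpl would hook into suspend/resume and channel shutdown, and the member count is assumed to be supplied by the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of the "last node suspended" flag; none of these names are WildFly APIs. */
final class LastNodeSuspendSketch {

    /** Stand-in for a connection to one EJB client (assumption, not EJBServerChannel). */
    interface ClientChannel {
        void sendNodeRemoval(String cluster, String node);
    }

    private final String cluster;
    private final String node;
    private final List<ClientChannel> channels = new ArrayList<>();
    private final AtomicBoolean lastNodeSuspended = new AtomicBoolean(false);

    LastNodeSuspendSketch(String cluster, String node) {
        this.cluster = cluster;
        this.node = node;
    }

    void register(ClientChannel channel) {
        channels.add(channel);
    }

    /** Suspend: remember whether this node is the only remaining member (count includes this node). */
    void onSuspend(int remainingMembers) {
        lastNodeSuspended.set(remainingMembers <= 1);
    }

    /** Resume: the server stays up, so no removal message must be sent later. */
    void onResume() {
        lastNodeSuspended.set(false);
    }

    /** Called just before the client connections are shut down. */
    void onChannelsClosing() {
        // Send the removal only if we were suspended as the last node and never resumed.
        if (lastNodeSuspended.getAndSet(false)) {
            for (ClientChannel channel : channels) {
                channel.sendNodeRemoval(cluster, node);
            }
        }
    }

    public static void main(String[] args) {
        LastNodeSuspendSketch sketch = new LastNodeSuspendSketch("ejb", "node3");
        sketch.register((c, n) -> System.out.println("node removal for (cluster, node) = (" + c + ", " + n + ")"));
        sketch.onSuspend(1);        // node3 is the last member and is being suspended
        sketch.onChannelsClosing(); // not resumed, so the client is told before the channel closes
    }
}
```

Clearing the flag on resume is what keeps a suspend/resume cycle from producing a spurious removal message on a later, unrelated shutdown.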
> Clustered SLSB membership anomalies when all cluster members removed
> --------------------------------------------------------------------
>
> Key: WFLY-11682
> URL: https://issues.jboss.org/browse/WFLY-11682
> Project: WildFly
> Issue Type: Bug
> Components: Clustering, EJB
> Affects Versions: 15.0.1.Final
> Environment: WildFly running in an n-node cluster with an EJB client sending requests even during the time the cluster is down.
> Reporter: Jörg Bäsner
> Assignee: Richard Achmatowicz
> Priority: Major
> Attachments: node1.txt, node12.txt, node2.txt, node3.txt, playground.zip
>
>
> This description is based on a 3-node cluster. Cluster nodes 1 and 2 are configured in the {{PROVIDER_URL}}; node 3 is not.
> The client has a custom ClusterNodeSelector implementation that prints the {{connectedNodes}} and the {{availableNodes}} and balances randomly.
> As long as all nodes are up and running, the client calls EJBs in a balanced way.
> When node1 is shut down, the client gets the notification below:
> {code}
> ...
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-4) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-4) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node1)
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node1)
> ...
> {code}
> Then node2 is shut down. Again the client gets the notification:
> {code}
> ...
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node2)
> ...
> {code}
> Finally node3 is shut down. Now the client only gets the following information:
> {code}
> ...
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
> ...
> {code}
> This means the client is not informed that _node3_, the last node of the cluster, has been stopped.
> From this point on the client always gets {{Caused by: java.net.ConnectException: Connection refused}}
> Now node1 is started again, resulting in the following output for {{connectedNodes}} and {{availableNodes}}:
> {code}
> ...
> INFO (ThreadPoolTaskExecutor-1) [com.jboss.examples.ejb.CustomClusterNodeSelector] connectedNodes(1) '[node1]', availableNodes(2) '[node3, node1]'
> ...
> {code}
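For readers without the reproducer at hand, the following is a rough sketch of what a selector like the one described could look like, and why the stale node3 entry matters. It is an assumption, not the code from playground.zip: the local NodeSelector interface only mirrors the shape of org.jboss.ejb.client.ClusterNodeSelector so that the example compiles on its own, and the selection policy (uniform random over the available nodes, which are assumed to be non-empty) is a guess at "random balancing".

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of a logging, randomly balancing node selector. */
final class RandomNodeSelectorSketch {

    /** Local stand-in mirroring the shape of org.jboss.ejb.client.ClusterNodeSelector. */
    interface NodeSelector {
        String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes);
    }

    static final class RandomSelector implements NodeSelector {
        private final Random random;

        RandomSelector(Random random) {
            this.random = random;
        }

        @Override
        public String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes) {
            // Print both views of the cluster, similar to the INFO line in the report.
            System.out.printf("connectedNodes(%d) '%s', availableNodes(%d) '%s'%n",
                    connectedNodes.length, Arrays.toString(connectedNodes),
                    availableNodes.length, Arrays.toString(availableNodes));
            // Uniform random choice over everything the client believes is available.
            return availableNodes[random.nextInt(availableNodes.length)];
        }
    }

    public static void main(String[] args) {
        NodeSelector selector = new RandomSelector(new Random());
        // State reported after node1 restarts: node3 is still listed although it is down.
        String chosen = selector.selectNode("ejb",
                new String[] {"node1"}, new String[] {"node3", "node1"});
        System.out.println("selected " + chosen);
    }
}
```

With the reported state, such a selector can pick node3 even though that node is down, because the client never received a removal message for it.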
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
[JBoss JIRA] (WFLY-11682) Clustered SLSB membership anomalies when all cluster members removed
by Richard Achmatowicz (Jira)
[ https://issues.jboss.org/browse/WFLY-11682?page=com.atlassian.jira.plugin... ]
Richard Achmatowicz updated WFLY-11682:
---------------------------------------
Attachment: (was: Notes.txt)
[JBoss JIRA] (WFLY-11682) Clustered SLSB membership anomalies when all cluster members removed
by Richard Achmatowicz (Jira)
[ https://issues.jboss.org/browse/WFLY-11682?page=com.atlassian.jira.plugin... ]
Richard Achmatowicz updated WFLY-11682:
---------------------------------------
Attachment: Notes.txt
node12.txt
node3.txt
node2.txt
node1.txt
[JBoss JIRA] (WFCORE-4309) Value validator for 'host-context-map' attribute of 'server-ssl-sni-context' resource
by Diana Vilkolakova (Jira)
[ https://issues.jboss.org/browse/WFCORE-4309?page=com.atlassian.jira.plugi... ]
Diana Vilkolakova reassigned WFCORE-4309:
-----------------------------------------
Assignee: Diana Vilkolakova
> Value validator for 'host-context-map' attribute of 'server-ssl-sni-context' resource
> -------------------------------------------------------------------------------------
>
> Key: WFCORE-4309
> URL: https://issues.jboss.org/browse/WFCORE-4309
> Project: WildFly Core
> Issue Type: Bug
> Components: Security
> Affects Versions: 7.0.0.Final
> Reporter: Jan Stourac
> Assignee: Diana Vilkolakova
> Priority: Minor
>
> There is no validation of the keys of the 'host-context-map' attribute. The values that represent 'server-ssl-contexts' are validated, but the host matching part is not. For example, writing the following value is possible:
> {code}
> /subsystem=elytron/server-ssl-sni-context=serverSslSniCtx:write-attribute(name=host-context-map,value={"\\?.example.com"=validSslContext,"..example.com"="validSslContext", "\\*\\*.example.com"=validSslContext})
> {code}
> {code}
> "\\?.example.com"
> "..example.com"
> "\\*\\*.example.com"
> {code}
> even though these are invalid host name matchers IMHO. It would be nice to identify them and report them to the user immediately during the configuration attempt.
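As an illustration of the kind of key check being requested, here is a small sketch. The rule it enforces is an assumption made for the example, not Elytron's actual matching grammar: an optional leading "*." followed by dot-separated labels made of letters, digits and inner hyphens. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical key validator; the accepted syntax is an assumed rule, not Elytron's grammar. */
final class HostMatcherValidatorSketch {

    // One DNS-style label: letters and digits, hyphens allowed only in the middle.
    private static final String LABEL = "[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?";

    // Optional "*." wildcard prefix, then one or more labels separated by single dots.
    private static final Pattern HOST_MATCHER =
            Pattern.compile("(?:\\*\\.)?" + LABEL + "(?:\\." + LABEL + ")*");

    static boolean isValid(String key) {
        return key != null && HOST_MATCHER.matcher(key).matches();
    }

    public static void main(String[] args) {
        // The first three keys are the ones from the report, written as Java string literals.
        String[] keys = {"\\?.example.com", "..example.com", "\\*\\*.example.com",
                         "*.example.com", "www.example.com"};
        for (String key : keys) {
            System.out.println(key + " -> " + isValid(key));
        }
    }
}
```

Under this assumed rule the three keys from the report are rejected, while a plain host name or a single leading wildcard is accepted.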