[jboss-jira] [JBoss JIRA] (WFLY-3711) Topology updates of EJBClient ClusterContexts not being processed correctly after failover

Richard Achmatowicz (JIRA) issues at jboss.org
Thu Aug 7 13:56:30 EDT 2014


    [ https://issues.jboss.org/browse/WFLY-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991366#comment-12991366 ] 

Richard Achmatowicz commented on WFLY-3711:
-------------------------------------------

Some background on how these topology updates from cluster members are generated:

{noformat}
- RegistryCollector and Registry keep track of which clusters a node belongs to and of membership changes in those clusters
- RegistryCollector
  - the RegistryCollector holds a set of Registry objects and represents the *set* of clusters that a node belongs to
  - implemented by a service on the node, which holds the collection
  - the RegistryCollector has listeners:
    registryAdded
    registryRemoved 
- Registry
  - each Registry represents a cluster that the node is a member of and holds a Map<String,List<ClientMapping>> 
  - each Map.Entry represents a node in that cluster and its local set of ClientMapping elements (outgoing connections)
  - implemented by a distributed cache which holds a Map of node names to List<ClientMapping>  
  - CacheRegistry (the implementation) has three listeners:
    @TopologyChanged -> only processed on coordinator; calculate the set of old addresses; perform an invocation to remove them from all nodes; notify remove listeners on this node only
    @CacheEntryModified -> addedEntries(Collections.singletonMap(entry.getKey(), entry.getValue())) / updatedEntries(Collections.singletonMap(entry.getKey(), entry.getValue()))
    @CacheEntryRemoved -> removedEntries(Collections.singletonMap(entry.getKey(), entry.getValue()))

- cluster-related update events originate on the server side, due to changes to the caches underlying RegistryCollector and Registry
- these events are interpreted and propagated to the client side where they are used to update the ClientContexts
{noformat}
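
To make the listener relationships above concrete, here is a minimal sketch of the two levels of notification. It is not the WildFly implementation: RegistrySketch, EntryListener, CollectorListener and demo() are hypothetical names, the distributed cache is replaced by an in-memory map, and ClientMapping is reduced to a plain String.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical, simplified stand-ins for illustration; not the WildFly clustering API.
class RegistrySketch {

    /** Callbacks for membership changes within one cluster. */
    interface EntryListener {
        void addedEntries(Map<String, List<String>> added);
        void updatedEntries(Map<String, List<String>> updated);
        void removedEntries(Map<String, List<String>> removed);
    }

    /** Callbacks for changes to the set of clusters a node belongs to. */
    interface CollectorListener {
        void registryAdded(Registry registry);
        void registryRemoved(Registry registry);
    }

    /** One cluster: node name -> that node's client mappings (plain strings here). */
    static class Registry {
        final String cluster;
        private final Map<String, List<String>> entries = new ConcurrentHashMap<>();
        private final List<EntryListener> listeners = new CopyOnWriteArrayList<>();

        Registry(String cluster) { this.cluster = cluster; }

        void addListener(EntryListener listener) { listeners.add(listener); }

        void put(String node, List<String> mappings) {
            // A previously unknown key is an "added" event, otherwise an "updated" one.
            boolean isNew = entries.put(node, mappings) == null;
            Map<String, List<String>> entry = Map.of(node, mappings);
            for (EntryListener l : listeners) {
                if (isNew) l.addedEntries(entry); else l.updatedEntries(entry);
            }
        }

        void remove(String node) {
            List<String> old = entries.remove(node);
            if (old == null) return; // unknown node: nothing to report
            for (EntryListener l : listeners) l.removedEntries(Map.of(node, old));
        }
    }

    /** The set of clusters this node belongs to, keyed by cluster name. */
    static class RegistryCollector {
        private final Map<String, Registry> registries = new ConcurrentHashMap<>();
        private final List<CollectorListener> listeners = new CopyOnWriteArrayList<>();

        void addListener(CollectorListener listener) { listeners.add(listener); }

        void add(Registry registry) {
            registries.put(registry.cluster, registry);
            for (CollectorListener l : listeners) l.registryAdded(registry);
        }

        void remove(String cluster) {
            Registry registry = registries.remove(cluster);
            if (registry == null) return; // cluster was not registered
            for (CollectorListener l : listeners) l.registryRemoved(registry);
        }
    }

    /** Replays a small join/update/leave sequence and returns the callbacks in order. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        RegistryCollector collector = new RegistryCollector();
        collector.addListener(new CollectorListener() {
            public void registryAdded(Registry r) { events.add("registryAdded " + r.cluster); }
            public void registryRemoved(Registry r) { events.add("registryRemoved " + r.cluster); }
        });
        Registry ejb = new Registry("ejb");
        ejb.addListener(new EntryListener() {
            public void addedEntries(Map<String, List<String>> m) { events.add("added " + m.keySet()); }
            public void updatedEntries(Map<String, List<String>> m) { events.add("updated " + m.keySet()); }
            public void removedEntries(Map<String, List<String>> m) { events.add("removed " + m.keySet()); }
        });
        collector.add(ejb);
        ejb.put("node-0", List.of("127.0.0.1:8080"));
        ejb.put("node-1", List.of("127.0.0.1:8180"));
        ejb.put("node-1", List.of("127.0.0.1:8280"));
        ejb.remove("node-0");
        collector.remove("ejb");
        return events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The sketch only models the callback ordering on a single node. The coordinator-only handling of topology changes and the propagation to the client side are deliberately left out.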

The operation of these callbacks can be seen from the logs:

{noformat}
[nrla@lenovo surefire-reports]$ grep Registry org.jboss.as.test.clustering.cluster.ejb.remote.RemoteFailoverTestCase-SYNC-tcp-output.txt 
// node-0, node-1 start up and deploy clustered app
15:11:18,705 INFO  [stdout] (MSC service thread 1-8) Registry collector service started on node node-0
15:11:25,895 INFO  [stdout] (MSC service thread 1-9) Registry collector service started on node node-1
15:11:36,017 INFO  [stdout] (MSC service thread 1-2) Registry for cluster ejb added on node node-0
15:11:39,950 INFO  [stdout] (MSC service thread 1-10) Registry for cluster ejb added on node node-1

// node-0 undeploys clustered app
15:11:42,320 INFO  [stdout] (MSC service thread 1-2) Registry for cluster ejb removed on node node-0 
15:11:42,320 INFO  [stdout] (MSC service thread 1-2) Registry removed on node node-0  with (key) entries: [node-0, node-1] (same callback as previous message)
15:11:42,326 INFO  [stdout] (remote-thread-0) Registry entries removed on node node-1  for nodes: [node-0]

// node-0 deploys clustered app
15:11:43,180 INFO  [stdout] (remote-thread-0) Registry entries added on node node-1 for nodes: [node-0]
15:11:43,182 INFO  [stdout] (MSC service thread 1-12) Registry for cluster ejb added on node node-0
15:11:43,183 INFO  [stdout] (MSC service thread 1-12) Registry added on node node-0 with (key) entries: [node-0, node-1] (same callback as previous message)

// node-1 shuts down
15:11:48,644 INFO  [stdout] (MSC service thread 1-9) Registry for cluster ejb removed on node node-1
15:11:48,645 INFO  [stdout] (MSC service thread 1-9) Registry removed on node node-1  with (key) entries: [node-0, node-1] (same callback as previous message)
15:11:48,645 INFO  [stdout] (MSC service thread 1-11) Registry collector service stopped on node node-1
15:11:48,648 INFO  [stdout] (remote-thread-0) Registry entries removed on node node-0  for nodes: [node-1]

// node-1 starts up
15:11:55,788 INFO  [stdout] (MSC service thread 1-12) Registry collector service started on node node-1
15:11:57,365 INFO  [stdout] (remote-thread-0) Registry entries added on node node-0 for nodes: [node-1]
15:11:57,369 INFO  [stdout] (MSC service thread 1-16) Registry for cluster ejb added on node node-1

// node-0, node-1 shut down
15:12:12,374 INFO  [stdout] (MSC service thread 1-16) Registry for cluster ejb removed on node node-0
15:12:12,375 INFO  [stdout] (MSC service thread 1-16) Registry removed on node node-0  with (key) entries: [node-0, node-1] (same callback as previous message)
15:12:12,829 INFO  [stdout] (MSC service thread 1-12) Registry for cluster ejb removed on node node-1
15:12:12,829 INFO  [stdout] (MSC service thread 1-12) Registry removed on node node-1  with (key) entries: [node-1]
15:12:12,829 INFO  [stdout] (MSC service thread 1-12) Registry removed on node node-1  with (key) entries: [node-1]
15:12:12,830 INFO  [stdout] (MSC service thread 1-12) Registry removed on node node-1  with (key) entries: [node-1]
15:12:12,830 INFO  [stdout] (MSC service thread 1-12) Registry removed on node node-1  with (key) entries: [node-1]
{noformat}



> Topology updates of EJBClient ClusterContexts not being processed correctly after failover
> ------------------------------------------------------------------------------------------
>
>                 Key: WFLY-3711
>                 URL: https://issues.jboss.org/browse/WFLY-3711
>             Project: WildFly
>          Issue Type: Bug
>      Security Level: Public(Everyone can see) 
>          Components: Clustering
>    Affects Versions: 9.0.0.Beta1
>            Reporter: Richard Achmatowicz
>            Assignee: Richard Achmatowicz
>
> ClusterContexts are used by EJBClient to keep track of the current set of nodes in a cluster, so that if an EJBClient invocation fails on one node, it may fail over to another node in the same cluster. The ClusterContext is made up of ClusterNodeManagers, which are responsible for setting up the connections between the EJBClient and the nodes in the cluster.
> Cluster topology updates are sent to registered EJBClients whenever the cluster topology changes (nodes join or leave a cluster). These topology updates are processed on the client side by ClusterTopologyUpdateHandler and are used to update the current contents of the associated ClusterContext held on the client.
> The current handling of topology updates does not correctly add and remove ClusterNodeManagers in the cluster context. Rather than checking whether a new ClusterNodeManager really needs to be added, it adds a ClusterNodeManager for every node in the received topology update, which leads to many unnecessary EJBReceivers and their channels being created.
> The logs show that up to 18 cluster node manager instances may be created, have their EJBReceivers registered and channels to the remote node created, when only one node has been added to the cluster.
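
The check that the description says is missing can be sketched as a set difference between the nodes the client already manages and the nodes named in an update. This is an illustration only, not the jboss-ejb-client code: TopologyDiff and between() are hypothetical names, nodes are identified by name alone, and the removal half assumes the update carries the full cluster membership (if updates are incremental, only the "already known" check on additions applies).

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of jboss-ejb-client.
final class TopologyDiff {
    /** Nodes in the update that have no manager yet. */
    final Set<String> toAdd;
    /** Managed nodes that no longer appear in the update. */
    final Set<String> toRemove;

    private TopologyDiff(Set<String> toAdd, Set<String> toRemove) {
        this.toAdd = toAdd;
        this.toRemove = toRemove;
    }

    /**
     * Compares the node names the client already manages with the node names
     * in a received update, so that only genuinely new nodes get a manager.
     */
    static TopologyDiff between(Set<String> known, Set<String> reported) {
        Set<String> toAdd = new LinkedHashSet<>(reported);
        toAdd.removeAll(known);      // skip nodes that already have a manager
        Set<String> toRemove = new LinkedHashSet<>(known);
        toRemove.removeAll(reported); // assumes 'reported' is the full membership
        return new TopologyDiff(toAdd, toRemove);
    }

    public static void main(String[] args) {
        Set<String> known = new LinkedHashSet<>(List.of("node-0"));
        Set<String> reported = new LinkedHashSet<>(List.of("node-0", "node-1"));
        TopologyDiff diff = between(known, reported);
        System.out.println("add " + diff.toAdd + ", remove " + diff.toRemove);
    }
}
```

With this kind of check, an update that names node-0 and node-1 while node-0 is already managed results in one new manager for node-1, instead of a new manager per listed node.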



--
This message was sent by Atlassian JIRA
(v6.2.6#6264)

