[jboss-jira] [JBoss JIRA] (WFLY-4748) Singleton service fails to start after repetitive cluster split with "Failed to reach quorum of 1"

Tomas Hofman (JIRA) issues at jboss.org
Fri Jun 5 04:23:02 EDT 2015


Tomas Hofman created WFLY-4748:
----------------------------------

             Summary: Singleton service fails to start after repetitive cluster split with "Failed to reach quorum of 1"
                 Key: WFLY-4748
                 URL: https://issues.jboss.org/browse/WFLY-4748
             Project: WildFly
          Issue Type: Bug
          Components: Clustering
    Affects Versions: 10.0.0.Alpha2
            Reporter: Tomas Hofman
            Assignee: Tomas Hofman


When a cluster of two nodes with a deployed singleton service (e.g. the cluster-ha-singleton quickstart app) splits, merges, and splits again, one of the nodes fails to run the singleton service with the error message "WFLYCLSV0006: Failed to reach *quorum of 1* for jboss.quickstart.ha.singleton.default2 service. No singleton master will be elected." Note the "quorum of 1".

This only happens after the second and any subsequent split. After the first split, both nodes run the service correctly.

Analysis suggests that nodes are never added back to the service providers cache upon cluster merge, because CacheServiceProviderRegistrationFactory#membershipChanged() is never called with the 'merged' argument set to 'true'.

I presume that call should come from ChannelCommandDispatcherFactory#viewAccepted():

public void viewAccepted(View view) {
    // ...
    for (Listener listener: this.listeners) {
        listener.membershipChanged(oldNodes, newNodes, view instanceof MergeView); 
    } 
}

This method gets called, but the problem is that the 'listeners' list is empty, so no listener is actually notified.
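
To illustrate why an empty 'listeners' list would only show up on the second split, here is a rough, self-contained toy model. It is not WildFly code: the Node class, the split/merge/run helpers, and the assumption that one partition's copy of the providers cache (node A's) survives the merge are all hypothetical simplifications made for this sketch. Only the shape of the membershipChanged(oldNodes, newNodes, merged) call is taken from the snippet above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SplitMergeSketch {
    /** Same call shape as the listener invocation quoted in the report. */
    interface Listener {
        void membershipChanged(Set<String> oldNodes, Set<String> newNodes, boolean merged);
    }

    /** Hypothetical per-node state: cluster view, providers cache copy, listeners. */
    static class Node implements Listener {
        final String name;
        final List<Listener> listeners = new ArrayList<>();
        Set<String> view = new TreeSet<>();
        Set<String> providers = new TreeSet<>();

        Node(String name, boolean registerListener) {
            this.name = name;
            if (registerListener) listeners.add(this);
        }

        /** Simplified view change: drop departed providers, then notify listeners. */
        void viewAccepted(Set<String> newView, boolean mergeView) {
            Set<String> oldView = view;
            view = new TreeSet<>(newView);
            providers.retainAll(view);
            for (Listener l : listeners) l.membershipChanged(oldView, view, mergeView);
        }

        @Override
        public void membershipChanged(Set<String> oldNodes, Set<String> newNodes, boolean merged) {
            if (merged) providers.add(name); // re-register this node after a merge
        }

        boolean quorumReached(int quorum) { return providers.size() >= quorum; }
    }

    /** Each partition continues with its own copy of the providers cache. */
    static void split(Node a, Node b) {
        a.providers = new TreeSet<>(a.providers);
        b.providers = new TreeSet<>(b.providers);
        a.viewAccepted(new TreeSet<>(Arrays.asList(a.name)), false);
        b.viewAccepted(new TreeSet<>(Arrays.asList(b.name)), false);
    }

    /** Assumption of this sketch: node A's cache state wins the merge. */
    static void merge(Node a, Node b) {
        Set<String> shared = new TreeSet<>(a.providers);
        a.providers = shared;
        b.providers = shared;
        Set<String> both = new TreeSet<>(Arrays.asList(a.name, b.name));
        a.viewAccepted(both, true);
        b.viewAccepted(both, true);
    }

    /** Runs split, merge, split; returns {A reached quorum 1, B reached quorum 1}. */
    static boolean[] run(boolean registerListeners) {
        Node a = new Node("A", registerListeners);
        Node b = new Node("B", registerListeners);
        List<String> both = Arrays.asList("A", "B");
        a.view = new TreeSet<>(both);
        b.view = new TreeSet<>(both);
        a.providers = new TreeSet<>(both);
        b.providers = new TreeSet<>(both);
        split(a, b);
        merge(a, b);
        split(a, b);
        return new boolean[] { a.quorumReached(1), b.quorumReached(1) };
    }

    public static void main(String[] args) {
        System.out.println("listeners empty:      " + Arrays.toString(run(false)));
        System.out.println("listeners registered: " + Arrays.toString(run(true)));
    }
}
```

In this model, with no listeners registered, node B ends the second split with an empty provider set and so cannot reach a quorum of 1, while node A is unaffected. With the listener registered, both nodes re-add themselves on merge and both reach quorum after the second split. Whether the real cache merge behaves this way is an assumption, not something verified against the WildFly sources.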



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

