[jboss-jira] [JBoss JIRA] (WFLY-5979) Colocated Artemis backup does not failback

Jeff Mesnil (JIRA) issues at jboss.org
Tue Jan 12 09:21:00 EST 2016


Jeff Mesnil created WFLY-5979:
---------------------------------

             Summary: Colocated Artemis backup does not failback
                 Key: WFLY-5979
                 URL: https://issues.jboss.org/browse/WFLY-5979
             Project: WildFly
          Issue Type: Bug
          Components: JMS
    Affects Versions: 10.0.0.CR5
            Reporter: Jeff Mesnil
            Assignee: Jeff Mesnil


Use case:

* Start 2 WildFly servers with the standalone-full-ha.xml configuration and an additional ha-policy for the messaging-activemq server:

{noformat}
                <replication-colocated request-backup="true">
                    <master check-for-live-server="true"/>
                </replication-colocated>
{noformat}

The default configuration for the colocated slave is (allow-failback=true, restart-backup=true).
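
For reference, a sketch of what the same ha-policy would presumably look like with those slave defaults written out explicitly. The <slave> element and its two attribute values are inferred from the defaults named above; they are not copied from the configuration actually used in this test:

{noformat}
                <replication-colocated request-backup="true">
                    <master check-for-live-server="true"/>
                    <!-- assumed equivalent to the implicit defaults -->
                    <slave allow-failback="true" restart-backup="true"/>
                </replication-colocated>
{noformat}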

Scenario:
1. Start the 2 servers
2. Kill server #1
=> server #2 must activate its backup
3. Restart server #1
=> server #1 checks for a live server
=> server #2 must fail back and restart server #1's backup
=> server #1 is the live server

Currently at step (3), the activated backup server on #2 does not fail back: server #1 starts as a live server, and both use the same nodeID.

{noformat}
* start server #1

14:54:12,151 INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 71) AMQ221007: Server is now live
14:54:12,151 INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 71) AMQ221001: Apache ActiveMQ Artemis Message Broker version 1.1.0.wildfly-010 [nodeID=cd605092-b933-11e5-ba21-cb5e13c1ea67]

* Server #2

14:56:27,926 INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 71) AMQ221007: Server is now live
14:56:27,927 INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 71) AMQ221001: Apache ActiveMQ Artemis Message Broker version 1.1.0.wildfly-010 [nodeID=198586c5-b933-11e5-a199-4dbe14260a82]
...
14:56:32,245 INFO  [org.apache.activemq.artemis.core.server] (default I/O-5) AMQ221049: Activating Replica for node: cd605092-b933-11e5-ba21-cb5e13c1ea67

* Server #1 also creates a replica for server #2

14:56:32,969 INFO  [org.apache.activemq.artemis.core.server] (default I/O-13) AMQ221049: Activating Replica for node: 198586c5-b933-11e5-a199-4dbe14260a82
...
14:56:32,738 INFO  [org.apache.activemq.artemis.core.server] (Thread-7 (ActiveMQ-client-netty-threads-1157501840)) AMQ221024: Backup server ActiveMQServerImpl::serverUUID=cd605092-b933-11e5-ba21-cb5e13c1ea67 is synchronized with live-server.

* Kill server #1 -> colocated backup on server #2 becomes live

14:57:21,718 INFO  [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for server ActiveMQServerImpl::serverUUID=null) AMQ221037: ActiveMQServerImpl::serverUUID=cd605092-b933-11e5-ba21-cb5e13c1ea67 to become 'live'

* Restart server #1

15:12:05,755 INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 71) AMQ221001: Apache ActiveMQ Artemis Message Broker version 1.1.0.wildfly-010 [nodeID=cd605092-b933-11e5-ba21-cb5e13c1ea67]

* At this stage both server #1 and #2 behave like live servers for cd605092

4:58:09,893 WARN  [org.apache.activemq.artemis.core.client] (activemq-discovery-group-thread-dg-group1) AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=cd605092-b933-11e5-ba21-cb5e13c1ea67
{noformat}
