Jeff Mesnil commented on WFLY-5979:
-----------------------------------
The issue still occurs with Artemis 1.1.0.wildfly-015.
Colocated Artemis backup does not failback
------------------------------------------
Key: WFLY-5979
URL: https://issues.jboss.org/browse/WFLY-5979
Project: WildFly
Issue Type: Bug
Components: JMS
Affects Versions: 10.0.0.CR5
Reporter: Jeff Mesnil
Assignee: Jeff Mesnil
Use case:
* Start 2 WildFly servers with the standalone-full-ha.xml configuration and an additional
ha-policy for the messaging-activemq server:
{noformat}
<replication-colocated request-backup="true">
<master check-for-live-server="true"/>
</replication-colocated>
{noformat}
The default configuration for the colocated slave is (allow-failback=true,
restart-backup=true).
Scenario:
1. Start the 2 servers
2. Kill server #1
=> server #2 must activate its backup
3. Restart server #1
=> server #1 checks for a live server
=> server #2 must failback and restart server #1's backup
=> server #1 is the live server
Currently at step (3), the activated server on #2 does not failback: server #1 is
started as a live server and both use the same nodeID.
{noformat}
* start server #1
14:54:12,151 INFO [org.apache.activemq.artemis.core.server] (ServerService Thread Pool
-- 71) AMQ221007: Server is now live
14:54:12,151 INFO [org.apache.activemq.artemis.core.server] (ServerService Thread Pool
-- 71) AMQ221001: Apache ActiveMQ Artemis Message Broker version 1.1.0.wildfly-010
[nodeID=cd605092-b933-11e5-ba21-cb5e13c1ea67]
* Server #2
14:56:27,926 INFO [org.apache.activemq.artemis.core.server] (ServerService Thread Pool
-- 71) AMQ221007: Server is now live
14:56:27,927 INFO [org.apache.activemq.artemis.core.server] (ServerService Thread Pool
-- 71) AMQ221001: Apache ActiveMQ Artemis Message Broker version 1.1.0.wildfly-010
[nodeID=198586c5-b933-11e5-a199-4dbe14260a82]
...
14:56:32,245 INFO [org.apache.activemq.artemis.core.server] (default I/O-5) AMQ221049:
Activating Replica for node: cd605092-b933-11e5-ba21-cb5e13c1ea67
* Server #1 also creates a replica for server #2
14:56:32,969 INFO [org.apache.activemq.artemis.core.server] (default I/O-13) AMQ221049:
Activating Replica for node: 198586c5-b933-11e5-a199-4dbe14260a82
...
14:56:32,738 INFO [org.apache.activemq.artemis.core.server] (Thread-7
(ActiveMQ-client-netty-threads-1157501840)) AMQ221024: Backup server ActiveMQServerImpl::
serverUUID=cd605092-b933-11e5-ba21-cb5e13c1ea67 is synchronized with live-server.
* Kill server #1 -> colocated backup on server #2 becomes live
14:57:21,718 INFO [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for
server ActiveMQServerImpl::serverUUID=null) AMQ221037:
ActiveMQServerImpl::serverUUID=cd605092-b933-11e5-ba21-cb5e13c1ea67 to become
'live'
* Restart server #1
15:12:05,755 INFO [org.apache.activemq.artemis.core.server] (ServerService Thread Pool
-- 71) AMQ221001: Apache ActiveMQ Artemis Message Broker version 1.1.0.wildfly-010
[nodeID=cd605092-b933-11e5-ba21-cb5e13c1ea67]
* At this stage both server #1 and #2 behave like live servers for cd605092
4:58:09,893 WARN [org.apache.activemq.artemis.core.client]
(activemq-discovery-group-thread-dg-group1) AMQ212034: There are more than one servers on
the network broadcasting the same node id. You will see this message exactly once (per
node) if a node is restarted, in which case it can be safely ignored. But if it is logged
continuously it means you really do have more than one node on the same network active
concurrently with the same node id. This could occur if you have
a backup node active at the same time as its live node.
nodeID=cd605092-b933-11e5-ba21-cb5e13c1ea67
{noformat}
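The duplicate-live condition shown in the log above can be checked mechanically. The following is a minimal sketch, not part of WildFly or Artemis: it assumes the logs of each server have been captured as plain text after step (3), and it only relies on the two message codes visible in the excerpt (AMQ221001 with a nodeID, and AMQ221037 with a serverUUID becoming live). The function name `duplicate_live_nodes` and the server labels are hypothetical. It ignores timestamps and shutdowns, so it is only a coarse hint that two servers have claimed the same node ID.

```python
import re
from collections import defaultdict

# A node ID as printed in the Artemis log excerpt (lowercase hex UUID).
UUID = r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}"

# Messages taken from the log excerpt that indicate a server went live:
#   AMQ221001: broker banner followed by [nodeID=<uuid>]
#   AMQ221037: <...serverUUID=<uuid>> to become 'live'
# DOTALL and \s* tolerate the line wrapping seen in the excerpt.
LIVE_PATTERNS = [
    re.compile(r"AMQ221001:.*?nodeID=(" + UUID + r")", re.DOTALL),
    re.compile(r"AMQ221037:\s*\S*?serverUUID=(" + UUID + r")"),
]


def duplicate_live_nodes(logs):
    """Return {node_id: [servers]} for node IDs claimed live by 2+ servers.

    `logs` maps a (hypothetical) server label to its log text.
    """
    owners = defaultdict(set)
    for server, text in logs.items():
        for pattern in LIVE_PATTERNS:
            for node_id in pattern.findall(text):
                owners[node_id].add(server)
    return {n: sorted(s) for n, s in owners.items() if len(s) > 1}
```

Fed with the log of server #1 (including its restart) and the log of server #2, this should report cd605092-b933-11e5-ba21-cb5e13c1ea67 for both servers, which matches the AMQ212034 warning.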