Michal Toth commented on WFWIP-11:
----------------------------------
The related issue is fixed. Who should close this item?
[Artemis upgrade] Regression in replicated HA tests
---------------------------------------------------
Key: WFWIP-11
URL: https://issues.jboss.org/browse/WFWIP-11
Project: WildFly WIP
Issue Type: Bug
Components: Artemis
Reporter: Erich Duda
Assignee: Martyn Taylor
Priority: Blocker
Labels: feature-branch-blocker
*Scenario*
* Two servers are configured as a replicated Live-Backup pair
* The Live server is killed
* The test waits until the Backup server activates
* The Live server is restarted
* The test expects the Backup server to deactivate and the Live server to become active
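The scenario steps above can be sketched as a small polling harness. This is only an illustrative outline, not the code of the actual test suite: the `live` and `backup` handles and their `kill()`, `start()` and `is_active()` methods are hypothetical, as are the timeout values.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float = 60.0,
               interval: float = 0.5) -> bool:
    """Poll predicate until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def check_failback(live, backup, timeout: float = 60.0,
                   interval: float = 0.5) -> bool:
    """Run the failover/failback scenario; return True if Live ends up active.

    `live` and `backup` are hypothetical handles exposing kill(), start()
    and is_active(); they do not correspond to any real test suite API.
    """
    # Kill the Live server and wait for the Backup to take over.
    live.kill()
    if not wait_until(backup.is_active, timeout, interval):
        raise RuntimeError("Backup did not activate after Live was killed")

    # Restart the Live server and wait for failback to complete.
    live.start()
    return wait_until(lambda: live.is_active() and not backup.is_active(),
                      timeout, interval)
```

With a harness of this shape, the regression described below would show up as `check_failback` returning `False` on some runs.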
*Reality:*
Sometimes the Live server does not become active. The log shows that it was
synchronized with the Backup but, based on the quorum vote, was restarted as a backup.
*Customer impact:* The failback feature in replicated HA is unstable.
{code:title=Live}
12:26:51,011 INFO [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for
server ActiveMQServerImpl::serverUUID=null) AMQ221109: Apache ActiveMQ Artemis Backup
Server version 2.5.0-SNAPSHOT [null] started, waiting live to fail before it gets active
12:26:51,013 INFO [org.apache.activemq.artemis.core.client] (AMQ119000: Activation for
server ActiveMQServerImpl::serverUUID=null) AMQ211002: Started EPOLL Netty Connector
version unknown to localhost:9080
12:26:52,344 INFO [org.apache.activemq.artemis.core.server] (Thread-6
(ActiveMQ-client-netty-threads)) AMQ221024: Backup server
ActiveMQServerImpl::serverUUID=a9e2f7d1-0742-11e8-bd92-54ee7553e6a7 is synchronized with
live-server.
12:26:52,386 INFO [org.apache.activemq.artemis.core.client] (Thread-5
(ActiveMQ-client-netty-threads)) AMQ211002: Started EPOLL Netty Connector version unknown
to localhost:9080
12:26:52,423 INFO [org.apache.activemq.artemis.core.server] (Thread-5
(ActiveMQ-client-netty-threads)) AMQ221070: Restarting as backup based on quorum vote
results.
{code}
{code:title=Backup}
12:26:51,435 INFO [org.apache.activemq.artemis.core.server] (Thread-152) AMQ221025:
Replication: sending
AIOSequentialFile:/home/eduda/Projects/messaging-testsuite/server2/jboss-eap/standalone/data/../../../../hornetq-journal-B/journal/activemq-data-11.amq
(size=10 485 760) to replica.
12:26:51,766 INFO [org.apache.activemq.artemis.core.server] (Thread-152) AMQ221025:
Replication: sending
AIOSequentialFile:/home/eduda/Projects/messaging-testsuite/server2/jboss-eap/standalone/data/../../../../hornetq-journal-B/journal/activemq-data-4.amq
(size=10 485 760) to replica.
12:26:51,884 INFO [org.apache.activemq.artemis.core.server] (Thread-152) AMQ221025:
Replication: sending NIOSequentialFile
/home/eduda/Projects/messaging-testsuite/server2/jboss-eap/standalone/data/../../../../hornetq-journal-B/bindings/activemq-bindings-4.bindings
(size=1 048 576) to replica.
12:26:51,890 INFO [org.apache.activemq.artemis.core.server] (Thread-152) AMQ221025:
Replication: sending NIOSequentialFile
/home/eduda/Projects/messaging-testsuite/server2/jboss-eap/standalone/data/../../../../hornetq-journal-B/bindings/activemq-bindings-2.bindings
(size=1 048 576) to replica.
12:26:52,388 WARN [org.apache.activemq.artemis.core.client] (Thread-4
(ActiveMQ-client-global-threads)) AMQ212037: Connection failure has been detected:
AMQ119015: The connection was disconnected because of server shutdown [code=DISCONNECTED]
12:26:52,391 WARN [org.jboss.activemq.artemis.wildfly.integration.recovery] (Thread-4
(ActiveMQ-client-global-threads)) being disconnected for server shutdown:
ActiveMQDisconnectedException[errorType=DISCONNECTED
message=AMQ119015: The connection was disconnected because of server shutdown]
at
org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl$CloseRunnable.run(ClientSessionFactoryImpl.java:996)
[artemis-core-client-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
at
org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
[artemis-commons-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
at
org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
[artemis-commons-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
at
org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
[artemis-commons-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[rt.jar:1.8.0_151]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[rt.jar:1.8.0_151]
at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_151]
12:26:52,396 WARN [org.apache.activemq.artemis.core.server] (default I/O-9) AMQ222061:
Client connection failed, clearing up resources for session
c12b0cb8-0742-11e8-8a8a-54ee7553e6a7
12:26:52,397 WARN [org.apache.activemq.artemis.core.server] (default I/O-9) AMQ222107:
Cleared up resources for session c12b0cb8-0742-11e8-8a8a-54ee7553e6a7
12:26:52,397 WARN [org.apache.activemq.artemis.core.server] (default I/O-9) AMQ222061:
Client connection failed, clearing up resources for session
c12da4cb-0742-11e8-8a8a-54ee7553e6a7
12:26:52,397 WARN [org.apache.activemq.artemis.core.server] (default I/O-9) AMQ222107:
Cleared up resources for session c12da4cb-0742-11e8-8a8a-54ee7553e6a7
12:26:52,412 INFO [org.wildfly.extension.messaging-activemq] (MSC service thread 1-1)
WFLYMSGAMQ0006: Unbound messaging object to jndi name
java:jboss/DefaultJMSConnectionFactory
12:26:52,412 INFO [org.jboss.as.connector.deployment] (MSC service thread 1-8)
WFLYJCA0011: Unbound JCA ConnectionFactory [java:/JmsXA]
12:26:52,412 INFO [org.wildfly.extension.messaging-activemq] (ServerService Thread Pool
-- 74) WFLYMSGAMQ0006: Unbound messaging object to jndi name
java:jboss/exported/jms/RemoteConnectionFactory
12:26:52,413 INFO [org.wildfly.extension.messaging-activemq] (ServerService Thread Pool
-- 107) WFLYMSGAMQ0006: Unbound messaging object to jndi name java:/ConnectionFactory
12:26:52,416 INFO [org.apache.activemq.artemis.ra] (ServerService Thread Pool -- 108)
AMQ151003: resource adaptor stopped
12:26:52,458 WARN [org.apache.activemq.artemis.core.server] (default I/O-13) AMQ222092:
Connection to the backup node failed, removing replication now:
ActiveMQRemoteDisconnectException[errorType=REMOTE_DISCONNECT message=null]
at
org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl.connectionDestroyed(RemotingServiceImpl.java:553)
at
org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptor$Listener.connectionDestroyed(NettyAcceptor.java:768)
at
org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler.channelInactive(ActiveMQChannelHandler.java:78)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at
io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
at
io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at
io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1354)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at
io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:917)
at
io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822)
at org.xnio.nio.WorkerThread.safeRun(WorkerThread.java:612)
[xnio-nio-3.5.4.Final.jar:3.5.4.Final]
at org.xnio.nio.WorkerThread.run(WorkerThread.java:479)
[xnio-nio-3.5.4.Final.jar:3.5.4.Final]
{code}
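The failure signature in the Live log is an AMQ221024 entry (synchronized with the live server) followed by an AMQ221070 entry (restarting as backup based on quorum vote results). A minimal sketch of a scanner that flags this pattern in a server log might look like the following. The function name and the way log lines are supplied are assumptions for illustration; only the two message IDs and the timestamp format come from the logs above.

```python
import re
from typing import Iterable, List, Optional

# Message IDs taken from the Live log in this report.
SYNCHRONIZED = "AMQ221024"       # "... is synchronized with live-server."
RESTART_AS_BACKUP = "AMQ221070"  # "Restarting as backup based on quorum vote results."

# Log entries start with a timestamp such as "12:26:52,423".
_TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2},\d{3})\s")


def find_failed_failbacks(lines: Iterable[str]) -> List[Optional[str]]:
    """Return timestamps of AMQ221070 entries that follow an AMQ221024 entry.

    Entries may be wrapped over several lines, so the most recent timestamp
    seen is attributed to any message ID found on a continuation line.
    """
    hits: List[Optional[str]] = []
    synchronized = False
    current_ts: Optional[str] = None
    for line in lines:
        match = _TIMESTAMP.match(line)
        if match:
            current_ts = match.group(1)
        if SYNCHRONIZED in line:
            synchronized = True
        elif RESTART_AS_BACKUP in line and synchronized:
            hits.append(current_ts)
            synchronized = False
    return hits
```

For example, `find_failed_failbacks(open("server.log"))` (with a hypothetical log path) would return one timestamp per occurrence of the pattern, which could help count how often the regression appears across repeated test runs.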