[jboss-jira] [JBoss JIRA] (WFLY-10392) Regression in Remote JCA scenario with JDBC store after Artemis upgrade

Fri May 18 06:35:00 EDT 2018

    [ https://issues.jboss.org/browse/WFLY-10392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13578825#comment-13578825 ] 

Erich Duda commented on WFLY-10392:
-----------------------------------

[~fnigro] I looked at the test case that fails because of duplicate messages, and I found where the duplicates are created.

* Server 2 and server 4 are in a cluster
* Server 4 has an active consumer, so server 2 redistributes messages to server 4
* Message A is sent by ClusterConnectionBridge to server 4
* Server 4 receives the message, sends an acknowledgment back to server 2, and then sends the message to the receiver
* Server 2 receives the acknowledgment from server 4, but at the same time it is restarted by the test
* The restart cancels the redistribution of message A
* After server 2 is restarted, it sends message A to server 4 again

This behavior is correct. I compared it with a working Artemis version, which behaved the same way up to this point. The difference appears later, on the server 4 side. The working version of Artemis receives message A on server 4, sends back an acknowledgment, and then drops the message because its ID is already in the duplicate ID cache.

However, the broken version of Artemis does not detect the duplicate and sends message A to the receiver again. The reason is that the {{_AMQ_BRIDGE_DUP}} property of message A differs before and after the restart of server 2.

{code}
_AMQ_BRIDGE_DUP=[C656 A37B 5A6A 11E8 AFF0 FA16 3ECA B4BE 0000 0000 0000 039C)]]
_AMQ_BRIDGE_DUP=[E613 7C4A 5A6A 11E8 A102 FA16 3ECA B4BE 0000 0000 0000 039C)]]
{code}
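
A byte-by-byte comparison of the two values makes the difference easier to see. The following is a small Python sketch, not part of the test suite. The helper names ({{parse_dup_id}}, {{differing_offsets}}) are hypothetical, and the split into a 16-byte prefix followed by an 8-byte suffix is an assumption inferred only from the shape of the two dumps above.

```python
import re

# The two property dumps exactly as they appear in the comment above.
BEFORE = "_AMQ_BRIDGE_DUP=[C656 A37B 5A6A 11E8 AFF0 FA16 3ECA B4BE 0000 0000 0000 039C)]]"
AFTER = "_AMQ_BRIDGE_DUP=[E613 7C4A 5A6A 11E8 A102 FA16 3ECA B4BE 0000 0000 0000 039C)]]"


def parse_dup_id(line):
    """Extract the hex groups after '=' and return them as raw bytes."""
    value = line.split("=", 1)[1]
    groups = re.findall(r"[0-9A-Fa-f]{4}", value)
    return bytes.fromhex("".join(groups))


def differing_offsets(a, b):
    """Return the byte offsets at which the two IDs differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


before = parse_dup_id(BEFORE)
after = parse_dup_id(AFTER)

# Assumption: 16-byte prefix followed by an 8-byte suffix.
PREFIX_LEN = 16

print(len(before), len(after))                       # 24 24
print(differing_offsets(before, after))              # [0, 1, 2, 3, 8, 9]
print(before[PREFIX_LEN:] == after[PREFIX_LEN:])     # True
print(int.from_bytes(before[PREFIX_LEN:], "big"))    # 924 (0x039C)
```

Under that assumption, the trailing 8 bytes (0x039C) are identical in both values, and every changed byte lies inside the 16-byte prefix.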

It seems that writing the message to the database and reading it back breaks the integrity of the message.
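
The effect on server 4 can be illustrated with a toy model of duplicate detection. This is a hedged sketch in Python, not Artemis code: {{DuplicateIdCache}} and {{accept}} are hypothetical names, and the only behavior modeled is that a message is dropped when its duplicate ID has been seen before.

```python
class DuplicateIdCache:
    """Toy stand-in for a broker-side duplicate cache (hypothetical, not Artemis code)."""

    def __init__(self):
        self._seen = set()

    def accept(self, dup_id):
        """Return True if the message should be delivered, False if it is a duplicate."""
        if dup_id in self._seen:
            return False
        self._seen.add(dup_id)
        return True


# Duplicate IDs of message A, taken from the two dumps in the comment above.
ID_BEFORE_RESTART = bytes.fromhex("C656A37B5A6A11E8AFF0FA163ECAB4BE000000000000039C")
ID_AFTER_RESTART = bytes.fromhex("E6137C4A5A6A11E8A102FA163ECAB4BE000000000000039C")

# Working case: the resent message carries the same ID, so it is dropped.
cache = DuplicateIdCache()
working = [cache.accept(ID_BEFORE_RESTART), cache.accept(ID_BEFORE_RESTART)]

# Failing case: the resent message carries a different ID, so it is delivered twice.
cache = DuplicateIdCache()
broken = [cache.accept(ID_BEFORE_RESTART), cache.accept(ID_AFTER_RESTART)]

print(working)  # [True, False]
print(broken)   # [True, True]
```

Any check keyed on the ID alone will let the second copy through once the ID changes, regardless of the message content.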

> Regression in Remote JCA scenario with JDBC store after Artemis upgrade
> -----------------------------------------------------------------------
>
>                 Key: WFLY-10392
>                 URL: https://issues.jboss.org/browse/WFLY-10392
>             Project: WildFly
>          Issue Type: Bug
>          Components: JMS
>            Reporter: Erich Duda
>            Assignee: Jeff Mesnil
>            Priority: Critical
>
> After the Artemis upgrade to 1.5.5.jbossorg-011 (WFLY-10139), I can see a regression in the Remote JCA test case when the JDBC persistent store is used. The issue is not present if the Artemis file-based journal is used.
> When I removed the commits related to JDBC HA from the Artemis upgrade, the test passed with both the JDBC and the file-based store.
> *Remote JCA scenario:*
> * There are 4 WildFly servers
> * Servers 1 and 3 are used as messaging brokers; they are called JMS servers
> * Servers 2 and 4 have MDBs and an RA configured to connect to the JMS servers; they are called MDB servers
> * An external standalone producer sends messages to InQueue on server 1
> * MDBs on the MDB servers receive messages from InQueue and send them to OutQueue
> * An external standalone receiver receives messages from OutQueue on server 3
> * During this scenario, server 1 is killed and restarted several times
> *Expectation:* All messages sent by the standalone producer are received by the standalone receiver. There are no lost or duplicated messages.
> *Reality:* After some kills of server 1, the message flow coming from the standalone producer breaks and the receiver does not receive any messages within the specified timeout.
> *Critical* priority was chosen because the regression has been detected only in this particular test case so far. We have run only the nightly testing scope. Once we run the full testing, we will know more about the impact of this issue.
> *Technical details:*
> At some point the following log messages start to appear in the test log. They seem to be related to a communication failure among the Artemis brokers, which breaks the message flow.
> {code}
> 21:55:51,724 WARN  [org.apache.activemq.artemis.core.server] (Thread-1 (ActiveMQ-client-global-threads)) AMQ222139: MessageFlowRecordImpl [nodeID=e53da514-5953-11e8-910a-fa163e48a89a, connector=TransportConfiguration(name=connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=10080&host=rhel7-large-58597, queueName=sf.my-cluster.e53da514-5953-11e8-910a-fa163e48a89a, queue=QueueImpl[name=sf.my-cluster.e53da514-5953-11e8-910a-fa163e48a89a, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=d80bf5bd-5953-11e8-ac10-fa163e48a89a]]@756a727a, isClosed=false, reset=true]::Remote queue binding jms.queue.DLQe53da514-5953-11e8-910a-fa163e48a89a has already been bound in the post office. Most likely cause for this is you have a loop in your cluster due to cluster max-hops being too large or you have multiple cluster connections to the same nodes using overlapping addresses
> {code}
> {code}
> 21:55:53,078 ERROR [org.apache.activemq.artemis.core.server] (Thread-6 (ActiveMQ-client-global-threads)) AMQ224037: cluster connection Failed to handle message: java.lang.IllegalStateException: Cannot find binding for jms.queue.InQueuee53da514-5953-11e8-910a-fa163e48a89a on ClusterConnectionImpl at 14989488[nodeUUID=d80bf5bd-5953-11e8-ac10-fa163e48a89a, connector=TransportConfiguration(name=connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=8080&host=rhel7-large-58597, address=jms, server=ActiveMQServerImpl::serverUUID=d80bf5bd-5953-11e8-ac10-fa163e48a89a]
>         at org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerCreated(ClusterConnectionImpl.java:1294) [artemis-server-1.5.5.jbossorg-011.jar:1.5.5.jbossorg-011]
>         at org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.handleNotificationMessage(ClusterConnectionImpl.java:1029) [artemis-server-1.5.5.jbossorg-011.jar:1.5.5.jbossorg-011]
>         at org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.onMessage(ClusterConnectionImpl.java:1004) [artemis-server-1.5.5.jbossorg-011.jar:1.5.5.jbossorg-011]
>         at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1001) [artemis-core-client-1.5.5.jbossorg-011.jar:1.5.5.jbossorg-011]
>         at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:49) [artemis-core-client-1.5.5.jbossorg-011.jar:1.5.5.jbossorg-011]
>         at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1124) [artemis-core-client-1.5.5.jbossorg-011.jar:1.5.5.jbossorg-011]
>         at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:122) [artemis-commons-1.5.5.jbossorg-011.jar:1.5.5.jbossorg-011]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_171]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_171]
>         at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_171]
> {code}
