[jboss-jira] [JBoss JIRA] (WFWIP-13) Regression in cluster tests with network failures
Martyn Taylor (JIRA)
issues at jboss.org
Thu May 3 08:07:06 EDT 2018
[ https://issues.jboss.org/browse/WFWIP-13?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martyn Taylor moved JBEAP-14165 to WFWIP-13:
--------------------------------------------
Project: WildFly WIP (was: JBoss Enterprise Application Platform)
Key: WFWIP-13 (was: JBEAP-14165)
Workflow: GIT Pull Request workflow (was: CDW with loose statuses v1)
Component/s: Artemis (was: ActiveMQ)
Target Release: (was: 7.2.0.GA)
Affects Version/s: (was: 7.2.0.GA)
Affects Testing: (was: Regression)
> Regression in cluster tests with network failures
> -------------------------------------------------
>
> Key: WFWIP-13
> URL: https://issues.jboss.org/browse/WFWIP-13
> Project: WildFly WIP
> Issue Type: Bug
> Components: Artemis
> Reporter: Erich Duda
> Assignee: Yong Hao Gao
> Priority: Blocker
> Labels: feature-branch-blocker
>
> *Scenario*
> * Two Artemis brokers are configured to form a cluster
> * A producer sends messages to broker 1 and a receiver consumes messages from broker 2
> * A proxy sits between the brokers to simulate network failure
> * The proxy is stopped and restarted several times to simulate the network failure
> * The test expects every message sent to broker 1 to be received by the receiver from broker 2 (despite the network failures)
> *Reality:* After the proxy is stopped and restarted, the cluster is not able to form again. Each broker tries to reconnect to its peer, but without success.
> *Customer scenario:* The messaging cluster is not able to recover after network failures.
> *Investigation of issue*
> I investigated why the brokers are not able to reconnect and found that every time they try to reconnect, they give up because there is no topology record for the {{nodeId}} they are trying to connect to. So the reconnection attempt ends here \[1\].
> I compared the behavior with Artemis 1.x and found that Artemis 2.x removes the topology member when a connection failure is detected, but Artemis 1.x doesn't. Commenting out the line \[2\] fixed the issue. This line is not present in 1.x.
> \[1\] https://github.com/apache/activemq-artemis/blob/b66d0f7ac40001cce14ca7146e74720504ff9eb1/artemis-core-client/src/main/java/org/apache/activemq/artemis/core/client/impl/ServerLocatorImpl.java#L659
> \[2\] https://github.com/apache/activemq-artemis/blob/b66d0f7ac40001cce14ca7146e74720504ff9eb1/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/cluster/impl/BridgeImpl.java#L782
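
The difference described in the investigation can be illustrated with a small toy model. This is a sketch under stated assumptions, not Artemis code: `ToyTopology`, its methods, the `broker-2` node ID and the connector URL are hypothetical names invented for this example. The boolean flag only mimics the reported behaviour, where the topology member is removed on connection failure (as observed in 2.x) or kept (as observed in 1.x).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Toy model of the behaviour described in the report.
 * NOT Artemis code: all names here are hypothetical and exist only for illustration.
 */
public class TopologyReconnectSketch {

    /** Hypothetical stand-in for one broker's view of the cluster topology. */
    static final class ToyTopology {
        private final Map<String, String> members = new HashMap<>();
        private final boolean removeMemberOnFailure;

        ToyTopology(boolean removeMemberOnFailure) {
            this.removeMemberOnFailure = removeMemberOnFailure;
        }

        /** Records that a node is part of the cluster and how to reach it. */
        void memberUp(String nodeId, String connector) {
            members.put(nodeId, connector);
        }

        /** Called when the link to nodeId fails, e.g. when the proxy is stopped. */
        void connectionFailed(String nodeId) {
            if (removeMemberOnFailure) {
                members.remove(nodeId); // assumed analogue of the line referenced as [2]
            }
        }

        /** A reconnect attempt is only possible if a record for nodeId still exists. */
        Optional<String> connectorForReconnect(String nodeId) {
            return Optional.ofNullable(members.get(nodeId));
        }
    }

    /** Returns true if a reconnect can still be attempted after a simulated failure. */
    static boolean canReconnectAfterFailure(boolean removeMemberOnFailure) {
        ToyTopology topology = new ToyTopology(removeMemberOnFailure);
        topology.memberUp("broker-2", "tcp://proxy:61617"); // hypothetical connector
        topology.connectionFailed("broker-2");
        return topology.connectorForReconnect("broker-2").isPresent();
    }

    public static void main(String[] args) {
        System.out.println("remove on failure (2.x-like): " + canReconnectAfterFailure(true));
        System.out.println("keep on failure (1.x-like): " + canReconnectAfterFailure(false));
    }
}
```

In this model the remove-on-failure variant has no record left to reconnect with once the link drops, while the keep-on-failure variant can still look up the peer. The model says nothing about which fix is appropriate in Artemis itself.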
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)