[JBoss JIRA] (WFLY-6850) max-saved-replicated-journal-size is ignored
by Bartosz Baranowski (JIRA)
[ https://issues.jboss.org/browse/WFLY-6850?page=com.atlassian.jira.plugin.... ]
Bartosz Baranowski updated WFLY-6850:
-------------------------------------
Description: (was: *Scenario:*
# Configure two EAP servers in replicated HA topology
# On backup set {{max-saved-replicated-journal-size=2}}, {{allow-failback=true}} and {{restart-backup=true}}
# Do sequence: kill live, start live, kill live, start live...
*Expectation:* When the number of saved replicated journals exceeds 2, the backup should be stopped rather than restarted.
*Actual state:* {{max-saved-replicated-journal-size}} is ignored and the backup is not stopped. The number of journals grows ad infinitum.
I think the root cause is in {{SharedNothingLiveActivation}}. In condition \[1\] there should be an OR between {{!isRestartBackup}} and the check whether the number of journals has been exceeded.
\[1\]
{code}
//if we have to many backups kept or are not configured to restart just stop, otherwise restart as a backup
if (!replicatedPolicy.getReplicaPolicy().isRestartBackup() && activeMQServer.countNumberOfCopiedJournals() >= replicatedPolicy.getReplicaPolicy().getMaxSavedReplicatedJournalsSize() && replicatedPolicy.getReplicaPolicy().getMaxSavedReplicatedJournalsSize() >= 0) {
activeMQServer.stop(true);
ActiveMQServerLogger.LOGGER.stopReplicatedBackupAfterFailback();
}
else {
activeMQServer.stop(true);
ActiveMQServerLogger.LOGGER.restartingReplicatedBackupAfterFailback();
activeMQServer.setHAPolicy(replicatedPolicy.getReplicaPolicy());
activeMQServer.start();
}
{code})
> max-saved-replicated-journal-size is ignored
> --------------------------------------------
>
> Key: WFLY-6850
> URL: https://issues.jboss.org/browse/WFLY-6850
> Project: WildFly
> Issue Type: Bug
> Reporter: Bartosz Baranowski
> Assignee: Clebert Suconic
> Priority: Critical
> Labels: downstream_dependency
>
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
[JBoss JIRA] (WFLY-6849) Duplicate messages in replicated HA topology when backup is shutdowned
by Bartosz Baranowski (JIRA)
[ https://issues.jboss.org/browse/WFLY-6849?page=com.atlassian.jira.plugin.... ]
Bartosz Baranowski updated WFLY-6849:
-------------------------------------
Steps to Reproduce: (was: {code}
git clone git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git
cd eap-tests-hornetq/scripts/
git checkout refactoring_modules
groovy -DEAP_ZIP_URL=http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap7-artemis-prepare/lastSuccessfulBuild/artifact/jboss-eap-7.x.patched.zip PrepareServers7.groovy
export WORKSPACE=$PWD
export JBOSS_HOME_1=$WORKSPACE/server1/jboss-eap
export JBOSS_HOME_2=$WORKSPACE/server2/jboss-eap
export JBOSS_HOME_3=$WORKSPACE/server3/jboss-eap
export JBOSS_HOME_4=$WORKSPACE/server4/jboss-eap
cd ../jboss-hornetq-testsuite/
mvn clean test -Dtest=ReplicatedDedicatedFailoverTestCase#testDuplicatesAreDetectedWhenBackupIsCrashedAfterSynchronization -DfailIfNoTests=false -Deap=7x -Deap7.org.jboss.qa.hornetq.apps.clients.version=7.x-SNAPSHOT | tee log
{code})
> Duplicate messages in replicated HA topology when backup is shutdowned
> ----------------------------------------------------------------------
>
> Key: WFLY-6849
> URL: https://issues.jboss.org/browse/WFLY-6849
> Project: WildFly
> Issue Type: Bug
> Reporter: Bartosz Baranowski
> Assignee: Clebert Suconic
> Priority: Critical
> Labels: downstream_dependency
>
> Scenario
> # Configure 2 nodes in replicated dedicated topology
> # Start live (node-1) and backup (node-2)
> # Start producer
> # Shut down node-2
> # Stop producer
> # Check if there are some duplicates on node-1 using CLI operation list-messages
> Expectation: there are no duplicates
> Actual state: there are 10 messages whose _AMQ_DUPL_ID appears twice
> After the Backup is shut down, the Live is not able to replicate its data to the Backup. It waits 30 seconds until the timeouts expire. Meanwhile the producer does not get a response from the Live and gets a TimeoutException on commit. It retries sending the same batch of messages and then commits them again. I think the problem is at this point: the Live does not detect the duplicate messages.
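For context, the duplicate check relies on the {{_AMQ_DUPL_ID}} property mentioned above: the broker keeps a cache of recently seen values and rejects repeats. Below is a minimal JMS producer sketch showing how each message is stamped; the JNDI names and queue are placeholder assumptions, not taken from the test.
{code}
import javax.jms.*;
import javax.naming.InitialContext;

public class DuplicateIdProducer {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        // Placeholder JNDI names; adjust to the actual deployment.
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/RemoteConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/queue/testQueue");
        try (Connection connection = cf.createConnection()) {
            Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
            MessageProducer producer = session.createProducer(queue);
            for (int i = 0; i < 10; i++) {
                TextMessage message = session.createTextMessage("message-" + i);
                // Stamp every message with a unique duplicate-detection id; on a
                // re-sent batch the broker is expected to recognise the values and skip them.
                message.setStringProperty("_AMQ_DUPL_ID", "producer-1-msg-" + i);
                producer.send(message);
            }
            // If this commit times out and the batch is retried, the _AMQ_DUPL_ID
            // values are what the live server should filter on.
            session.commit();
        }
    }
}
{code}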
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
[JBoss JIRA] (WFLY-6850) max-saved-replicated-journal-size is ignored
by Bartosz Baranowski (JIRA)
[ https://issues.jboss.org/browse/WFLY-6850?page=com.atlassian.jira.plugin.... ]
Bartosz Baranowski moved JBEAP-5263 to WFLY-6850:
-------------------------------------------------
Project: WildFly (was: JBoss Enterprise Application Platform)
Key: WFLY-6850 (was: JBEAP-5263)
Workflow: GIT Pull Request workflow (was: CDW with loose statuses v1)
Component/s: (was: ActiveMQ)
Affects Version/s: (was: 7.0.1.CR1)
> max-saved-replicated-journal-size is ignored
> --------------------------------------------
>
> Key: WFLY-6850
> URL: https://issues.jboss.org/browse/WFLY-6850
> Project: WildFly
> Issue Type: Bug
> Reporter: Bartosz Baranowski
> Assignee: Clebert Suconic
> Priority: Critical
> Labels: downstream_dependency
>
> *Scenario:*
> # Configure two EAP servers in replicated HA topology
> # On backup set {{max-saved-replicated-journal-size=2}}, {{allow-failback=true}} and {{restart-backup=true}}
> # Do sequence: kill live, start live, kill live, start live...
> *Expectation:* When the number of saved replicated journals exceeds 2, the backup should be stopped rather than restarted.
> *Actual state:* {{max-saved-replicated-journal-size}} is ignored and the backup is not stopped. The number of journals grows ad infinitum.
> I think the root cause is in {{SharedNothingLiveActivation}}. In condition \[1\] there should be an OR between {{!isRestartBackup}} and the check whether the number of journals has been exceeded.
> \[1\]
> {code}
> //if we have to many backups kept or are not configured to restart just stop, otherwise restart as a backup
> if (!replicatedPolicy.getReplicaPolicy().isRestartBackup() && activeMQServer.countNumberOfCopiedJournals() >= replicatedPolicy.getReplicaPolicy().getMaxSavedReplicatedJournalsSize() && replicatedPolicy.getReplicaPolicy().getMaxSavedReplicatedJournalsSize() >= 0) {
> activeMQServer.stop(true);
> ActiveMQServerLogger.LOGGER.stopReplicatedBackupAfterFailback();
> }
> else {
> activeMQServer.stop(true);
> ActiveMQServerLogger.LOGGER.restartingReplicatedBackupAfterFailback();
> activeMQServer.setHAPolicy(replicatedPolicy.getReplicaPolicy());
> activeMQServer.start();
> }
> {code}
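The grouping the reporter appears to be proposing would look roughly like the sketch below (same accessors as the snippet above, untested; the helper variables are only for readability):
{code}
// Stop the backup either when restart-backup is disabled OR when the number of
// saved journals has reached a configured, non-negative limit.
int maxSaved = replicatedPolicy.getReplicaPolicy().getMaxSavedReplicatedJournalsSize();
boolean journalLimitReached = maxSaved >= 0
      && activeMQServer.countNumberOfCopiedJournals() >= maxSaved;
if (!replicatedPolicy.getReplicaPolicy().isRestartBackup() || journalLimitReached) {
   activeMQServer.stop(true);
   ActiveMQServerLogger.LOGGER.stopReplicatedBackupAfterFailback();
}
else {
   activeMQServer.stop(true);
   ActiveMQServerLogger.LOGGER.restartingReplicatedBackupAfterFailback();
   activeMQServer.setHAPolicy(replicatedPolicy.getReplicaPolicy());
   activeMQServer.start();
}
{code}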
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
[JBoss JIRA] (WFLY-6849) Duplicate messages in replicated HA topology when backup is shutdowned
by Bartosz Baranowski (JIRA)
[ https://issues.jboss.org/browse/WFLY-6849?page=com.atlassian.jira.plugin.... ]
Bartosz Baranowski moved JBEAP-5262 to WFLY-6849:
-------------------------------------------------
Project: WildFly (was: JBoss Enterprise Application Platform)
Key: WFLY-6849 (was: JBEAP-5262)
Workflow: GIT Pull Request workflow (was: CDW with loose statuses v1)
Component/s: (was: ActiveMQ)
Affects Version/s: (was: 7.0.1.CR1)
> Duplicate messages in replicated HA topology when backup is shutdowned
> ----------------------------------------------------------------------
>
> Key: WFLY-6849
> URL: https://issues.jboss.org/browse/WFLY-6849
> Project: WildFly
> Issue Type: Bug
> Reporter: Bartosz Baranowski
> Assignee: Clebert Suconic
> Priority: Critical
> Labels: downstream_dependency
>
> Scenario
> # Configure 2 nodes in replicated dedicated topology
> # Start live (node-1) and backup (node-2)
> # Start producer
> # Shut down node-2
> # Stop producer
> # Check if there are some duplicates on node-1 using CLI operation list-messages
> Expectation: there are no duplicates
> Actual state: there are 10 messages whose _AMQ_DUPL_ID appears twice
> After the Backup is shut down, the Live is not able to replicate its data to the Backup. It waits 30 seconds until the timeouts expire. Meanwhile the producer does not get a response from the Live and gets a TimeoutException on commit. It retries sending the same batch of messages and then commits them again. I think the problem is at this point: the Live does not detect the duplicate messages.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
[JBoss JIRA] (JGRP-2090) JGRP000027: failed passing message up
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-2090?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-2090:
--------------------------------
The change was made on purpose: in 3.6, all transports (UDP, TCP, TCP_NIO2, TUNNEL, etc.) have an ID of 75, which is the ID assigned to TP.
The reason is that members using TCP as the transport can talk to members using TCP_NIO2 as the transport.
Can you reproduce this error? And if so, how?
> JGRP000027: failed passing message up
> -------------------------------------
>
> Key: JGRP-2090
> URL: https://issues.jboss.org/browse/JGRP-2090
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.6.10
> Reporter: Will Wright
> Assignee: Bela Ban
> Fix For: 3.6.11, 4.0
>
>
> The JGroups library is generating NullPointerExceptions when attempting to process incoming messages. From what I can tell, this is due to it looking up an incorrect message header.
> The TP object has a protocol id of 75 despite being a TUNNEL, which ought to have an id of 24. The id is correctly assigned to 24 at the Protocol.id member variable declaration, but is then changed to 75 by the TP.init method. This seems to have been caused by commit [6bc167f7e0181af32e1930935d8cf0efdc1e82f0|https://github.com/belaban/JGrou...], which has the message "Added TP to jg-protocols.xml". If I back out this change to the TP.init method so that the id is not updated, my application receives the incoming messages fine.
> java.lang.NullPointerException: null
> at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1872)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
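The failure mode described above is easiest to see as a keyed lookup: headers are stored per message keyed by a protocol ID, so when one side stamps the header under the TUNNEL ID (24) and the other side asks for the TP ID (75), the lookup returns null and the later dereference throws. A self-contained toy illustration of that mismatch (plain Java, not JGroups API):
{code}
import java.util.HashMap;
import java.util.Map;

public class HeaderIdMismatch {
    static final short TUNNEL_ID = 24; // id assigned at the Protocol.id declaration
    static final short TP_ID = 75;     // id that TP.init() overwrites it with

    public static void main(String[] args) {
        // Stand-in for a message's header table, keyed by protocol id.
        Map<Short, String> headers = new HashMap<>();
        headers.put(TUNNEL_ID, "cluster=demo");   // one side stamps the header under 24

        String header = headers.get(TP_ID);       // the other side looks it up under 75
        System.out.println("header = " + header); // prints null; dereferencing it is the NPE above
    }
}
{code}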
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)