[JBoss JIRA] (WFLY-12317) Using JTA transaction's node_name attribute is set to an old value after node-identifier is changed
by Ondrej Chaloupka (Jira)
[ https://issues.jboss.org/browse/WFLY-12317?page=com.atlassian.jira.plugin... ]
Ondrej Chaloupka edited comment on WFLY-12317 at 8/1/19 10:19 AM:
------------------------------------------------------------------
Summary of the work on this issue.
* The cause of the issue is described in Mike's comment above (https://issues.jboss.org/browse/WFLY-12317?focusedCommentId=13763985#comm...).
* Even though the test worked before with {{:reload}}, that behaviour was not correct; a restart was needed for correct behaviour (even in WFLY16). A counterexample: start WFLY with node-id {{1}}. Fill the object store with some transaction data (orphan detection needs to be triggered). Wait 2 minutes for recovery to start. Change the node-id to {{2}} with a CLI operation and reload. Even though the model and the JBoss CLI show the node-id as {{2}}, Narayana still effectively uses the node-id {{1}}. A restart of the JVM is needed to fix this.
* The question is what the right fix is; there is a discussion about it on the forum (https://developer.jboss.org/thread/280430). I am personally inclined to allow changing the node-id with just a reload (the fix would then look like this: https://github.com/ochaloup/wildfly/commit/c39536f621dd3be38fdc605cd3b884...). The other option is to enforce a JVM restart (the fix would then look like this: https://github.com/ochaloup/wildfly/commit/42d098a41e1f9e9854d080491849aa...).
So, for the test, this means that it was written wrongly for WFLY16. That is not the fault of the test, as it follows the attribute description in the model. But because the model description presents the wrong flag (reload-required instead of the necessary restart-required), the problem lies in the model, where the flag was set wrongly. (The fix for this issue can nevertheless allow the change with only a reload.)
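The counterexample is consistent with a value being captured once per JVM and never re-read. The following is a minimal Java sketch of that general pattern only. It is not Narayana or WildFly code: the names {{StaleNodeIdSketch}}, {{configuredNodeId}} and {{CachedNodeId}} are hypothetical, and the idea that the node identifier sits in static state is an assumption made for illustration (the actual cause is the one described in the linked comment).

```java
import java.util.concurrent.atomic.AtomicReference;

public class StaleNodeIdSketch {
    /** Stand-in for the management model: this is what a CLI write would update. */
    static final AtomicReference<String> configuredNodeId = new AtomicReference<>("1");

    /** Hypothetical holder that copies the value once, when the class is first initialized. */
    static final class CachedNodeId {
        static final String VALUE = configuredNodeId.get();
    }

    /** The value new transactions would be stamped with in this simplified model. */
    static String effectiveNodeId() {
        return CachedNodeId.VALUE;
    }

    public static void main(String[] args) {
        // First use initializes the cache with the node-id the JVM started with.
        System.out.println(effectiveNodeId()); // prints 1

        // Model of "write the attribute, then reload": same JVM, classes stay loaded.
        configuredNodeId.set("2");

        // The model reports 2, but the cached value is still 1 until the JVM restarts.
        System.out.println("configured=" + configuredNodeId.get()
                + " effective=" + effectiveNodeId()); // prints configured=2 effective=1
    }
}
```

In this model only a new JVM re-runs the static initializer, which matches the observation that a restart picks up the new node-id while a reload does not.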
was (Author: ochaloup):
Summation of effort on this issue.
* The cause of the issue is described in the Mike's comment above (https://issues.jboss.org/browse/WFLY-12317?focusedCommentId=13763985#comm...).
* Even the test worked before with {{:reload}} it was not correct and the correct behaviour needed (even in WFLY16) restart. The counterexample is - start WFLY with node-id {{1}}. Fill the object store with some transaction data (orphan detection needs to be triggered). Wait a 2 minutes for recovery to start. Try to change the node-id to {{2}} with cli operation and reload. Even the model and jboss cli shows the node-id is {{2}} Narayana still effectivelly uses the node-id with value {{1}}. To fix this restart of the JVM is needed.
* Question is what is the right fix. There is a discussion at the forum https://developer.jboss.org/thread/280430 what is the right fix. I personally inclined to permit to change the node-id just with reload (the fix would be then like this https://github.com/ochaloup/wildfly/commit/c39536f621dd3be38fdc605cd3b884...). The other option is to enforce JVM restart (the fix would be then like this https://github.com/ochaloup/wildfly/commit/42d098a41e1f9e9854d080491849aa...).
So, for the test this means that it's wrongly written for WFLY16. It's not the problem of the test as it follows the attribute description in the model. But as the model description presents a wrong flag (reload requires instead of necessary restart requires) it's the issue of the model that the flag was set wrongly. (The fix for this issue can allow only the reload.)
> Using JTA transaction's node_name attribute is set to an old value after node-identifier is changed
> ---------------------------------------------------------------------------------------------------
>
> Key: WFLY-12317
> URL: https://issues.jboss.org/browse/WFLY-12317
> Project: WildFly
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 17.0.0.Final
> Reporter: Ivan Straka
> Assignee: Michael Musgrove
> Priority: Critical
> Attachments: server1_TxDifferentNodeCrashRecoveryTestCase_prepareHalt_jta_server.log, server2_TxDifferentNodeCrashRecoveryTestCase_prepareHalt_jta_server.log
>
>
> We have the following test scenario (3 servers) that fails:
> # the node-identifiers of server1, server2 and server3 are set to 'vkcd', 'FdOu' and 'GocW' (ts.jbosstsX.node.identifier property)
> # server2 is started, its node-identifier is set to txdifferentnodeid, and server2 is stopped
> # server1 is started, its node-identifier is set to txdifferentnodeid, and server1 is reloaded
> # server3 is running
> # a client calls an EJB (where a transaction is started) on server1
> # the EJB sends a JMS message to server3 (the broker)
> # the EJB enlists a dummy XA resource
> # during 2PC, server1 is halted when prepare is invoked on the dummy XA resource
> # we move the server1 object store directory to server2
> # server2 is started
> # server2 is expected to roll back the whole transaction
> The transaction remains unfinished because server2 has not performed the rollback.
> {code:java}
> prepareHalt(org.jboss.as.test.jbossts.crashrec.differentnode.test.TxDifferentNodeCrashRecoveryTestCase) Time elapsed: 810.354 sec <<< FAILURE!
> java.lang.AssertionError: Some unfinished xids on messaging server - expected 0 but was 1
> at org.junit.Assert.fail(Assert.java:88)
> at org.jboss.as.test.jbossts.crashrec.differentnode.test.TxDifferentNodeCrashRecoveryTestCase.checkAfterTestExecution(TxDifferentNodeCrashRecoveryTestCase.java:792)
> at org.jboss.as.test.jbossts.crashrec.differentnode.test.TxDifferentNodeCrashRecoveryTestCase.prepareHalt(TxDifferentNodeCrashRecoveryTestCase.java:565)
> {code}
> At the beginning, the servers' node-identifiers are set to some values (let's say A, B, C). Before test execution, the node-identifier of server1 and server2 is set to the same value, let's say X.
> In the logs I see that on server1 the transaction's node_name is set to the old value (vkcd instead of txdifferentnodeid in the example below). Thus server2 has not performed the rollback.
> See node_name:
> Server1:
> {code:java}
> 2019-07-22 17:40:54,616 DEBUG [com.arjuna.ats.jta] (MSC service thread 1-5) Setting up node identifiers '[txdifferentnodeid]' for which recovery will be performed
> {code}
> {code:java}
> 2019-07-22 17:41:11,931 TRACE [com.arjuna.ats.jta] (default task-2) XAResourceRecord.XAResourceRecord ( < formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a2804ed:26165251:5d35d902:3c, node_name=vkcd, branch_uid=0:ffff0a2804ed:26165251:5d35d902:46, subordinatenodename=null, eis_name=java:/JmsXA NodeId:05b492ae-ac97-11e9-a446-2016b912eaa8 >, XAResourceWrapperImpl@4158c7ec[xaResource=org.jboss.activemq.artemis.wildfly.integration.WildFlyActiveMQXAResourceWrapper@4a21a45f pad=false overrideRmValue=null productName=ActiveMQ Artemis productVersion=2.0 jndiName=java:/JmsXA NodeId:05b492ae-ac97-11e9-a446-2016b912eaa8] ), record id=0:ffff0a2804ed:26165251:5d35d902:47
> {code}
> Server2:
>
> {code:java}
> 2019-07-22 17:41:15,397 DEBUG [com.arjuna.ats.jta] (MSC service thread 1-3) Setting up node identifiers '[txdifferentnodeid]' for which recovery will be performed
> {code}
> {code:java}
> 2019-07-22 17:43:56,062 DEBUG [com.arjuna.ats.jta] (Periodic Recovery) node name of < formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a2804ed:26165251:5d35d902:3c, node_name=vkcd, branch_uid=0:ffff0a2804ed:26165251:5d35d902:46, subordinatenodename=null, eis_name=forgot eis name for: 1 > is vkcd
> 2019-07-22 17:43:56,062 DEBUG [com.arjuna.ats.jta] (Periodic Recovery) XAResourceOrphanFilter com.arjuna.ats.internal.jta.recovery.arjunacore.JTANodeNameXAResourceOrphanFilter voted ABSTAIN
> {code}
> *When does the scenario pass*
> When I run the test suite (TS) with
> {code:java}
> -Dts.jbossts1.node.identifier=txdifferentnodeid -Dts.jbossts2.node.identifier=txdifferentnodeid
> {code}
> the test passes (the old and new node-identifiers are the same on both servers).
> The test also passes when step 3 differs slightly and a restart is performed instead of the reload operation. In the failing scenario server1 is reloaded; if it is restarted instead, the node name is set correctly to txdifferentnodeid.
> *tl;dr*
> The problem is that server1 sets the transaction node name to the old value after the node identifier is changed and the server is reloaded. If the server is restarted, everything is OK.
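The ABSTAIN vote in the server2 log can be read with a small model. This is a minimal Java sketch, assuming the filter simply compares the node_name carried in the Xid with the node identifiers configured for recovery. The class {{OrphanFilterSketch}}, the {{Vote}} enum and the ROLLBACK result for a matching name are illustrative assumptions, not Narayana's actual implementation; only the ABSTAIN outcome and the two node names come from the logs above.

```java
import java.util.Arrays;
import java.util.List;

public class OrphanFilterSketch {
    /** Possible outcomes in this simplified model. */
    enum Vote { ABSTAIN, ROLLBACK }

    /**
     * Votes to roll back only when the branch was created under a node name
     * that this server is configured to recover; otherwise it abstains.
     */
    static Vote vote(String xidNodeName, List<String> recoveryNodes) {
        return recoveryNodes.contains(xidNodeName) ? Vote.ROLLBACK : Vote.ABSTAIN;
    }

    public static void main(String[] args) {
        // server2 recovers for '[txdifferentnodeid]', as in its DEBUG log line.
        List<String> recoveryNodes = Arrays.asList("txdifferentnodeid");

        // The Xid written by the reloaded server1 still carries the old name.
        System.out.println(vote("vkcd", recoveryNodes));              // prints ABSTAIN

        // Had server1 stamped the new name, the branch would be a rollback candidate.
        System.out.println(vote("txdifferentnodeid", recoveryNodes)); // prints ROLLBACK
    }
}
```

Under this model the branch on the messaging server is never claimed by server2, which is in line with the "expected 0 but was 1" assertion failure.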
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
[JBoss JIRA] (WFLY-12317) Using JTA transaction's node_name attribute is set to an old value after node-identifier is changed
by Ondrej Chaloupka (Jira)
[ https://issues.jboss.org/browse/WFLY-12317?page=com.atlassian.jira.plugin... ]
Ondrej Chaloupka edited comment on WFLY-12317 at 8/1/19 10:18 AM:
------------------------------------------------------------------
Summary of the work on this issue.
* The cause of the issue is described in Mike's comment above (https://issues.jboss.org/browse/WFLY-12317?focusedCommentId=13763985#comm...).
* Even though the test worked before with {{:reload}}, that behaviour was not correct; a restart was needed for correct behaviour (even in WFLY16). A counterexample: start WFLY with node-id {{1}}. Fill the object store with some transaction data (orphan detection needs to be triggered). Wait 2 minutes for recovery to start. Change the node-id to {{2}} with a CLI operation and reload. Even though the model and the JBoss CLI show the node-id as {{2}}, Narayana still effectively uses the node-id {{1}}. A restart of the JVM is needed to fix this.
* The question is what the right fix is; there is a discussion about it on the forum (https://developer.jboss.org/thread/280430). I am personally inclined to allow changing the node-id with just a reload (the fix would then look like this: https://github.com/ochaloup/wildfly/commit/c39536f621dd3be38fdc605cd3b884...). The other option is to enforce a JVM restart (the fix would then look like this: https://github.com/ochaloup/wildfly/commit/42d098a41e1f9e9854d080491849aa...).
So, for the test, this means that it was written wrongly for WFLY16. That is not the fault of the test, as it follows the attribute description in the model. But because the model description presents the wrong flag (reload-required instead of the necessary restart-required), the problem lies in the model, where the flag was set wrongly. (The fix for this issue can nevertheless allow the change with only a reload.)
was (Author: ochaloup):
Summation of effort on this issue.
* The cause of the issue is described in the Mike's comment above (https://issues.jboss.org/browse/WFLY-12317?focusedCommentId=13763985#comm...).
* Even the test worked before with {{:reload}} it was not correct and the correct behaviour needed (even in WFLY16) restart. The counterexample is - start WFLY with node-id {{1}}. Fill the object store with some transaction data (orphan detection needs to be triggered). Wait a 2 minutes for recovery to start. Try to change the node-id to {{2}} with cli operation and reload. Even the model and jboss cli shows the node-id is {{2}} Narayana still effectivelly uses the node-id with value {{1}}. To fix this restart of the JVM is needed.
* Question is what is the right fix. There is a discussion at the forum https://developer.jboss.org/thread/280430 what is the right fix. I personally inclined to permit to change the node-id just with reload (the fix would be then like this https://github.com/ochaloup/wildfly/commit/c39536f621dd3be38fdc605cd3b884...). The other option is to enforce JVM reload (the fix would be then like this https://github.com/ochaloup/wildfly/commit/42d098a41e1f9e9854d080491849aa...).
So, for the test this means that it's wrongly written for WFLY16. It's not the problem of the test as it follows the attribute description in the model. But as the model description presents a wrong flag (reload requires instead of necessary restart requires) it's the issue of the model that the flag was set wrongly. (The fix for this issue can allow only the reload.)
[JBoss JIRA] (WFLY-12317) Using JTA transaction's node_name attribute is set to an old value after node-identifier is changed
by Ondrej Chaloupka (Jira)
[ https://issues.jboss.org/browse/WFLY-12317?page=com.atlassian.jira.plugin... ]
Ondrej Chaloupka edited comment on WFLY-12317 at 8/1/19 10:18 AM:
------------------------------------------------------------------
Summary of the work on this issue.
* The cause of the issue is described in Mike's comment above (https://issues.jboss.org/browse/WFLY-12317?focusedCommentId=13763985#comm...).
* Even though the test worked before with {{:reload}}, that behaviour was not correct; a restart was needed for correct behaviour (even in WFLY16). A counterexample: start WFLY with node-id {{1}}. Fill the object store with some transaction data (orphan detection needs to be triggered). Wait 2 minutes for recovery to start. Change the node-id to {{2}} with a CLI operation and reload. Even though the model and the JBoss CLI show the node-id as {{2}}, Narayana still effectively uses the node-id {{1}}. A restart of the JVM is needed to fix this.
* The question is what the right fix is; there is a discussion about it on the forum (https://developer.jboss.org/thread/280430). I am personally inclined to allow changing the node-id with just a reload (the fix would then look like this: https://github.com/ochaloup/wildfly/commit/c39536f621dd3be38fdc605cd3b884...). The other option is to enforce a JVM restart (the fix would then look like this: https://github.com/ochaloup/wildfly/commit/42d098a41e1f9e9854d080491849aa...).
So, for the test, this means that it was written wrongly for WFLY16. That is not the fault of the test, as it follows the attribute description in the model. But because the model description presents the wrong flag (reload-required instead of the necessary restart-required), the problem lies in the model, where the flag was set wrongly. (The fix for this issue can nevertheless allow the change with only a reload.)
was (Author: ochaloup):
Summation of effort on this issue.
* The cause of the issue is described in the Mike's comment above (https://issues.jboss.org/browse/WFLY-12317?focusedCommentId=13763985#comm...).
* Even the test worked before with {{:reload}} it was not correct and the correct behaviour needed (even in WFLY16) restart. The counterexample is - start WFLY with node-id {{1}}. Fill the object store with some transaction data (orphan detection needs to be triggered). Wait a 2 minutes for recovery to start. Try to change the node-id to {{2}} with cli operation and reload. Even the model and jboss cli shows the node-id is {{2}} Narayana still effectivelly uses the node-ide with value {{1}}. To fix this restart of the JVM is needed.
* Question is what is the right fix. There is a discussion at the forum https://developer.jboss.org/thread/280430 what is the right fix. I personally inclined to permit to change the node-id just with reload (the fix would be then like this https://github.com/ochaloup/wildfly/commit/c39536f621dd3be38fdc605cd3b884...). The other option is to enforce JVM reload (the fix would be then like this https://github.com/ochaloup/wildfly/commit/42d098a41e1f9e9854d080491849aa...).
So, for the test this means that it's wrongly written for WFLY16. It's not the problem of the test as it follows the attribute description in the model. But as the model description presents a wrong flag (reload requires instead of necessary restart requires) it's the issue of the model that the flag was set wrongly. (The fix for this issue can allow only the reload.)
[JBoss JIRA] (WFLY-12317) Using JTA transaction's node_name attribute is set to an old value after node-identifier is changed
by Ondrej Chaloupka (Jira)
[ https://issues.jboss.org/browse/WFLY-12317?page=com.atlassian.jira.plugin... ]
Ondrej Chaloupka edited comment on WFLY-12317 at 8/1/19 10:17 AM:
------------------------------------------------------------------
Summary of the work on this issue.
* The cause of the issue is described in Mike's comment above (https://issues.jboss.org/browse/WFLY-12317?focusedCommentId=13763985#comm...).
* Even though the test worked before with {{:reload}}, that behaviour was not correct; a restart was needed for correct behaviour (even in WFLY16). A counterexample: start WFLY with node-id {{1}}. Fill the object store with some transaction data (orphan detection needs to be triggered). Wait 2 minutes for recovery to start. Change the node-id to {{2}} with a CLI operation and reload. Even though the model and the JBoss CLI show the node-id as {{2}}, Narayana still effectively uses the node-id {{1}}. A restart of the JVM is needed to fix this.
* The question is what the right fix is; there is a discussion about it on the forum (https://developer.jboss.org/thread/280430). I am personally inclined to allow changing the node-id with just a reload (the fix would then look like this: https://github.com/ochaloup/wildfly/commit/c39536f621dd3be38fdc605cd3b884...). The other option is to enforce a JVM restart (the fix would then look like this: https://github.com/ochaloup/wildfly/commit/42d098a41e1f9e9854d080491849aa...).
So, for the test, this means that it was written wrongly for WFLY16. That is not the fault of the test, as it follows the attribute description in the model. But because the model description presents the wrong flag (reload-required instead of the necessary restart-required), the problem lies in the model, where the flag was set wrongly. (The fix for this issue can nevertheless allow the change with only a reload.)
was (Author: ochaloup):
Summation of effort on this issue.
* The cause of the issue is described in the Mike's comment above (https://issues.jboss.org/browse/WFLY-12317?focusedCommentId=13763985#comm...).
* Even the test worked before with {{:reload}} it was not correct and the correct behaviour needed (even in WFLY16) restart. The counterexample is - start WFLY with node-id {{1}}. Fill the object store with some transaction data (orphan detection needs to be triggered). Wait a 2 minutes for recovery to start. Try to change the node-id to {{2}} with cli operation and reload. Even the model and jboss cli shows the node-id is {{2}} Narayana still effectivelly uses the node-ide with value {{1}}. To fix this restart of the JVM is needed.
* Question is what is the right fix. There is a discussion at the forum https://developer.jboss.org/thread/280430 what is the right fix. I personally inclined to permit to change the node-id just with reload (the fix would be then like this https://github.com/ochaloup/wildfly/commit/c39536f621dd3be38fdc605cd3b884...). The other option is to enforce JVM reload (the fix would be then like this https://github.com/ochaloup/wildfly/commit/42d098a41e1f9e9854d080491849aa...).
So, for the test this means that it's wrongly written for WFLY16. It's not the problem of the test as it follows the attribute description in the model. But as the model description presents a wrong flag (reload requires instead of necessary restart requires) it's the issue of the model that the flag was set wrongly. Nevertheless in the fix for this issue can allow only the reload.