[jboss-jira] [JBoss JIRA] (WFLY-10053) TM is not forward compatible

Ondra Chaloupka (JIRA) issues at jboss.org
Fri Apr 6 09:25:01 EDT 2018


    [ https://issues.jboss.org/browse/WFLY-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557407#comment-13557407 ] 

Ondra Chaloupka commented on WFLY-10053:
----------------------------------------

[~flavia.rainone] if you mean issue JBEAP-12611, then I think it's different. In JBEAP-12611 the expected outcome is a rollback of the global transaction (rolling back all participants). The crash happens before the end of the 2PC prepare phase, so orphan detection should then run and roll back the prepared participants.
In this case the prepare phase of 2PC finished successfully and the expected outcome is a commit of the global transaction (committing all participants).

In JBEAP-12611 the problem was that the orphan was forgotten forever, because nothing about the transaction was saved on the transaction manager's side. The transaction manager persists information about a transaction (in the JTA case; JTS behaves differently) only once the whole prepare phase finishes successfully. In JBEAP-12611 the crash happens during the prepare phase, so the transaction manager persists nothing. But because the participants of the subordinate transaction (on server2) were already prepared, somebody has to roll them back after the restart. The transaction manager on server1 therefore asks all the parties it knows about whether they hold any unfinished transactions. If one does, and the transaction is not in the server1 transaction object store, that party is told to roll it back. In JBEAP-12611 the EJB remoting layer did not report the unfinished transactions on the server2 side, so server1 was not able to roll them back.
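
The orphan check described above can be sketched roughly as follows. This is only an illustration of the decision, not Narayana code: the class {{OrphanDetectionSketch}} and the method {{findOrphans}} are hypothetical names, and Xids are reduced to plain strings for brevity.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; not the Narayana implementation.
final class OrphanDetectionSketch {

    // Returns the in-doubt Xids reported by a remote party (e.g. server2)
    // that are absent from the coordinator's (server1) object store.
    // The coordinator persisted nothing for them, so they are orphans
    // and the remote party should be told to roll them back.
    static Set<String> findOrphans(Set<String> reportedInDoubt, Set<String> coordinatorStore) {
        Set<String> orphans = new LinkedHashSet<>(reportedInDoubt);
        orphans.removeAll(coordinatorStore);
        return orphans;
    }

    public static void main(String[] args) {
        // server2 reports two prepared branches; server1 only knows "tx-2".
        Set<String> reported = new LinkedHashSet<>(Arrays.asList("tx-1", "tx-2"));
        Set<String> store = new HashSet<>(Arrays.asList("tx-2"));
        System.out.println("roll back: " + findOrphans(reported, store));
    }
}
```

If the remote party reports nothing at all, as the EJB remoting layer did in JBEAP-12611, the result is empty and no rollback is ever issued, which is why the orphan stayed prepared forever.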

In this issue, by contrast, the participant was already saved in the transaction object store, and the transaction manager really does try to commit it ({{org.jboss.ejb.client.remoting.RemotingConnectionEJBReceiver.sendCommit(RemotingConnectionEJBReceiver.java:511)}}), whereas in JBEAP-12611 the record is unavailable and no rollback attempt is made. The failure happens when the commit is invoked: the connection was closed (?) for some reason and the commit command could not be completed.
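
The situation in this issue, where the commit is attempted during recovery but fails with {{XA_RETRY}}, can be illustrated with a small sketch. It uses the standard {{javax.transaction.xa.XAException}} from the JDK; the class {{ReplayCommitSketch}}, the {{Participant}} interface and the {{replayCommit}} method are hypothetical names and are not how Narayana structures its recovery modules. The assumption made here is that a participant whose commit fails is simply kept for the next recovery pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.transaction.xa.XAException;

// Hypothetical illustration only; not the Narayana implementation.
final class ReplayCommitSketch {

    // Stand-in for a prepared participant loaded from the object store.
    interface Participant {
        void commit() throws XAException;
    }

    // One recovery pass: try to commit every prepared participant and
    // return those that must stay in the store for a later pass.
    static List<Participant> replayCommit(List<Participant> prepared) {
        List<Participant> remaining = new ArrayList<>();
        for (Participant p : prepared) {
            try {
                p.commit();
            } catch (XAException e) {
                // Assumption: any failure (XA_RETRY is the code seen in
                // the log) keeps the record so the commit can be retried.
                remaining.add(p);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        Participant reachable = () -> { };
        Participant unreachable = () -> { throw new XAException(XAException.XA_RETRY); };
        List<Participant> left = replayCommit(Arrays.asList(reachable, unreachable));
        System.out.println("participants still to commit: " + left.size());
    }
}
```

As long as the commit cannot be delivered, the participant stays in the list on every pass and is never committed.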

> TM is not forward compatible 
> -----------------------------
>
>                 Key: WFLY-10053
>                 URL: https://issues.jboss.org/browse/WFLY-10053
>             Project: WildFly
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 12.0.0.Final
>            Reporter: Ivan Straka
>            Assignee: Tom Jenkinson
>         Attachments: EAP7.0.9.log, WF12.log
>
>
> TM is unable to recover a transaction after one of the nodes crashes during the commit phase of a resource.
> Scenario:
> # EAP 7.0.9 enlists a dummy XA resource, updates a db value (XADatasource) and calls WildFly 12
> # WildFly 12 enlists a dummy XA resource and updates another db value
> # On entry to the commit phase of the dummy XA resource on WildFly 12, Byteman crashes the server
> # WildFly 12 is started
> # Recovery process should recover the transaction
> The following exception is seen in the EAP 7.0.9 log
> {code:java}
> 2018-03-20 12:42:35,171 WARN  [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016036: commit on < formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff7f000001:643452d5:5ab0f37d:f, node_name=eap1, branch_uid=0:ffff7f000001:643452d5:5ab0f37d:1e, subordinatenodename=null, eis_name=unknown eis name > (RecoveryOnlySerializedEJBXAResource{ejbReceiverNodeName='eap2'}) failed with exception $XAException.XA_RETRY: javax.transaction.xa.XAException
> 	at org.jboss.ejb.client.RecoveryOnlySerializedEJBXAResource.commit(RecoveryOnlySerializedEJBXAResource.java:56)
> 	at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelCommit(XAResourceRecord.java:477)
> 	at com.arjuna.ats.arjuna.coordinator.BasicAction.doCommit(BasicAction.java:2869)
> 	at com.arjuna.ats.arjuna.coordinator.BasicAction.doCommit(BasicAction.java:2785)
> 	at com.arjuna.ats.arjuna.coordinator.BasicAction.phase2Commit(BasicAction.java:1853)
> 	at com.arjuna.ats.arjuna.recovery.RecoverAtomicAction.replayPhase2(RecoverAtomicAction.java:71)
> 	at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.doRecoverTransaction(AtomicActionRecoveryModule.java:152)
> 	at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.processTransactionsStatus(AtomicActionRecoveryModule.java:253)
> 	at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.periodicWorkSecondPass(AtomicActionRecoveryModule.java:109)
> 	at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:811)
> 	at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:377)
> {code}
> The result is that resources on EAP 7.0.9 are committed and resources on WF12 are not:
> {code:java}
> [INFO] there is (are) 1 transaction(s) in tx obj store /home/istraka/manu-units/tests-transactions-propagation/manu-unit-eap-transactions-propagation/out/EAP2_7.2.0/workspace/wildfly-12.0.0.Final/standalone/data/tx-object-store
> [INFO] Transaction
> 		Xid:< formatId=131077, gtrid_length=32, bqual_length=40, tx_uid=0:ffff7f000001:-174ad8d9:5ab0fb3b:f, node_name=eap1, branch_uid=0:ffff7f000001:-174ad8d9:5ab0fb3b:1e, subordinatenodename=eap2, eis_name=0 > 
> 		Type:StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction/SubordinateAtomicAction/JCA 
> 		ParentNodeName:eap1 
> 		Participant:false 
> 		CreationTime:Tue, 20 Mar 2018 13:15:18 +0100 
> 		AgeInSeconds:269 
> 		Id:0:ffff7f000001:39e6900e:5ab0fb4a:13 
> 		Participant
> 			HeuristicValue:-1 
> 			Status:PREPARED 
> 			JndiName:java:/TestXAResource 
> 			GlobalTransactionId:[B@5753c786 
> 			HeuristicStatus:UNKNOWN 
> 			NodeName:null 
> 			Timeout:0 
> 			FormatId:-1 
> 			BranchQualifier:[B@1874640 
> 			Type:/StateManager/AbstractRecord/XAResourceRecord 
> 			Participant:true 
> 			EisProductVersion:EAP Test 
> 			ClassName:com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord 
> 			EisProductName:Crash Recovery Test 
> 			Id:0:ffff7f000001:39e6900e:5ab0fb4a:1d 
> 		Participant
> 			HeuristicValue:-1 
> 			Status:PREPARED 
> 			JndiName:java:jboss/eap2-ds-jndi 
> 			GlobalTransactionId:[B@4ae75d2d 
> 			HeuristicStatus:UNKNOWN 
> 			NodeName:null 
> 			Timeout:0 
> 			FormatId:-1 
> 			BranchQualifier:[B@5eaa38ce 
> 			Type:/StateManager/AbstractRecord/XAResourceRecord 
> 			Participant:true 
> 			EisProductVersion:9.3.15 
> 			ClassName:com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord 
> 			EisProductName:PostgreSQL 
> 			Id:0:ffff7f000001:39e6900e:5ab0fb4a:23 
> {code}
> The issue is valid for:
> * 6.4.x -> WF12
> * 7.0.x -> WF12
> The issue is *not* valid for:
> * WF12 -> 6.4.x
> * WF12 -> 7.0.x
> * 7.1.0 -> WF12
> If you have any questions (about the testsuite, scenario, ...), feel free to ask.



--
This message was sent by Atlassian JIRA
(v7.5.0#75005)

