Ondra Chaloupka commented on WFLY-10053:
----------------------------------------
[~flavia.rainone] if you mean JBEAP-12611, then I think that issue is different. In
JBEAP-12611 the expected outcome is a rollback of the global transaction (rolling back
all participants). The crash happens before the end of the 2PC prepare phase, so orphan
detection should then run and roll back the prepared participants.
In this case the prepare phase of 2PC finished successfully and the expected outcome
is a commit of the global transaction (committing all participants).
In JBEAP-12611 the problem was that the orphan was forgotten forever, because nothing
about the transaction had been saved on the transaction manager side.
The transaction manager persists information about a transaction (in the JTA case; JTS
behaves differently) only when the whole prepare phase finishes successfully. In
JBEAP-12611 the crash happens during the prepare phase, so the transaction manager
persists nothing. But because the participants of the subordinate transaction (at
server2) were already prepared, somebody has to roll them back after the restart. The
transaction manager on server1 therefore asks all known parties whether they know about
any unfinished transactions. If there are any that are not part of the server1
transaction object store, they are ordered to roll back. In JBEAP-12611 the EJB remoting
layer did not report the unfinished transactions on the server2 side, so server1 was not
able to roll them back.
In this issue, by contrast, the participant was already saved in the transaction object
store, and the transaction manager really does try to commit it
({{org.jboss.ejb.client.remoting.RemotingConnectionEJBReceiver.sendCommit(RemotingConnectionEJBReceiver.java:511)}},
unlike JBEAP-12611, where the record is unavailable and no rollback attempt is
made). The failure happens when the commit is invoked: the connection was closed
(?) for some reason and the commit command could not be completed.
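The difference between the two issues can be sketched in a few lines. This is only an illustration of the decision described above, not Narayana or EJB client code: the class name, the method {{decide}} and the string transaction ids are all hypothetical. The coordinator commits an in-doubt subordinate transaction if it finds a record of it in its own object store, and treats it as an orphan to be rolled back otherwise.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: how a coordinator (server1) could decide the outcome of
// in-doubt transactions reported by a subordinate (server2) during recovery.
public class RecoveryDecisionSketch {

    public enum Outcome { COMMIT, ROLLBACK }

    // coordinatorStore: ids persisted by the coordinator after a successful prepare.
    // inDoubtAtSubordinate: ids the subordinate reports as prepared but unfinished.
    public static Map<String, Outcome> decide(Set<String> coordinatorStore,
                                              List<String> inDoubtAtSubordinate) {
        Map<String, Outcome> decisions = new LinkedHashMap<>();
        for (String txId : inDoubtAtSubordinate) {
            // Record present: prepare finished, so the outcome is commit (this issue).
            // Record absent: the transaction is an orphan, so roll back (JBEAP-12611).
            decisions.put(txId,
                    coordinatorStore.contains(txId) ? Outcome.COMMIT : Outcome.ROLLBACK);
        }
        return decisions;
    }

    public static void main(String[] args) {
        Set<String> store = Set.of("tx-A");              // prepare completed for tx-A only
        List<String> inDoubt = List.of("tx-A", "tx-B");  // both are prepared on server2
        System.out.println(decide(store, inDoubt));
    }
}
```

In JBEAP-12611 the second argument was effectively empty, because the unfinished transactions were never reported; here the decision is reached but the commit call itself fails.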
TM is not forward compatible
-----------------------------
Key: WFLY-10053
URL:
https://issues.jboss.org/browse/WFLY-10053
Project: WildFly
Issue Type: Bug
Components: Transactions
Affects Versions: 12.0.0.Final
Reporter: Ivan Straka
Assignee: Tom Jenkinson
Attachments: EAP7.0.9.log, WF12.log
TM is unable to recover the transaction after one of the nodes crashes during the commit
phase of a resource.
Scenario:
# EAP 7.0.9 enlists a dummy XA resource, updates a db value (XADatasource) and calls WildFly 12
# WildFly 12 enlists a dummy XA resource and updates another db value
# On entry to the commit phase of the dummy XA resource on WildFly 12, Byteman crashes the server
# WildFly 12 is started again
# The recovery process should recover the transaction
The following exception is seen in the EAP 7.0.9 log:
{code:java}
2018-03-20 12:42:35,171 WARN [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016036: commit on < formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff7f000001:643452d5:5ab0f37d:f, node_name=eap1, branch_uid=0:ffff7f000001:643452d5:5ab0f37d:1e, subordinatenodename=null, eis_name=unknown eis name > (RecoveryOnlySerializedEJBXAResource{ejbReceiverNodeName='eap2'}) failed with exception $XAException.XA_RETRY: javax.transaction.xa.XAException
    at org.jboss.ejb.client.RecoveryOnlySerializedEJBXAResource.commit(RecoveryOnlySerializedEJBXAResource.java:56)
    at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelCommit(XAResourceRecord.java:477)
    at com.arjuna.ats.arjuna.coordinator.BasicAction.doCommit(BasicAction.java:2869)
    at com.arjuna.ats.arjuna.coordinator.BasicAction.doCommit(BasicAction.java:2785)
    at com.arjuna.ats.arjuna.coordinator.BasicAction.phase2Commit(BasicAction.java:1853)
    at com.arjuna.ats.arjuna.recovery.RecoverAtomicAction.replayPhase2(RecoverAtomicAction.java:71)
    at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.doRecoverTransaction(AtomicActionRecoveryModule.java:152)
    at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.processTransactionsStatus(AtomicActionRecoveryModule.java:253)
    at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.periodicWorkSecondPass(AtomicActionRecoveryModule.java:109)
    at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:811)
    at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:377)
{code}
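For readers less familiar with XA: under the XA contract, {{XA_RETRY}} returned from a commit means the resource could not complete the commit now and the caller may try again later. The sketch below is a minimal illustration of that retry loop, assuming a recovery pass that keeps the record whenever it sees {{XA_RETRY}}. It is not the Narayana implementation; {{CommitCall}}, {{PassResult}} and {{recoveryPass}} are hypothetical names, and only {{javax.transaction.xa.XAException}} is a real API.

```java
import javax.transaction.xa.XAException;

// Hypothetical sketch of how a periodic recovery pass could treat XA_RETRY.
public class XaRetrySketch {

    // Stand-in for a call such as XAResource.commit(xid, false).
    public interface CommitCall {
        void commit() throws XAException;
    }

    public enum PassResult { COMMITTED, RETRY_NEXT_PASS, FAILED }

    public static PassResult recoveryPass(CommitCall call) {
        try {
            call.commit();
            return PassResult.COMMITTED;
        } catch (XAException e) {
            // XA_RETRY: keep the record in the object store and try again later.
            if (e.errorCode == XAException.XA_RETRY) {
                return PassResult.RETRY_NEXT_PASS;
            }
            // Any other error code is not handled in this sketch.
            return PassResult.FAILED;
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Simulated participant that is unreachable for the first two passes.
        CommitCall flaky = () -> {
            if (attempts[0]++ < 2) {
                throw new XAException(XAException.XA_RETRY);
            }
        };
        PassResult result;
        do {
            result = recoveryPass(flaky);
            System.out.println("pass " + attempts[0] + ": " + result);
        } while (result == PassResult.RETRY_NEXT_PASS);
    }
}
```

In the reported scenario the retry never succeeds, which is why the transaction stays in the object store.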
The result is that the resources on EAP 7.0.9 are committed and the resources on WF12 are not:
{code:java}
[INFO] there is (are) 1 transaction(s) in tx obj store
/home/istraka/manu-units/tests-transactions-propagation/manu-unit-eap-transactions-propagation/out/EAP2_7.2.0/workspace/wildfly-12.0.0.Final/standalone/data/tx-object-store
[INFO] Transaction
Xid:< formatId=131077, gtrid_length=32, bqual_length=40,
tx_uid=0:ffff7f000001:-174ad8d9:5ab0fb3b:f, node_name=eap1,
branch_uid=0:ffff7f000001:-174ad8d9:5ab0fb3b:1e, subordinatenodename=eap2, eis_name=0 >
Type:StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction/SubordinateAtomicAction/JCA
ParentNodeName:eap1
Participant:false
CreationTime:Tue, 20 Mar 2018 13:15:18 +0100
AgeInSeconds:269
Id:0:ffff7f000001:39e6900e:5ab0fb4a:13
Participant
HeuristicValue:-1
Status:PREPARED
JndiName:java:/TestXAResource
GlobalTransactionId:[B@5753c786
HeuristicStatus:UNKNOWN
NodeName:null
Timeout:0
FormatId:-1
BranchQualifier:[B@1874640
Type:/StateManager/AbstractRecord/XAResourceRecord
Participant:true
EisProductVersion:EAP Test
ClassName:com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord
EisProductName:Crash Recovery Test
Id:0:ffff7f000001:39e6900e:5ab0fb4a:1d
Participant
HeuristicValue:-1
Status:PREPARED
JndiName:java:jboss/eap2-ds-jndi
GlobalTransactionId:[B@4ae75d2d
HeuristicStatus:UNKNOWN
NodeName:null
Timeout:0
FormatId:-1
BranchQualifier:[B@5eaa38ce
Type:/StateManager/AbstractRecord/XAResourceRecord
Participant:true
EisProductVersion:9.3.15
ClassName:com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord
EisProductName:PostgreSQL
Id:0:ffff7f000001:39e6900e:5ab0fb4a:23
{code}
The issue is valid for:
* 6.4.x -> WF12
* 7.0.x -> WF12
The issue is *not* valid for:
* WF12 -> 6.4.x
* WF12 -> 7.0.x
* 7.1.0 -> WF12
If you have any questions (about the testsuite, the scenario, ...), feel free to ask.