InboundBridge recovery aborts live transactions
-----------------------------------------------
Key: JBTM-3080
URL: https://issues.jboss.org/browse/JBTM-3080
Project: JBoss Transaction Manager
Issue Type: Bug
Environment: EAP 7.1.5 / 5.5.32.Final
Reporter: Jonathan Halliday
Assignee: Tom Jenkinson
Priority: Critical
During a recovery pass, the InboundBridgeRecoveryManager scans for subordinate XA
branches that may need cleanup. The filtering applied by checkXid correctly excludes
transactions that are not owned by the bridge, but fails to exclude those that are owned
by the bridge and still live. A race condition therefore exists: if the recovery pass runs
between the prepare and commit steps, it may incorrectly abort a live branch, resulting
in data corruption relative to other branches of the parent that did commit, i.e.
heuristic outcomes.
TRACE [org.jboss.jbossts.txbridge] (TaskWorker-3) BridgeDurableParticipant.prepare(Xid=< 131080, 35, 64, ... >)
TRACE [org.jboss.jbossts.txbridge] (Periodic Recovery) rolling back orphaned subordinate tx < 131080, 35, 64, ... >
ERROR [org.jboss.jbossts.txbridge] (TaskWorker-8) ARJUNA033004: commit on Xid=< 131080, 35, 64, ... > failed: javax.transaction.xa.XAException
checkXid needs to be enhanced to validate against known live transactions. The easiest
way would seem to be to have the InboundBridgeManager singleton hold a lookup table of
live transactions, effectively a secondary index into its existing collection of
InboundBridges, against which the recovery system can validate the in-doubt Xid.
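
A minimal sketch of what such a lookup table might look like, for discussion only. The names LiveTxRegistry, XidKey, SimpleXid, register, deregister, isLive and isRecoverable are hypothetical and are not taken from the txbridge code. The sketch assumes the bridge can record an Xid when the subordinate transaction is created and remove it once the transaction terminates. Only the javax.transaction.xa.Xid interface is real API.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import javax.transaction.xa.Xid;

/** Hypothetical lookup table of live bridge transactions; not the actual txbridge API. */
final class LiveTxRegistry {

    /** Value-based key: Xid implementations are not required to override equals/hashCode. */
    private static final class XidKey {
        private final int formatId;
        private final byte[] gtrid;
        private final byte[] bqual;

        XidKey(Xid xid) {
            this.formatId = xid.getFormatId();
            this.gtrid = xid.getGlobalTransactionId().clone();
            this.bqual = xid.getBranchQualifier().clone();
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof XidKey)) {
                return false;
            }
            XidKey other = (XidKey) o;
            return formatId == other.formatId
                    && Arrays.equals(gtrid, other.gtrid)
                    && Arrays.equals(bqual, other.bqual);
        }

        @Override
        public int hashCode() {
            return 31 * (31 * formatId + Arrays.hashCode(gtrid)) + Arrays.hashCode(bqual);
        }
    }

    // Concurrent set: bridge worker threads and the periodic recovery thread both touch it.
    private final Set<XidKey> live = ConcurrentHashMap.newKeySet();

    /** Assumed to be called when the bridge creates the subordinate transaction. */
    void register(Xid xid) {
        live.add(new XidKey(xid));
    }

    /** Assumed to be called once the transaction has committed or rolled back. */
    void deregister(Xid xid) {
        live.remove(new XidKey(xid));
    }

    boolean isLive(Xid xid) {
        return live.contains(new XidKey(xid));
    }

    /** Recovery-side filter: only bridge-owned branches that are not live are candidates. */
    boolean isRecoverable(Xid xid, boolean ownedByBridge) {
        return ownedByBridge && !isLive(xid);
    }
}

/** Trivial Xid implementation, used here only to exercise the sketch. */
final class SimpleXid implements Xid {
    private final int formatId;
    private final byte[] gtrid;
    private final byte[] bqual;

    SimpleXid(int formatId, byte[] gtrid, byte[] bqual) {
        this.formatId = formatId;
        this.gtrid = gtrid.clone();
        this.bqual = bqual.clone();
    }

    @Override public int getFormatId() { return formatId; }
    @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
    @Override public byte[] getBranchQualifier() { return bqual.clone(); }
}
```

With something along these lines, checkXid could skip any in-doubt Xid for which isLive returns true. The membership check and the subsequent rollback are still two separate steps, so a real fix would also need to consider a transaction that changes state between them.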