[ http://jira.jboss.com/jira/browse/JBTM-354?page=comments#action_12415760 ]
Jonathan Halliday commented on JBTM-354:
----------------------------------------
TransactionImple.enlistResource was changed as a result of JBTM-362. The new version is:
    xaRes.start(xid, xaStartNormal);
    if(_theTransaction.add(abstractRecord) == AddOutcome.AR_ADDED) {
        _resources.put(xaRes, new TxInfo(xid));
        return true; // dive out, no need to set associatedWork = true;
    } else {
        // we called start on the resource, but _theTransaction did not accept it.
        // we therefore have a mess which we must now clean up by ensuring the start is undone:
        abstractRecord.topLevelAbort();
    }
As a side effect, a slow XAResource.start method is now less of an issue, and hence the
test case in the issue description no longer reliably reproduces the problem. With a slow
start, the _theTransaction.add call, which is what adds the resource to the pending list
walked by the reaper (don't be fooled by the _resources.put; the reaper ignores that
list), will fail because the tx has already been aborted by the reaper by the time
start() returns. In that case we call end and rollback on the resource via
abstractRecord.topLevelAbort() rather than via the reaper thread.
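The cleanup path described above can be illustrated with a small single-threaded model. This is only a sketch: RecordingResource, ToyTransaction and the duringStart hook are hypothetical stand-ins, not the JBossTS classes, and the end/rollback pair stands in for what abstractRecord.topLevelAbort() is described as doing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an XAResource; it only records the calls it sees.
class RecordingResource {
    final List<String> calls = new ArrayList<>();
    void start() { calls.add("start"); }
    void end() { calls.add("end"); }
    void rollback() { calls.add("rollback"); }
}

// Hypothetical, single-threaded stand-in for the transaction.
class ToyTransaction {
    enum AddOutcome { ADDED, REJECTED }

    private final List<RecordingResource> pendingList = new ArrayList<>();
    private boolean aborted = false;

    // Models the reaper cancelling the transaction: only resources already
    // on the pending list receive end + rollback.
    void abort() {
        aborted = true;
        for (RecordingResource r : pendingList) {
            r.end();
            r.rollback();
        }
        pendingList.clear();
    }

    // Models _theTransaction.add: new records are refused once aborted.
    AddOutcome add(RecordingResource r) {
        if (aborted) {
            return AddOutcome.REJECTED;
        }
        pendingList.add(r);
        return AddOutcome.ADDED;
    }

    // Models the enlistResource flow: start first, then try to add.
    // duringStart simulates a timeout firing while start() is still running.
    boolean enlist(RecordingResource r, Runnable duringStart) {
        r.start();
        duringStart.run();
        if (add(r) == AddOutcome.ADDED) {
            return true;
        }
        // start was called but the transaction refused the record,
        // so the enlisting thread undoes the start itself.
        r.end();
        r.rollback();
        return false;
    }
}
```

Calling tx.enlist(resource, tx::abort) models the timeout firing during start: the enlisting code, not the abort, is what delivers end and rollback to the resource.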
However, we still have a race in our code. The BasicAction.add done by enlistResource
accesses pendingList and is not synchronized with respect to the reaper's
AtomicAction.cancel() (i.e. BasicAction.Abort), which walks that list.
The fix appears to be in two parts: reinstate BasicAction critical[Start|End] as noted by
Mark, and also add calls to these into the Abort method, or they won't actually do us
much good.
Note: _theTransaction is an AtomicAction (AtomicAction extends TwoPhaseCoordinator, and
TwoPhaseCoordinator extends BasicAction).
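To make the two-part fix concrete, here is a minimal sketch in which add and abort take the same lock. It rests on assumptions: GuardedAction is a hypothetical class, and its criticalStart/criticalEnd are modelled as a plain lock/unlock pair, which may not match what the BasicAction methods of the same name actually do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of an action whose pending list is shared between
// the enlisting thread (add) and the reaper thread (abort).
class GuardedAction {
    private final ReentrantLock critical = new ReentrantLock();
    private final List<String> pendingList = new ArrayList<>();
    private boolean aborted = false;

    // Assumed semantics: enter and leave a mutually exclusive region.
    void criticalStart() { critical.lock(); }
    void criticalEnd() { critical.unlock(); }

    // Part one: add only touches pendingList inside the critical region.
    boolean add(String record) {
        criticalStart();
        try {
            if (aborted) {
                return false; // caller must undo its own start
            }
            pendingList.add(record);
            return true;
        } finally {
            criticalEnd();
        }
    }

    // Part two: abort takes the same lock, otherwise guarding add alone
    // protects nothing. Returns the records it rolled back.
    List<String> abort() {
        criticalStart();
        try {
            aborted = true;
            List<String> rolledBack = new ArrayList<>(pendingList);
            pendingList.clear();
            return rolledBack;
        } finally {
            criticalEnd();
        }
    }
}
```

With both methods inside the same critical region, each record is either seen by abort or rejected by add, so no started resource is left without a matching cleanup.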
race condition in XAResource start/end handling due to async rollback
---------------------------------------------------------------------
Key: JBTM-354
URL: http://jira.jboss.com/jira/browse/JBTM-354
Project: JBoss Transaction Manager
Issue Type: Bug
Security Level: Public(Everyone can see)
Components: JTA Implementation, JTS Implementation
Affects Versions: 4.3.0.GA, 4.2.3.SP5
Reporter: Jonathan Halliday
Assigned To: Jonathan Halliday
Fix For: 4.2.3.SP8, 4.4.CR1
There exists a race condition such that we may call start on an XAResource but never call
end (or anything else) on it if the transaction times out whilst the resource enlistment
is still in progress. This is a Bad Thing and may make resource managers unhappy.
To reproduce:
    TransactionManager tm = new TransactionManagerImple();
    tm.setTransactionTimeout(5);
    tm.begin();
    Transaction t = tm.getTransaction();
    // XAResourceImpl has a start method with Thread.sleep longer than 5s
    XAResource xaResource = new XAResourceImpl();
    t.enlistResource(xaResource);
    Thread.sleep(10);
    // reaper times out the tx, resource manager for XAResource
    // *may* have received start but never receives end.
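The XAResourceImpl used above is not shown in this issue. One way such a test resource might look is sketched below, using only the javax.transaction.xa interfaces from the JDK. The class name SlowStartXAResource, the configurable delay and the call log are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Hypothetical test resource: start() blocks for a configurable time and
// every lifecycle call is logged so a test can check what was delivered.
class SlowStartXAResource implements XAResource {
    private final long startDelayMillis;
    private final List<String> calls =
            Collections.synchronizedList(new ArrayList<>());

    SlowStartXAResource(long startDelayMillis) {
        this.startDelayMillis = startDelayMillis;
    }

    // Snapshot of the calls seen so far, in order.
    List<String> calls() { return new ArrayList<>(calls); }

    @Override
    public void start(Xid xid, int flags) {
        calls.add("start");
        try {
            Thread.sleep(startDelayMillis); // simulate a slow resource manager
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public void end(Xid xid, int flags) { calls.add("end"); }

    @Override
    public int prepare(Xid xid) { calls.add("prepare"); return XA_OK; }

    @Override
    public void commit(Xid xid, boolean onePhase) { calls.add("commit"); }

    @Override
    public void rollback(Xid xid) { calls.add("rollback"); }

    @Override
    public void forget(Xid xid) { calls.add("forget"); }

    @Override
    public Xid[] recover(int flag) { return new Xid[0]; }

    @Override
    public boolean isSameRM(XAResource other) { return other == this; }

    @Override
    public int getTransactionTimeout() { return 0; }

    @Override
    public boolean setTransactionTimeout(int seconds) { return false; }
}
```

In the scenario above it would be constructed with a delay longer than the 5 second timeout, for example new SlowStartXAResource(6000). A call log that contains "start" with no later "end" would show the problem described.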
Fix notes: TransactionImple.enlistResource needs to lock out the reaper thread. At first
glance an alternative fix is to put the resource into the transaction's internal data
structure before calling start on it, but that would allow an 'end, begin'
sequence of calls to be seen by the resource manager.
Caution: ref support case i-t #171373. There *may* exist additional race conditions in the
app server JCA, so this change alone won't necessarily fix the problem.