[JBoss JIRA] (JBTM-2148) Consider handling RuntimeExceptions arising from badly behaved XAResource implementations
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-2148?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson commented on JBTM-2148:
-------------------------------------
If we reach consensus, we could try to raise a JTA_SPEC issue for this.
> Consider handling RuntimeExceptions arising from badly behaved XAResource implementations
> -----------------------------------------------------------------------------------------
>
> Key: JBTM-2148
> URL: https://issues.jboss.org/browse/JBTM-2148
> Project: JBoss Transaction Manager
> Issue Type: Enhancement
> Security Level: Public(Everyone can see)
> Components: JTA
> Affects Versions: 4.17.19, 5.0.1
> Reporter: Tom Jenkinson
> Assignee: Tom Jenkinson
>
> It has been observed that some XAResource implementations throw RuntimeExceptions from their XAResource::end method.
> Although this is not spec compliant, it does mean that the transaction will never complete and as such afterCompletion would never be called, nor would we attempt to complete the remaining branches in the transaction.
> In ArjunaCore terms the transaction would be left in the state of ActionStatus.ABORTING.
> This Jira exists as a possible route for users to vote for addressing this issue. One possibility would be to try to align a RuntimeException with a response of XAException.XA_RETRY where possible.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 7 months
[JBoss JIRA] (JBTM-2148) Consider handling RuntimeExceptions arising from badly behaved XAResource implementations
by Tom Jenkinson (JIRA)
Tom Jenkinson created JBTM-2148:
-----------------------------------
Summary: Consider handling RuntimeExceptions arising from badly behaved XAResource implementations
Key: JBTM-2148
URL: https://issues.jboss.org/browse/JBTM-2148
Project: JBoss Transaction Manager
Issue Type: Enhancement
Security Level: Public (Everyone can see)
Components: JTA
Affects Versions: 5.0.1, 4.17.19
Reporter: Tom Jenkinson
Assignee: Tom Jenkinson
It has been observed that some XAResource implementations throw RuntimeExceptions from their XAResource::end method.
Although this is not spec compliant, it does mean that the transaction will never complete and as such afterCompletion would never be called, nor would we attempt to complete the remaining branches in the transaction.
In ArjunaCore terms the transaction would be left in the state of ActionStatus.ABORTING.
This Jira exists as a possible route for users to vote for addressing this issue. One possibility would be to try to align a RuntimeException with a response of XAException.XA_RETRY where possible.
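As a rough illustration of the "align a RuntimeException with XA_RETRY" idea, the sketch below wraps a resource so that an unchecked exception escaping end() surfaces as an XAException instead. In the JTA API the XA_RETRY code is declared on XAException. The class name RetryOnRuntimeXAResource is hypothetical and nothing here is taken from Narayana itself; whether XA_RETRY is an appropriate code to report from end() is exactly the open question in this issue, so treat this as an assumption rather than a recommendation.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/**
 * Hypothetical wrapper (not part of Narayana): converts a RuntimeException
 * thrown by the delegate's end() into an XAException carrying XA_RETRY.
 */
final class RetryOnRuntimeXAResource implements XAResource {
    private final XAResource delegate;

    RetryOnRuntimeXAResource(XAResource delegate) {
        this.delegate = delegate;
    }

    @Override
    public void end(Xid xid, int flags) throws XAException {
        try {
            delegate.end(xid, flags);
        } catch (RuntimeException e) {
            // Assumption: treat the unchecked failure as "try again later".
            XAException retry = new XAException(XAException.XA_RETRY);
            retry.initCause(e);
            throw retry;
        }
    }

    // Every other method is passed straight through to the delegate.
    @Override public void start(Xid xid, int flags) throws XAException { delegate.start(xid, flags); }
    @Override public int prepare(Xid xid) throws XAException { return delegate.prepare(xid); }
    @Override public void commit(Xid xid, boolean onePhase) throws XAException { delegate.commit(xid, onePhase); }
    @Override public void rollback(Xid xid) throws XAException { delegate.rollback(xid); }
    @Override public void forget(Xid xid) throws XAException { delegate.forget(xid); }
    @Override public Xid[] recover(int flag) throws XAException { return delegate.recover(flag); }
    @Override public boolean isSameRM(XAResource other) throws XAException { return delegate.isSameRM(other); }
    @Override public int getTransactionTimeout() throws XAException { return delegate.getTransactionTimeout(); }
    @Override public boolean setTransactionTimeout(int seconds) throws XAException { return delegate.setTransactionTimeout(seconds); }
}
```

With a wrapper like this the caller only ever sees the checked XAException from end(), with the original RuntimeException preserved as its cause.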
[JBoss JIRA] (JBTM-2147) afterCompletion may be called before transaction _completes_ if reaper fires for a transaction and XAResource::end throws RuntimeException
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-2147?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson updated JBTM-2147:
--------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 5.0.2
Resolution: Done
Merged - thanks for the report.
> afterCompletion may be called before transaction _completes_ if reaper fires for a transaction and XAResource::end throws RuntimeException
> ------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: JBTM-2147
> URL: https://issues.jboss.org/browse/JBTM-2147
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: JTS
> Affects Versions: 4.17.7
> Environment: jboss EAP 6.1.1
> Reporter: Koen Janssens
> Assignee: Tom Jenkinson
> Fix For: 5.0.2
>
>
> EDITED BY TOM (to remove the report about the transaction not being disassociated):
> If the reaper calls cancel on a transaction that is wedged, the transaction stays in 'aborting' state but it notifies Synchronizations that the tx has ended contrary to the spec.
> This can be reproduced as follows:
> * A TX is started and both a DB (last)resource and hornetq (XA) resource get enlisted. The tx takes a long time and times out.
> * Arjuna reaper thread notices the time out and starts a worker thread to cancel the TX.
> * Before the worker thread can 'abort' the hornetq XA resource, the arjuna worker thread is interrupted (txReaperCancelWaitPeriod expires) by arjuna
> * Since the worker thread is interrupted, the hornetq XA resource throws an 'HornetQInterruptedException'
> * This unexpected exception causes arjuna to notify registered javax/transaction/Synchronization's (and return DB connection to the pool), without ending the Tx.
> ORIGINAL DESCRIPTION:
> We have noticed that arjuna leaks transactions when something 'unexpected' happens during tx aborting. The transaction stays in 'aborting' state while it does notify TxListeners that the tx has ended.
> This can be reproduced as follows:
> * A TX is started and both a DB (last)resource and hornetq (XA) resource get enlisted. The tx takes a long time and times out.
> * Arjuna reaper thread notices the time out and starts a worker thread to cancel the TX.
> * Before the worker thread can 'abort' the hornetq XA resource, the arjuna worker thread is interrupted (txReaperCancelWaitPeriod expires) by arjuna
> * Since the worker thread is interrupted, the hornetq XA resource throws an 'HornetQInterruptedException'
> * This unexpected exception causes arjuna to notify registered javax/transaction/Synchronization's (and return DB connection to the pool), without ending the Tx.
> From then on, jboss logs are full of:
> Trying to start a new transaction when old is not complete: Old: < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3625a31, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:3625bbc, subordinatenodename=null, eis_name=java:/XAOracleDS >, New < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3641efc, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:36420a1, subordinatenodename=null, eis_name=java:/XAOracleDS >, Flags 0
> Although the root cause is a misbehaving hornetq resource, I think arjuna should be a bit more resilient.
> More background in https://access.redhat.com/support/cases/01061583/
[JBoss JIRA] (JBTM-1674) btadmin.PauseDomainTest failed with ArrayIndexOutOfBoundsException
by Amos Feng (JIRA)
[ https://issues.jboss.org/browse/JBTM-1674?page=com.atlassian.jira.plugin.... ]
Amos Feng commented on JBTM-1674:
---------------------------------
DEBUG (HybridSocketEndpointQueue:296 ) - send on 0.0.0.0:0 with 1 bytes and buffer:
It looks like the apr_psprintf function returns a null string.
> btadmin.PauseDomainTest failed with ArrayIndexOutOfBoundsException
> ------------------------------------------------------------------
>
> Key: JBTM-1674
> URL: https://issues.jboss.org/browse/JBTM-1674
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: BlackTie
> Reporter: Amos Feng
> Assignee: Amos Feng
> Labels: linux64el5, linux64el6
> Fix For: 5.0.2
>
>
> http://172.17.131.2/view/Narayana+BlackTie/job/blacktie-linux64-el5/1501
> {noformat}
> -------------------------------------------------------------------------------
> Test set: org.jboss.narayana.blacktie.btadmin.PauseDomainTest
> -------------------------------------------------------------------------------
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 83.154 sec <<< FAILURE!
> testPauseDomainWithArg(org.jboss.narayana.blacktie.btadmin.PauseDomainTest) Time elapsed: 53.947 sec <<< FAILURE!
> junit.framework.AssertionFailedError: Command failed
> at junit.framework.Assert.fail(Assert.java:47)
> at org.jboss.narayana.blacktie.btadmin.PauseDomainTest.setUp(PauseDomainTest.java:48)
> at junit.framework.TestCase.runBare(TestCase.java:125)
> at junit.framework.TestResult$1.protect(TestResult.java:106)
> at junit.framework.TestResult.runProtected(TestResult.java:124)
> at junit.framework.TestResult.run(TestResult.java:109)
> at junit.framework.TestCase.run(TestCase.java:118)
> at junit.framework.TestSuite.runTest(TestSuite.java:208)
> at junit.framework.TestSuite.run(TestSuite.java:203)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.maven.surefire.junit.JUnitTestSet.execute(JUnitTestSet.java:98)
> at org.apache.maven.surefire.junit.JUnit3Provider.executeTestSet(JUnit3Provider.java:107)
> at org.apache.maven.surefire.junit.JUnit3Provider.invoke(JUnit3Provider.java:84)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.maven.surefire.booter.ProviderFactory$ClassLoaderProxy.invoke(ProviderFactory.java:103)
> at com.sun.proxy.$Proxy0.invoke(Unknown Source)
> at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:150)
> at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcess(SurefireStarter.java:91)
> at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:69)
> {noformat}
[JBoss JIRA] (JBTM-2147) afterCompletion may be called before transaction _completes_ if reaper fires for a transaction and XAResource::end throws RuntimeException
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-2147?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson edited comment on JBTM-2147 at 4/10/14 8:50 AM:
--------------------------------------------------------------
Hi,
Thanks for the report. Your latest comment is slightly different to your initial one (which also talks about the state of the server insofar as the transaction has not been disassociated from the thread), but it is now something I am happy to fix. Effectively you are saying that calling abort on a cached reference to a transaction (such as the reaper has) can cause the afterCompletion of Synchronizations to be triggered if an XAResource throws RuntimeException out of end.
What I propose is to fix that specific issue, i.e. prevent afterCompletion being called on subsequent attempts to roll back the transaction if it is broken by the XAR. You are correct that this is a state machine issue in Narayana: subsequent calls to rollback should not trigger afterCompletion.
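To make the intended state-machine behaviour concrete, here is a toy model of the guard. It is only a sketch: ToyTransaction, Branch and Sync are hypothetical stand-ins for the real transaction, XAResource and Synchronization types, and this is not the code that was merged into Narayana.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; names are hypothetical and do not mirror Narayana's internals. */
final class ToyTransaction {
    enum Status { ACTIVE, ABORTING, ABORTED }

    /** Stand-in for a Synchronization's afterCompletion callback. */
    interface Sync { void afterCompletion(Status status); }

    /** Stand-in for the XAResource calls made while rolling back a branch. */
    interface Branch { void end(); void rollback(); }

    private final List<Branch> branches = new ArrayList<>();
    private final List<Sync> syncs = new ArrayList<>();
    private Status status = Status.ACTIVE;

    void enlist(Branch branch) { branches.add(branch); }
    void register(Sync sync) { syncs.add(sync); }
    synchronized Status status() { return status; }

    /** Returns true only if the transaction has fully rolled back. */
    synchronized boolean rollback() {
        if (status != Status.ACTIVE) {
            // An earlier attempt either finished or was wedged in ABORTING.
            // Either way, this call must not notify synchronizations.
            return status == Status.ABORTED;
        }
        status = Status.ABORTING;
        for (Branch branch : branches) {
            branch.end();       // a RuntimeException here leaves status == ABORTING
            branch.rollback();
        }
        status = Status.ABORTED;
        for (Sync sync : syncs) {
            sync.afterCompletion(status);   // reached once, and only after completion
        }
        return true;
    }
}
```

In this model a second rollback() on a wedged transaction, for example from a reaper holding a cached reference, returns false and leaves the synchronizations untouched.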
I won't make further changes to the transaction manager to handle RuntimeExceptions from XAR::end (for example, to disassociate the transaction from the thread). Throwing them is not spec compliant, so the resource is the thing that should be fixed for that.
Note, you will still get a single ARJUNA012078: Abort called illegaly on atomic action 0:ffff40bab98c:1f71a485:5335ece7:3625a31) once in the log, but that would be expected in this scenario.
Thanks again for the report and I hope you concur with my approach,
Tom
was (Author: tomjenkinson):
Hi,
Thanks for the report. Your latest comment is slightly different to your initial one, but it is now something I am happy to fix. Effectively you are saying that a cached reference to a transaction (such as the reaper has) can cause the afterCompletion of Synchronizations to be triggered if an XAResource throws RuntimeException out of end.
What I propose is to fix that specific issue, i.e. prevent afterCompletion being called on subsequent attempts to rollback the transaction if it is broken by the XAR. You are correct that that is a state machine issue in Narayana that subsequent calls to rollback should not trigger afterCompletion.
I won't make further changes to the transaction manager to handle RuntimeExceptions from XAR::end and say disassociate the transaction from the thread as this is not spec compliant so the resource is the thing that should be fixed for that.
Note, you will still get a single ARJUNA012078: Abort called illegaly on atomic action 0:ffff40bab98c:1f71a485:5335ece7:3625a31) in the log, but that would be expected in this scenario.
Thanks again for the report and I hope you concur with my approach,
Tom
> afterCompletion may be called before transaction _completes_ if reaper fires for a transaction and XAResource::end throws RuntimeException
> ------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: JBTM-2147
> URL: https://issues.jboss.org/browse/JBTM-2147
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: JTS
> Affects Versions: 4.17.7
> Environment: jboss EAP 6.1.1
> Reporter: Koen Janssens
> Assignee: Tom Jenkinson
>
> EDITED BY TOM (to remove the report about the transaction not being disassociated):
> If the reaper calls cancel on a transaction that is wedged, the transaction stays in 'aborting' state but it notifies Synchronizations that the tx has ended contrary to the spec.
> This can be reproduced as follows:
> * A TX is started and both a DB (last)resource and hornetq (XA) resource get enlisted. The tx takes a long time and times out.
> * Arjuna reaper thread notices the time out and starts a worker thread to cancel the TX.
> * Before the worker thread can 'abort' the hornetq XA resource, the arjuna worker thread is interrupted (txReaperCancelWaitPeriod expires) by arjuna
> * Since the worker thread is interrupted, the hornetq XA resource throws an 'HornetQInterruptedException'
> * This unexpected exception causes arjuna to notify registered javax/transaction/Synchronization's (and return DB connection to the pool), without ending the Tx.
> ORIGINAL DESCRIPTION:
> We have noticed that arjuna leaks transactions when something 'unexpected' happens during tx aborting. The transaction stays in 'aborting' state while it does notify TxListeners that the tx has ended.
> This can be reproduced as follows:
> * A TX is started and both a DB (last)resource and hornetq (XA) resource get enlisted. The tx takes a long time and times out.
> * Arjuna reaper thread notices the time out and starts a worker thread to cancel the TX.
> * Before the worker thread can 'abort' the hornetq XA resource, the arjuna worker thread is interrupted (txReaperCancelWaitPeriod expires) by arjuna
> * Since the worker thread is interrupted, the hornetq XA resource throws an 'HornetQInterruptedException'
> * This unexpected exception causes arjuna to notify registered javax/transaction/Synchronization's (and return DB connection to the pool), without ending the Tx.
> From then on, jboss logs are full of:
> Trying to start a new transaction when old is not complete: Old: < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3625a31, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:3625bbc, subordinatenodename=null, eis_name=java:/XAOracleDS >, New < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3641efc, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:36420a1, subordinatenodename=null, eis_name=java:/XAOracleDS >, Flags 0
> Although the root cause is a misbehaving hornetq resource, I think arjuna should be a bit more resilient.
> More background in https://access.redhat.com/support/cases/01061583/
[JBoss JIRA] (JBTM-2147) afterCompletion may be called before transaction _completes_ if reaper fires for a transaction and XAResource::end throws RuntimeException
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-2147?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson updated JBTM-2147:
--------------------------------
Description:
EDITED BY TOM (to remove the report about the transaction not being disassociated):
If the reaper calls cancel on a transaction that is wedged, the transaction stays in 'aborting' state but it notifies Synchronizations that the tx has ended contrary to the spec.
This can be reproduced as follows:
* A TX is started and both a DB (last)resource and hornetq (XA) resource get enlisted. The tx takes a long time and times out.
* Arjuna reaper thread notices the time out and starts a worker thread to cancel the TX.
* Before the worker thread can 'abort' the hornetq XA resource, the arjuna worker thread is interrupted (txReaperCancelWaitPeriod expires) by arjuna
* Since the worker thread is interrupted, the hornetq XA resource throws an 'HornetQInterruptedException'
* This unexpected exception causes arjuna to notify registered javax/transaction/Synchronization's (and return DB connection to the pool), without ending the Tx.
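The interrupt sequence in the steps above can be imitated without any transaction manager. The sketch below is an assumption-laden stand-in: blockingEnd() plays the role of a resource call that blocks and turns an interrupt into an unchecked exception, and cancelWithTimeout() plays the role of a supervisor that waits for a cancel period before interrupting its worker. The class and method names are hypothetical, and IllegalStateException merely stands in for the resource's own exception type.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical demo; it does not use Narayana or HornetQ classes. */
final class ReaperInterruptDemo {

    /** Stand-in for a resource call that blocks and rethrows an interrupt unchecked. */
    static void blockingEnd() {
        try {
            Thread.sleep(60_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while ending branch", e);
        }
    }

    /**
     * Runs blockingEnd() on a worker thread, interrupts the worker if it is
     * still running after waitMillis, and returns whatever the worker threw
     * (or null if it finished normally).
     */
    static Throwable cancelWithTimeout(long waitMillis) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            try {
                blockingEnd();
            } catch (RuntimeException e) {
                failure.set(e);
            }
        }, "cancel-worker");
        worker.start();
        try {
            worker.join(waitMillis);    // analogous to waiting for the cancel wait period
            if (worker.isAlive()) {
                worker.interrupt();     // the supervisor gives up on the worker
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failure.get();
    }
}
```

Calling ReaperInterruptDemo.cancelWithTimeout(50) returns the unchecked exception raised inside the worker, which is the kind of failure the transaction manager then has to cope with.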
ORIGINAL DESCRIPTION:
We have noticed that arjuna leaks transactions when something 'unexpected' happens during tx aborting. The transaction stays in 'aborting' state while it does notify TxListeners that the tx has ended.
This can be reproduced as follows:
* A TX is started and both a DB (last)resource and hornetq (XA) resource get enlisted. The tx takes a long time and times out.
* Arjuna reaper thread notices the time out and starts a worker thread to cancel the TX.
* Before the worker thread can 'abort' the hornetq XA resource, the arjuna worker thread is interrupted (txReaperCancelWaitPeriod expires) by arjuna
* Since the worker thread is interrupted, the hornetq XA resource throws an 'HornetQInterruptedException'
* This unexpected exception causes arjuna to notify registered javax/transaction/Synchronization's (and return DB connection to the pool), without ending the Tx.
From then on, jboss logs are full of:
Trying to start a new transaction when old is not complete: Old: < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3625a31, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:3625bbc, subordinatenodename=null, eis_name=java:/XAOracleDS >, New < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3641efc, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:36420a1, subordinatenodename=null, eis_name=java:/XAOracleDS >, Flags 0
Although the root cause is a misbehaving hornetq resource, I think arjuna should be a bit more resilient.
More background in https://access.redhat.com/support/cases/01061583/
was:
We have noticed that arjuna leaks transactions when something 'unexpected' happens during tx aborting. The transaction stays in 'aborting' state while it does notify TxListeners that the tx has ended.
This can be reproduced as follows:
* A TX is started and both a DB (last)resource and hornetq (XA) resource get enlisted. The tx takes a long time and times out.
* Arjuna reaper thread notices the time out and starts a worker thread to cancel the TX.
* Before the worker thread can 'abort' the hornetq XA resource, the arjuna worker thread is interrupted (txReaperCancelWaitPeriod expires) by arjuna
* Since the worker thread is interrupted, the hornetq XA resource throws an 'HornetQInterruptedException'
* This unexpected exception causes arjuna to notify registered javax/transaction/Synchronization's (and return DB connection to the pool), without ending the Tx.
From then on, jboss logs are full of:
Trying to start a new transaction when old is not complete: Old: < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3625a31, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:3625bbc, subordinatenodename=null, eis_name=java:/XAOracleDS >, New < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3641efc, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:36420a1, subordinatenodename=null, eis_name=java:/XAOracleDS >, Flags 0
Although the root cause is a misbehaving hornetq resource, I think arjuna should be a bit more resilient.
More background in https://access.redhat.com/support/cases/01061583/
> afterCompletion may be called before transaction _completes_ if reaper fires for a transaction and XAResource::end throws RuntimeException
> ------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: JBTM-2147
> URL: https://issues.jboss.org/browse/JBTM-2147
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: JTS
> Affects Versions: 4.17.7
> Environment: jboss EAP 6.1.1
> Reporter: Koen Janssens
> Assignee: Tom Jenkinson
>
> EDITED BY TOM (to remove the report about the transaction not being disassociated):
> If the reaper calls cancel on a transaction that is wedged, the transaction stays in 'aborting' state but it notifies Synchronizations that the tx has ended contrary to the spec.
> This can be reproduced as follows:
> * A TX is started and both a DB (last)resource and hornetq (XA) resource get enlisted. The tx takes a long time and times out.
> * Arjuna reaper thread notices the time out and starts a worker thread to cancel the TX.
> * Before the worker thread can 'abort' the hornetq XA resource, the arjuna worker thread is interrupted (txReaperCancelWaitPeriod expires) by arjuna
> * Since the worker thread is interrupted, the hornetq XA resource throws an 'HornetQInterruptedException'
> * This unexpected exception causes arjuna to notify registered javax/transaction/Synchronization's (and return DB connection to the pool), without ending the Tx.
> ORIGINAL DESCRIPTION:
> We have noticed that arjuna leaks transactions when something 'unexpected' happens during tx aborting. The transaction stays in 'aborting' state while it does notify TxListeners that the tx has ended.
> This can be reproduced as follows:
> * A TX is started and both a DB (last)resource and hornetq (XA) resource get enlisted. The tx takes a long time and times out.
> * Arjuna reaper thread notices the time out and starts a worker thread to cancel the TX.
> * Before the worker thread can 'abort' the hornetq XA resource, the arjuna worker thread is interrupted (txReaperCancelWaitPeriod expires) by arjuna
> * Since the worker thread is interrupted, the hornetq XA resource throws an 'HornetQInterruptedException'
> * This unexpected exception causes arjuna to notify registered javax/transaction/Synchronization's (and return DB connection to the pool), without ending the Tx.
> From then on, jboss logs are full of:
> Trying to start a new transaction when old is not complete: Old: < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3625a31, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:3625bbc, subordinatenodename=null, eis_name=java:/XAOracleDS >, New < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3641efc, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:36420a1, subordinatenodename=null, eis_name=java:/XAOracleDS >, Flags 0
> Although the root cause is a misbehaving hornetq resource, I think arjuna should be a bit more resilient.
> More background in https://access.redhat.com/support/cases/01061583/
[JBoss JIRA] (JBTM-2147) Transactions are leaked when XAResource misbehaves
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-2147?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson reopened JBTM-2147:
---------------------------------
> Transactions are leaked when XAResource misbehaves
> --------------------------------------------------
>
> Key: JBTM-2147
> URL: https://issues.jboss.org/browse/JBTM-2147
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: JTS
> Affects Versions: 4.17.7
> Environment: jboss EAP 6.1.1
> Reporter: Koen Janssens
> Assignee: Tom Jenkinson
>
> We have noticed that arjuna leaks transactions when something 'unexpected' happens during tx aborting. The transaction stays in 'aborting' state while it does notify TxListeners that the tx has ended.
> This can be reproduced as follows:
> * A TX is started and both a DB (last)resource and hornetq (XA) resource get enlisted. The tx takes a long time and times out.
> * Arjuna reaper thread notices the time out and starts a worker thread to cancel the TX.
> * Before the worker thread can 'abort' the hornetq XA resource, the arjuna worker thread is interrupted (txReaperCancelWaitPeriod expires) by arjuna
> * Since the worker thread is interrupted, the hornetq XA resource throws an 'HornetQInterruptedException'
> * This unexpected exception causes arjuna to notify registered javax/transaction/Synchronization's (and return DB connection to the pool), without ending the Tx.
> From then on, jboss logs are full of:
> Trying to start a new transaction when old is not complete: Old: < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3625a31, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:3625bbc, subordinatenodename=null, eis_name=java:/XAOracleDS >, New < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3641efc, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:36420a1, subordinatenodename=null, eis_name=java:/XAOracleDS >, Flags 0
> Although the root cause is a misbehaving hornetq resource, I think arjuna should be a bit more resilient.
> More background in https://access.redhat.com/support/cases/01061583/
[JBoss JIRA] (JBTM-2147) Transactions are leaked when XAResource misbehaves
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-2147?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson commented on JBTM-2147:
-------------------------------------
Hi,
Thanks for the report. Your latest comment is slightly different to your initial one, but it is now something I am happy to fix. Effectively you are saying that a cached reference to a transaction (such as the reaper has) can cause the afterCompletion of Synchronizations to be triggered if an XAResource throws RuntimeException out of end.
What I propose is to fix that specific issue, i.e. prevent afterCompletion being called on subsequent attempts to roll back the transaction if it is broken by the XAR. You are correct that this is a state machine issue in Narayana: subsequent calls to rollback should not trigger afterCompletion.
I won't make further changes to the transaction manager to handle RuntimeExceptions from XAR::end (for example, to disassociate the transaction from the thread). Throwing them is not spec compliant, so the resource is the thing that should be fixed for that.
Note, you will still get a single ARJUNA012078: Abort called illegaly on atomic action 0:ffff40bab98c:1f71a485:5335ece7:3625a31) in the log, but that would be expected in this scenario.
Thanks again for the report and I hope you concur with my approach,
Tom
> Transactions are leaked when XAResource misbehaves
> --------------------------------------------------
>
> Key: JBTM-2147
> URL: https://issues.jboss.org/browse/JBTM-2147
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: JTS
> Affects Versions: 4.17.7
> Environment: jboss EAP 6.1.1
> Reporter: Koen Janssens
> Assignee: Tom Jenkinson
>
> We have noticed that arjuna leaks transactions when something 'unexpected' happens during tx aborting. The transaction stays in 'aborting' state while it does notify TxListeners that the tx has ended.
> This can be reproduced as follows:
> * A TX is started and both a DB (last)resource and hornetq (XA) resource get enlisted. The tx takes a long time and times out.
> * Arjuna reaper thread notices the time out and starts a worker thread to cancel the TX.
> * Before the worker thread can 'abort' the hornetq XA resource, the arjuna worker thread is interrupted (txReaperCancelWaitPeriod expires) by arjuna
> * Since the worker thread is interrupted, the hornetq XA resource throws an 'HornetQInterruptedException'
> * This unexpected exception causes arjuna to notify registered javax/transaction/Synchronization's (and return DB connection to the pool), without ending the Tx.
> From then on, jboss logs are full of:
> Trying to start a new transaction when old is not complete: Old: < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3625a31, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:3625bbc, subordinatenodename=null, eis_name=java:/XAOracleDS >, New < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3641efc, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:36420a1, subordinatenodename=null, eis_name=java:/XAOracleDS >, Flags 0
> Although the root cause is a misbehaving hornetq resource, I think arjuna should be a bit more resilient.
> More background in https://access.redhat.com/support/cases/01061583/