[jbossts-issues] [JBoss JIRA] Commented: (JBTM-226) Possible race condition between top-down and bottom-up recovery

Marko Kanala (JIRA) jira-events at lists.jboss.org
Thu Oct 4 03:08:03 EDT 2007


    [ http://jira.jboss.com/jira/browse/JBTM-226?page=comments#action_12380349 ] 
            
Marko Kanala commented on JBTM-226:
-----------------------------------

Could this kind of a problem exist on 4.2.0-GA version ? A production environment I'm involved with stalled mysteriously from 100% cpu usage with no clear indication for the fault:

...
2007-10-04 03:10:15,254 DEBUG [org.jboss.ejb.StatelessSessionContainer] Useless invocation of remove() for stateless session bean
2007-10-04 03:11:59,427 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - first pass <to, 4 loka 2007 03:11:59>
2007-10-04 03:11:59,427 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] StatusModule: first pass 
2007-10-04 03:11:59,427 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_3] - TORecoveryModule - first pass
2007-10-04 03:11:59,428 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.firstpass] Local XARecoveryModule - first pass
2007-10-04 03:12:09,437 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - second pass <to, 4 loka 2007 03:12:09>
2007-10-04 03:12:09,437 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] AtomicActionRecoveryModule: Second pass 
2007-10-04 03:12:09,437 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_6] - TORecoveryModule - second pass
2007-10-04 03:12:09,437 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.secondpass] Local XARecoveryModule - second pass
2007-10-04 03:12:32,297 DEBUG [org.jboss.resource.connectionmanager.IdleRemover] run: IdleRemover notifying pools, interval: 150000
2007-10-04 03:14:09,438 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - first pass <to, 4 loka 2007 03:14:09>
2007-10-04 03:14:09,438 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] StatusModule: first pass 
2007-10-04 03:14:09,438 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_3] - TORecoveryModule - first pass
2007-10-04 03:14:09,439 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.firstpass] Local XARecoveryModule - first pass
2007-10-04 03:14:19,448 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - second pass <to, 4 loka 2007 03:14:19>
2007-10-04 03:14:19,448 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] AtomicActionRecoveryModule: Second pass 
2007-10-04 03:14:19,448 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_6] - TORecoveryModule - second pass
2007-10-04 03:14:19,448 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.secondpass] Local XARecoveryModule - second pass
2007-10-04 03:15:02,308 DEBUG [org.jboss.resource.connectionmanager.IdleRemover] run: IdleRemover notifying pools, interval: 150000
2007-10-04 03:16:19,449 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - first pass <to, 4 loka 2007 03:16:19>
2007-10-04 03:16:19,449 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] StatusModule: first pass 
2007-10-04 03:16:19,450 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_3] - TORecoveryModule - first pass
2007-10-04 03:16:19,450 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.firstpass] Local XARecoveryModule - first pass
2007-10-04 03:16:29,459 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - second pass <to, 4 loka 2007 03:16:29>
2007-10-04 03:16:29,459 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] AtomicActionRecoveryModule: Second pass 
2007-10-04 03:16:29,459 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_6] - TORecoveryModule - second pass
2007-10-04 03:16:29,460 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.secondpass] Local XARecoveryModule - second pass
2007-10-04 03:17:32,320 DEBUG [org.jboss.resource.connectionmanager.IdleRemover] run: IdleRemover notifying pools, interval: 150000
2007-10-04 03:18:29,460 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - first pass <to, 4 loka 2007 03:18:29>
2007-10-04 03:18:29,461 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] StatusModule: first pass 
2007-10-04 03:18:29,461 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_3] - TORecoveryModule - first pass
2007-10-04 03:18:29,461 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.firstpass] Local XARecoveryModule - first pass
2007-10-04 03:18:39,470 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - second pass <to, 4 loka 2007 03:18:39>
2007-10-04 03:18:39,470 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] AtomicActionRecoveryModule: Second pass 
2007-10-04 03:18:39,471 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_6] - TORecoveryModule - second pass
2007-10-04 03:18:39,471 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.secondpass] Local XARecoveryModule - second pass
2007-10-04 03:20:02,491 DEBUG [org.jboss.resource.connectionmanager.IdleRemover] run: IdleRemover notifying pools, interval: 150000
2007-10-04 03:20:39,472 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - first pass <to, 4 loka 2007 03:20:39>
2007-10-04 03:20:39,472 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] StatusModule: first pass 
2007-10-04 03:20:39,472 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_3] - TORecoveryModule - first pass
2007-10-04 03:20:39,473 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.firstpass] Local XARecoveryModule - first pass
2007-10-04 03:20:49,482 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - second pass <to, 4 loka 2007 03:20:49>
2007-10-04 03:20:49,482 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] AtomicActionRecoveryModule: Second pass 
2007-10-04 03:20:49,482 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_6] - TORecoveryModule - second pass
2007-10-04 03:20:49,482 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.secondpass] Local XARecoveryModule - second pass
2007-10-04 03:22:32,493 DEBUG [org.jboss.resource.connectionmanager.IdleRemover] run: IdleRemover notifying pools, interval: 150000
2007-10-04 03:22:49,484 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - first pass <to, 4 loka 2007 03:22:49>
2007-10-04 03:22:49,484 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] StatusModule: first pass 
2007-10-04 03:22:49,484 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_3] - TORecoveryModule - first pass
2007-10-04 03:22:49,485 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.firstpass] Local XARecoveryModule - first pass
2007-10-04 03:22:59,494 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - second pass <to, 4 loka 2007 03:22:59>
2007-10-04 03:22:59,494 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] AtomicActionRecoveryModule: Second pass 
2007-10-04 03:22:59,507 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_6] - TORecoveryModule - second pass
2007-10-04 03:22:59,507 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.secondpass] Local XARecoveryModule - second pass
2007-10-04 03:24:59,515 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - first pass <to, 4 loka 2007 03:24:59>
2007-10-04 03:24:59,515 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] StatusModule: first pass 
2007-10-04 03:24:59,515 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_3] - TORecoveryModule - first pass
2007-10-04 03:24:59,516 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.firstpass] Local XARecoveryModule - first pass
2007-10-04 03:25:02,495 DEBUG [org.jboss.resource.connectionmanager.IdleRemover] run: IdleRemover notifying pools, interval: 150000
2007-10-04 03:25:09,525 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - second pass <to, 4 loka 2007 03:25:09>
2007-10-04 03:25:09,525 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] AtomicActionRecoveryModule: Second pass 
2007-10-04 03:25:09,525 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_6] - TORecoveryModule - second pass
2007-10-04 03:25:09,525 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.secondpass] Local XARecoveryModule - second pass
2007-10-04 03:25:28,775 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] 
  --- ExpiredEntryMonitor ----to, 4 loka 2007 03:25:28----
2007-10-04 03:27:09,526 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - first pass <to, 4 loka 2007 03:27:09>
2007-10-04 03:27:09,526 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] StatusModule: first pass 
2007-10-04 03:27:09,527 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_3] - TORecoveryModule - first pass
2007-10-04 03:27:09,527 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.firstpass] Local XARecoveryModule - first pass
2007-10-04 03:27:19,536 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - second pass <to, 4 loka 2007 03:27:19>
2007-10-04 03:27:19,536 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] AtomicActionRecoveryModule: Second pass 
2007-10-04 03:27:19,536 DEBUG [com.arjuna.ats.txoj.logging.txojLoggerI18N] [com.arjuna.ats.internal.txoj.recovery.TORecoveryModule_6] - TORecoveryModule - second pass
2007-10-04 03:27:19,537 DEBUG [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.info.secondpass] Local XARecoveryModule - second pass
2007-10-04 03:27:32,496 DEBUG [org.jboss.resource.connectionmanager.IdleRemover] run: IdleRemover notifying pools, interval: 150000
2007-10-04 03:29:19,537 DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] Periodic recovery - first pass <to, 4 loka 2007 03:29:19>
...

The jboss had to be restarted to get rid of the 100% cpu usage.


> Possible race condition between top-down and bottom-up recovery
> ---------------------------------------------------------------
>
>                 Key: JBTM-226
>                 URL: http://jira.jboss.com/jira/browse/JBTM-226
>             Project: JBoss Transaction Manager
>          Issue Type: Bug
>      Security Level: Public(Everyone can see) 
>          Components: JTA Implementation
>    Affects Versions: 4.2.3.SP4
>         Environment: Mac OS X, JDK 1.4.2.
>            Reporter: Mark Little
>         Assigned To: Mark Little
>            Priority: Critical
>             Fix For: 4.6
>
>
> In local JTA recovery for non-serializable XAResources, the XAResourceRecovery implementation is needed to obtain a new XAResource. If a new resource isn't obtained (because the XAResourceRecovery instances haven't all been triggered yet), top-down recovery retries on subsequent recovery sweeps. Meanwhile, bottom up recovery can kick in and use an XAResource from a XAResourceRecovery that is now available and fail to locate any information on the transaction that will drive it top-down. Upon failing to find the transaction, it uses presumed abort and rolls back the instance, when in fact it should wait and commit it. But it doesn't do that until several phases have passed, and in which case most of the time top-down recovery will have re-run, found the XAResource and committed the transaction.
> I'm fairly sure this is a race condition and does exist, but am still checking. Work around exists at the moment though.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.jboss.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        



More information about the jbossts-issues mailing list