XAResource problem was Re: Naming test timeout was Re: [jboss-dev] PUSH

Andrew Dinn adinn at redhat.com
Thu Jun 26 12:55:47 EDT 2008


Adrian Brock wrote:
> The first mention of this "broken" XAResource looks to be after the
> ha invoker tests, can you have a look at the log to see
> if there are any other clues?
> 
> This is about 13 minutes before the naming test hangs.
> 
> http://hudson.jboss.org/hudson/job/JBoss-AS-5.0.x-TestSuite-sun15/lastBuild/artifact/trunk/build/output/jboss-5.0.0.CR1/server/all/log/server.log

Well, working backwards from the first occurence of the recovery 
failures I found this report of a heuristic outcome:


2008-06-26 03:50:58,957 INFO 
[org.jboss.test.jca.adapter.TestManagedConnection] 
(WorkerThread#39[10.18.94.175:52882]) rollback with 
xid=1-a120086:bcb0:486344fa:3ea2 for TestManagedConnection#1{xid=null 
destroyed=false}
2008-06-26 03:50:58,969 ERROR [org.jboss.ejb.plugins.LogInterceptor] 
(WorkerThread#39[10.18.94.175:52882]) 
TransactionRolledbackLocalException in method: public abstract void 
org.jboss.test.jca.interfaces.XAExceptionSessionLocal.testXAExceptionToTransactionRolledbackException(), 
causedBy:
javax.transaction.HeuristicMixedException
	at 
com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commit(TransactionImple.java:252)
	at 
org.jboss.ejb.plugins.TxInterceptorCMT.endTransaction(TxInterceptorCMT.java:501)
	at 
org.jboss.ejb.plugins.TxInterceptorCMT.runWithTransactions(TxInterceptorCMT.java:361)
	. . .
2008-06-26 03:50:58,970 INFO 
[org.jboss.test.jca.ejb.XAExceptionTestSessionBean] 
(WorkerThread#39[10.18.94.175:52882]) Test worked

The next mention of the associated transaction 
(a120086:bcb0:486344fa:3ea2) is in the recovery failure message

2008-06-26 03:52:18,342 DEBUG 
[com.arjuna.ats.arjuna.logging.arjLoggerI18N] 
(Server.Connection:10.18.0.134:34373) 
[com.arjuna.ats.arjuna.recovery.ActionStatusService_1] - 
transactionType: 
/StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction uid: 
a120086:bcb0:486344fa:3ea2
  Status is 7
2008-06-26 03:52:18,344 WARN  [com.arjuna.ats.jta.logging.loggerI18N] 
(Thread-19) 
[com.arjuna.ats.internal.jta.resources.arjunacore.norecoveryxa] 
[com.arjuna.ats.internal.jta.resources.arjunacore.norecoveryxa] Could 
not find new XAResource to use for recovering non-serializable 
XAResource < 131075, 28, 26, 
4945974950484856545898999848585256545152521029758511019750974950484856545898999848585256545152521029758511019755 
 >
2008-06-26 03:52:18,344 WARN 
[com.arjuna.ats.arjuna.logging.arjLoggerI18N] (Thread-19) 
[com.arjuna.ats.internal.arjuna.gandiva.inventory.StaticInventory_1] - 
cannot find null implementation.
2008-06-26 03:52:18,344 WARN 
[com.arjuna.ats.arjuna.logging.arjLoggerI18N] (Thread-19) 
[com.arjuna.ats.arjuna.recovery.RecoverAtomicAction_4] - 
RecoverAtomicAction: transaction a120086:bcb0:486344fa:3ea2 not 
activated, unable to replay phase 2 commit

Now, when the heuristic outcome happens the failed TX is written to the 
object store in order that it can be retrieved for later analysis. The 
error happens when the recovery thread tries to scan it and finds a 
corrupt entry. So, somewhere between these two messages the object store 
appears to have been trashed. The same appears to be the case for the 
other TXs which also have commitwhenaborted failures.

In between the two messages there are a whole series of errors with 
messages from 
org.jboss.test.jca.adapter.TestManagedConnectionFactory.createManagedConnection
Throwable while attempting to get a new connection: null

followed by yet more ResourceExceptions

javax.resource.ResourceException:
	at 
org.jboss.test.jca.adapter.TestManagedConnectionFactory.matchManagedConnections(TestManagedConnectionFactory.java:150)


and even RuntimeExceptions

java.lang.RuntimeException:
	at 
org.jboss.test.jca.adapter.TestManagedConnectionFactory.matchManagedConnections(TestManagedConnectionFactory.java:152)

These are also followed by some TX is not active and rollback errors. I 
have no idea if these are related. I doubt it though, because none of 
them should have any effect on the object store contents for the now 
rolled back heuristic TX. It's dead, Jim -- at least until recovery 
kicks in.

Is this running clustered? Are the clusters using independent object 
stores? Could there be a caching issue here which allows another server 
to think the TX is active? This is probably a more fruitful line of 
enquiry than anything I can see in the log.

regards,



Andrew Dinn
-----------



More information about the jboss-development mailing list