[
https://issues.jboss.org/browse/ISPN-2719?page=com.atlassian.jira.plugin....
]
Galder Zamarreño commented on ISPN-2719:
----------------------------------------
The root exception is:
{code}org.infinispan.CacheException: Remote transaction for global transaction
(RecoveryAwareGlobalTransaction{xid=< 131077, 29, 36,
0000000000-1-11033-112-11200-6-480-10-12887002812549,
0000000000-1-11033-112-11200-6-480-10-12887002812600000000 >,
internalId=281479271677955}
GlobalTransaction:<SyncCacheListenerTest-NodeA-49251>:7943:remote) not found
at
org.infinispan.transaction.xa.recovery.RecoveryAwareTransactionTable.remoteTransactionPrepared(RecoveryAwareTransactionTable.java:73)
at
org.infinispan.interceptors.TxInterceptor.visitPrepareCommand(TxInterceptor.java:123)
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:132)
{code}
I think there's a bug in the way recovery information is removed from the cluster.
What's going on here is the following:
Both prepare and TxCompletionNotificationCommand are sent in paralel (prepare in OOB2-x
thread and notification in OOB1-x thread) and the tx completion notification removes the
transaction from the RecoveryAwareTransactionTable before the prepare has finished doing
its job, see snippet for detailed log lines:
https://gist.github.com/94fc78ec5cc42a821471
Based on previous log messages, we see:
{code}2013-01-16 11:30:39,921 258299 TRACE
[org.infinispan.transaction.xa.recovery.RecoveryManagerImpl]
(testng-SyncCacheListenerTest:)
Forgetting tx information for RecoveryAwareGlobalTransaction{xid=< 131077, 29, 36,
0000000000-1-11033-112-11200-6-480-10-12887002811749,
0000000000-1-11033-112-11200-6-480-10-12887002811800000000 >,
internalId=281479271677954}
GlobalTransaction:<SyncCacheListenerTest-NodeA-49251>:7941:local{code}
Which means that removal comes from the sender node executing
RecoveryManagerImpl.removeRecoveryInformationFromCluster, and based on the log calls
around it, it's coming from either:
a) TransactionXaAdapter.forgetSuccessfullyCompletedTransaction or
b) RecoveryManagerImpl.completeTransaction
Both calls have hardcoded sync=false as parameter, so there's a bug here: either:
a) wherever that call is coming from should set sync=true by default
b) or should the value of the parameter be manageable by a configuration where the default
is true to avoid transactions failing for no reason.
c) is the TX notification command being sent far too soon? Shouldn't it be sent after
prepare()? The configuration used is:
{code}ConfigurationBuilder builder =
getDefaultClusteredCacheConfig(CacheMode.REPL_SYNC, true);
builder.locking().isolationLevel(IsolationLevel.SERIALIZABLE)
.lockAcquisitionTimeout(5000)
// TODO: Another case of default values changed (see ISPN-2651)
.transaction().useSynchronization(false);{code}
If you look at getDefaultClusteredCacheConfig(), it does already set the two parameters
needed to make sure commit/rollback phases are synch:
{code}.transaction().syncCommitPhase(true).syncRollbackPhase(true){code}
Assigning to Mircea who's the expert on this.
SyncCacheListenerTest.testRemoteCacheListener failing randomly
--------------------------------------------------------------
Key: ISPN-2719
URL:
https://issues.jboss.org/browse/ISPN-2719
Project: Infinispan
Issue Type: Bug
Components: Listeners
Reporter: Galder Zamarreño
Assignee: Galder Zamarreño
Labels: testsuite_stability
Fix For: 5.2.0.Final
Attachments: testRemoteCacheListener-5.log
{code}testRemoteCacheListener(org.infinispan.replication.SyncCacheListenerTest) Time
elapsed: 0.059 sec <<< FAILURE!
org.infinispan.CacheException: Could not commit implicit transaction
at org.infinispan.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1159)
at org.infinispan.CacheImpl.removeInternal(CacheImpl.java:310)
at org.infinispan.CacheImpl.remove(CacheImpl.java:304)
at org.infinispan.CacheImpl.remove(CacheImpl.java:298)
at
org.infinispan.replication.SyncCacheListenerTest.testRemoteCacheListener(SyncCacheListenerTest.java:113)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:715)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:907)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1237)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
at org.testng.TestRunner.privateRun(TestRunner.java:767)
at org.testng.TestRunner.run(TestRunner.java:617)
at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
Caused by: javax.transaction.RollbackException: ARJUNA016053: Could not commit
transaction.
at
com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1177)
at
com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:117)
at org.infinispan.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1156)
... 25 more{code}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
http://www.atlassian.com/software/jira