[infinispan-issues] [JBoss JIRA] (ISPN-2719) SyncCacheListenerTest.testRemoteCacheListener failing randomly
Galder Zamarreño (JIRA)
jira-events at lists.jboss.org
Mon Jan 21 09:03:22 EST 2013
[ https://issues.jboss.org/browse/ISPN-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749191#comment-12749191 ]
Galder Zamarreño commented on ISPN-2719:
----------------------------------------
The root exception is:
{code}org.infinispan.CacheException: Remote transaction for global transaction
(RecoveryAwareGlobalTransaction{xid=< 131077, 29, 36, 0000000000-1-11033-112-11200-6-480-10-12887002812549, 0000000000-1-11033-112-11200-6-480-10-12887002812600000000 >, internalId=281479271677955} GlobalTransaction:<SyncCacheListenerTest-NodeA-49251>:7943:remote) not found
at org.infinispan.transaction.xa.recovery.RecoveryAwareTransactionTable.remoteTransactionPrepared(RecoveryAwareTransactionTable.java:73)
at org.infinispan.interceptors.TxInterceptor.visitPrepareCommand(TxInterceptor.java:123)
at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:132)
{code}
I think there's a bug in the way recovery information is removed from the cluster. What's going on here is the following:
Both prepare and TxCompletionNotificationCommand are sent in paralel (prepare in OOB2-x thread and notification in OOB1-x thread) and the tx completion notification removes the transaction from the RecoveryAwareTransactionTable before the prepare has finished doing its job, see snippet for detailed log lines: https://gist.github.com/94fc78ec5cc42a821471
Based on previous log messages, we see:
{code}2013-01-16 11:30:39,921 258299 TRACE [org.infinispan.transaction.xa.recovery.RecoveryManagerImpl] (testng-SyncCacheListenerTest:)
Forgetting tx information for RecoveryAwareGlobalTransaction{xid=< 131077, 29, 36, 0000000000-1-11033-112-11200-6-480-10-12887002811749, 0000000000-1-11033-112-11200-6-480-10-12887002811800000000 >, internalId=281479271677954} GlobalTransaction:<SyncCacheListenerTest-NodeA-49251>:7941:local{code}
Which means that removal comes from the sender node executing RecoveryManagerImpl.removeRecoveryInformationFromCluster, and based on the log calls around it, it's coming from either:
a) TransactionXaAdapter.forgetSuccessfullyCompletedTransaction or
b) RecoveryManagerImpl.completeTransaction
Both calls have hardcoded sync=false as parameter, so there's a bug here: either:
a) wherever that call is coming from should set sync=true by default
b) or should the value of the parameter be manageable by a configuration where the default is true to avoid transactions failing for no reason.
c) is the TX notification command being sent far too soon? Shouldn't it be sent after prepare()? The configuration used is:
{code}ConfigurationBuilder builder =
getDefaultClusteredCacheConfig(CacheMode.REPL_SYNC, true);
builder.locking().isolationLevel(IsolationLevel.SERIALIZABLE)
.lockAcquisitionTimeout(5000)
// TODO: Another case of default values changed (see ISPN-2651)
.transaction().useSynchronization(false);{code}
If you look at getDefaultClusteredCacheConfig(), it does already set the two parameters needed to make sure commit/rollback phases are synch:
{code}.transaction().syncCommitPhase(true).syncRollbackPhase(true){code}
Assigning to Mircea who's the expert on this.
> SyncCacheListenerTest.testRemoteCacheListener failing randomly
> --------------------------------------------------------------
>
> Key: ISPN-2719
> URL: https://issues.jboss.org/browse/ISPN-2719
> Project: Infinispan
> Issue Type: Bug
> Components: Listeners
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Labels: testsuite_stability
> Fix For: 5.2.0.Final
>
> Attachments: testRemoteCacheListener-5.log
>
>
> {code}testRemoteCacheListener(org.infinispan.replication.SyncCacheListenerTest) Time elapsed: 0.059 sec <<< FAILURE!
> org.infinispan.CacheException: Could not commit implicit transaction
> at org.infinispan.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1159)
> at org.infinispan.CacheImpl.removeInternal(CacheImpl.java:310)
> at org.infinispan.CacheImpl.remove(CacheImpl.java:304)
> at org.infinispan.CacheImpl.remove(CacheImpl.java:298)
> at org.infinispan.replication.SyncCacheListenerTest.testRemoteCacheListener(SyncCacheListenerTest.java:113)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:715)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:907)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1237)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:680)
> Caused by: javax.transaction.RollbackException: ARJUNA016053: Could not commit transaction.
> at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1177)
> at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:117)
> at org.infinispan.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1156)
> ... 25 more{code}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
More information about the infinispan-issues
mailing list