[JBoss JIRA] (ISPN-2974) DeltaAware based fine-grained replication corrupts cache data, if eviction is enabled
by Paul Ferraro (JIRA)
[ https://issues.jboss.org/browse/ISPN-2974?page=com.atlassian.jira.plugin.... ]
Paul Ferraro commented on ISPN-2974:
------------------------------------
Please backport this fix to the 5.2.x branch.
> DeltaAware based fine-grained replication corrupts cache data, if eviction is enabled
> -------------------------------------------------------------------------------------
>
> Key: ISPN-2974
> URL: https://issues.jboss.org/browse/ISPN-2974
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.1.Final, 5.2.5.Final
> Reporter: Horia Chiorean
> Assignee: Adrian Nistor
> Priority: Critical
> Labels: 5.2.x
> Fix For: 5.3.0.Beta1
>
>
> When using a custom {{DeltaAware}} implementation in a cluster of 2 replicated nodes with eviction enabled, data transferred from one node (the writer) to the other (the reader) causes an entry that is stored on the reader, but evicted at the time of the change, to be overwritten with whatever the latest partial delta was.
> In more detail:
> * configure 2 nodes in replicated mode, with eviction enabled
> * consider NodeA the writer and NodeB the reader
> * NodeA inserts some data (custom entries) into the cache
> * NodeB correctly receives the initial data via state transfer
> * NodeA loads and partially updates an entry which was not in the cache because it had been evicted previously
> * NodeB receives the partial delta with the changes from NodeA, but *instead of merging* with whatever is stored in the persistent store, *replaces the entire entry in the cache*, leaving it in effect with "partial/corrupt information"
> If eviction is not enabled, everything works as expected.
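The difference between merging and replacing described in the report can be illustrated with a minimal sketch. It does not use Infinispan's {{DeltaAware}} API; the class name, the method {{applyDelta}}, the flag {{loadFromStoreWhenEvicted}} and the map-based entry are all hypothetical stand-ins, chosen only to show why applying a partial delta to an evicted entry without consulting the persistent store loses fields.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names are Infinispan APIs.
class DeltaMergeSketch {

    /**
     * Applies a partial delta on the reader node.
     * inMemory is null when the entry has been evicted from the cache.
     */
    static Map<String, String> applyDelta(Map<String, String> inMemory,
                                          Map<String, String> persisted,
                                          Map<String, String> delta,
                                          boolean loadFromStoreWhenEvicted) {
        Map<String, String> base = inMemory;
        if (base == null && loadFromStoreWhenEvicted) {
            base = persisted; // expected behaviour: fall back to the persistent store
        }
        Map<String, String> result = new HashMap<>();
        if (base != null) {
            result.putAll(base);
        }
        result.putAll(delta); // overlay only the changed fields
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> persisted = new HashMap<>();
        persisted.put("title", "report");
        persisted.put("owner", "nodeA");

        Map<String, String> delta = new HashMap<>();
        delta.put("owner", "nodeB");

        // Entry is evicted (inMemory == null) in both calls.
        System.out.println("without store lookup: " + applyDelta(null, persisted, delta, false));
        System.out.println("with store lookup:    " + applyDelta(null, persisted, delta, true));
    }
}
```

Without the store lookup the resulting entry contains only the fields carried by the delta, which corresponds to the "partial/corrupt information" in the last step above; with the lookup the unchanged fields survive.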
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 8 months
[JBoss JIRA] (ISPN-3063) Data Inconsistency when Recovery + syncCommitPhase=false
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3063?page=com.atlassian.jira.plugin.... ]
Mircea Markus commented on ISPN-3063:
-------------------------------------
Answering myself: when recovery is enabled, it's the tx completion notification that cleans up the recovery information from the cluster. As the commit is async, the transaction manager has no way of knowing whether the transaction failed during the commit, which is the situation in which it would use the recovery mechanism. So I don't see much sense in having syncCommitPhase=false && recovery enabled; hence I suggest taking your first solution and not allowing "syncCommitPhase=false && recovery enabled".
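A rough sketch of what rejecting that combination at configuration time could look like follows. This is not Infinispan's configuration API: the class {{TxConfigSketch}}, its fields and the exception message are hypothetical, and only the rule itself (recovery enabled requires a synchronous commit phase) comes from the comment above.

```java
// Hypothetical illustration only; this is not Infinispan's configuration builder.
final class TxConfigSketch {
    final boolean syncCommitPhase;
    final boolean recoveryEnabled;

    TxConfigSketch(boolean syncCommitPhase, boolean recoveryEnabled) {
        this.syncCommitPhase = syncCommitPhase;
        this.recoveryEnabled = recoveryEnabled;
    }

    /** Rejects syncCommitPhase=false && recovery enabled; every other combination passes. */
    void validate() {
        if (recoveryEnabled && !syncCommitPhase) {
            throw new IllegalStateException(
                "recovery cannot be enabled together with syncCommitPhase=false");
        }
    }

    public static void main(String[] args) {
        new TxConfigSketch(true, true).validate();   // accepted
        new TxConfigSketch(false, false).validate(); // accepted
        try {
            new TxConfigSketch(false, true).validate();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```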
> Data Inconsistency when Recovery + syncCommitPhase=false
> --------------------------------------------------------
>
> Key: ISPN-3063
> URL: https://issues.jboss.org/browse/ISPN-3063
> Project: Infinispan
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 5.2.5.Final, 5.3.0.Alpha1
> Reporter: Pedro Ruivo
> Assignee: Mircea Markus
> Labels: recovery, transaction
>
> With syncCommitPhase=false, the CommitCommand is sent asynchronously. The TransactionCoordinator immediately sends the TxCompletionNotificationCommand, which can be delivered before the CommitCommand. The CommitCommand then fails silently:
> {code}
> if (transaction == null) {
>    if (trace) log.tracef("Did not find a RemoteTransaction for %s", globalTx);
>    return invalidRemoteTxReturnValue();
> }
> {code}
> This bug affects 5.3 and 5.2.5. I've written a test case that catches it:
> 5.3 => https://github.com/pruivo/infinispan/blob/rec-async/core/src/test/java/or...
> 5.2 => https://github.com/pruivo/infinispan/blob/rec-async-5.2/core/src/test/jav...
> Note: this bug may happen if you use async communication (prepare in 1PC)
> Note2: this may be related to https://issues.jboss.org/browse/ISPN-2719
> Possible solutions:
> * do not allow the cache to be configured with syncCommitPhase=false && recovery enabled;
> * force syncCommitPhase=true when recovery is enabled;
> * send the CommitCommand and the TxCompletionNotificationCommand as Regular Messages (they will be delivered in FIFO order)
> Thanks to Diego Didona, who spotted this bug.
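The ordering problem in the description can be reproduced in a small single-threaded sketch. It is a simplification under stated assumptions: {{CommitOrderingSketch}}, its maps and its method names are hypothetical and do not mirror Infinispan's command or transaction table classes. The only behaviour taken from the report is that the completion notification removes the remote transaction and that a commit for an unknown transaction is ignored without an error.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; a stand-in for the remote node's transaction handling.
class CommitOrderingSketch {
    // Prepared but not yet committed remote transactions, keyed by global tx id.
    private final Map<String, String> remoteTxTable = new HashMap<>();
    // Committed data on this node.
    private final Map<String, String> data = new HashMap<>();

    void prepare(String gtx, String value) {
        remoteTxTable.put(gtx, value);
    }

    /** Returns false when no remote transaction is found; the commit is silently dropped. */
    boolean commit(String gtx) {
        String pending = remoteTxTable.remove(gtx);
        if (pending == null) {
            return false;
        }
        data.put("k", pending);
        return true;
    }

    /** Cleans up the transaction, as the completion notification does in the report. */
    void txCompletionNotification(String gtx) {
        remoteTxTable.remove(gtx);
    }

    /** Runs one transaction and reports whether its write reached the node's data. */
    static boolean run(boolean commitDeliveredFirst) {
        CommitOrderingSketch node = new CommitOrderingSketch();
        node.prepare("gtx-1", "v1");
        if (commitDeliveredFirst) {
            node.commit("gtx-1");
            node.txCompletionNotification("gtx-1");
        } else {
            node.txCompletionNotification("gtx-1");
            node.commit("gtx-1");
        }
        return node.data.containsKey("k");
    }

    public static void main(String[] args) {
        System.out.println("commit delivered first, write applied: " + run(true));
        System.out.println("notification delivered first, write applied: " + run(false));
    }
}
```

When the notification overtakes the commit, the write never lands on this node while the originator considers the transaction complete, which is the inconsistency the issue title refers to. Delivering both commands in FIFO order (the third proposed solution) corresponds to always taking the first branch of {{run}}.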
11 years, 8 months
[JBoss JIRA] (ISPN-3063) Data Inconsistency when Recovery + syncCommitPhase=false
by Pedro Ruivo (JIRA)
[ https://issues.jboss.org/browse/ISPN-3063?page=com.atlassian.jira.plugin.... ]
Pedro Ruivo commented on ISPN-3063:
-----------------------------------
Not really. I didn't test without recovery, but the logic is different:
{code}
// TransactionXaAdapter.forgetSuccessfullyCompletedTransaction()
if (recoveryEnabled) {
   recoveryManager.removeRecoveryInformationFromCluster(localTransaction.getRemoteLocksAcquired(), xid, false, gtx);
   txTable.removeLocalTransaction(localTransaction);
} else {
   releaseLocksForCompletedTransaction(localTransaction);
}
{code}
and releaseLocksForCompletedTransaction has the condition:
{code}
if (mayHaveRemoteLocks(localTransaction) && !isSecondPhaseAsync) {
   ...
}
{code}
11 years, 8 months
[JBoss JIRA] (ISPN-3020) FileCacheStoreTest intermittent failure
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-3020?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-3020:
-------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 5.3.0.Beta1
Resolution: Done
> FileCacheStoreTest intermittent failure
> ---------------------------------------
>
> Key: ISPN-3020
> URL: https://issues.jboss.org/browse/ISPN-3020
> Project: Infinispan
> Issue Type: Feature Request
> Components: Test Suite
> Affects Versions: 5.2.5.Final
> Reporter: Mircea Markus
> Assignee: Mircea Markus
> Fix For: 5.3.0.Beta1, 5.3.0.Final
>
>
> java.lang.AssertionError
> at org.infinispan.loaders.file.FileCacheStoreTest.testPurgeExpired(FileCacheStoreTest.java:94)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at java.lang.Thread.run(Thread.java:662)
11 years, 8 months
[JBoss JIRA] (ISPN-3063) Data Inconsistency when Recovery + syncCommitPhase=false
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3063?page=com.atlassian.jira.plugin.... ]
Mircea Markus edited comment on ISPN-3063 at 4/30/13 2:06 PM:
--------------------------------------------------------------
I wonder why this happens only when recovery is enabled, as even when recovery is disabled the logic is the same.
was (Author: mircea.markus):
I wonder why this happens only when recovery is enabled, as even with recovery disabled the logic is the same.
11 years, 8 months
[JBoss JIRA] (ISPN-3063) Data Inconsistency when Recovery + syncCommitPhase=false
by Pedro Ruivo (JIRA)
Pedro Ruivo created ISPN-3063:
---------------------------------
Summary: Data Inconsistency when Recovery + syncCommitPhase=false
Key: ISPN-3063
URL: https://issues.jboss.org/browse/ISPN-3063
Project: Infinispan
Issue Type: Bug
Components: Transactions
Affects Versions: 5.3.0.Alpha1, 5.2.5.Final
Reporter: Pedro Ruivo
Assignee: Mircea Markus
With syncCommitPhase=false, the CommitCommand is sent asynchronously. The TransactionCoordinator immediately sends the TxCompletionNotificationCommand, which can be delivered before the CommitCommand. The CommitCommand then fails silently:
{code}
if (transaction == null) {
   if (trace) log.tracef("Did not find a RemoteTransaction for %s", globalTx);
   return invalidRemoteTxReturnValue();
}
{code}
This bug affects 5.3 and 5.2.5. I've written a test case that catches it:
5.3 => https://github.com/pruivo/infinispan/blob/rec-async/core/src/test/java/or...
5.2 => https://github.com/pruivo/infinispan/blob/rec-async-5.2/core/src/test/jav...
Note: this bug may happen if you use async communication (prepare in 1PC)
Note2: this may be related to https://issues.jboss.org/browse/ISPN-2719
Possible solutions:
* do not allow the cache to be configured with syncCommitPhase=false && recovery enabled;
* force syncCommitPhase=true when recovery is enabled;
* send the CommitCommand and the TxCompletionNotificationCommand as Regular Messages (they will be delivered in FIFO order)
Thanks to Diego Didona, who spotted this bug.
11 years, 8 months