[JBoss JIRA] (ISPN-2510) PrepareCommands should fail on nodes where the cache is not running
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2510?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2510:
--------------------------------
Fix Version/s: 6.0.0.Final
(was: 5.3.0.Final)
> PrepareCommands should fail on nodes where the cache is not running
> -------------------------------------------------------------------
>
> Key: ISPN-2510
> URL: https://issues.jboss.org/browse/ISPN-2510
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache, RPC
> Affects Versions: 5.2.0.Beta3
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 6.0.0.Final
>
>
> When the user stops a cache without stopping the cache manager on that node, subsequent PrepareCommands sent to that node will return a {{SuccessfulResponse}}.
> If that node used to be the primary owner of the command's modified key, the originator will proceed with the transaction as if it had acquired a lock on that key. It is thus possible for multiple transactions to believe they hold the lock on the same key at the same time.
> On the other hand, in replicated caches it is quite possible that a cache is not running on all the cluster nodes and yet PrepareCommands are broadcast to everyone in parallel. So the solution should not involve sending exceptions (which have huge stack traces), and the originator should be able to ignore failure responses from nodes that were not targeted in the first place.
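> A minimal sketch of the guard and response handling this suggests, in plain Java with hypothetical names (PrepareDispatcher and CacheNotRunning are illustrative, not the actual Infinispan RPC layer):
> {noformat}
> import java.util.Set;
> import java.util.concurrent.Callable;
> import java.util.concurrent.ConcurrentHashMap;
>
> final class PrepareDispatcher {
>    /** Lightweight marker response; unlike an exception, it carries no stack trace. */
>    enum CacheNotRunning { INSTANCE }
>
>    private final Set<String> runningCaches = ConcurrentHashMap.newKeySet();
>
>    void cacheStarted(String name) { runningCaches.add(name); }
>    void cacheStopped(String name) { runningCaches.remove(name); }
>
>    /** Remote side: refuse to prepare on a stopped cache instead of reporting success. */
>    Object handlePrepare(String cacheName, Callable<Object> prepare) throws Exception {
>       if (!runningCaches.contains(cacheName)) {
>          return CacheNotRunning.INSTANCE;
>       }
>       return prepare.call();
>    }
>
>    /** Originator side: the marker is fatal only if it came from a targeted owner. */
>    boolean isFatal(Object response, boolean senderWasTargetedOwner) {
>       return response == CacheNotRunning.INSTANCE && senderWasTargetedOwner;
>    }
> }
> {noformat}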
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-3035) Members can re-appear by themselves in the consistent hash after leaving
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3035?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3035:
--------------------------------
Fix Version/s: 6.0.0.Final
(was: 5.3.0.Final)
> Members can re-appear by themselves in the consistent hash after leaving
> ------------------------------------------------------------------------
>
> Key: ISPN-3035
> URL: https://issues.jboss.org/browse/ISPN-3035
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.2.5.Final, 5.3.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 6.0.0.Final
>
> Attachments: dret.log, dret2.log
>
>
> Seen as an intermittent failure in DataRehashedEventTest:
> {noformat}
> 2013-04-23 14:07:45,459 DEBUG (testng-DataRehashedEventTest) [org.infinispan.manager.DefaultCacheManager] Stopping cache manager ISPN on NodeC-58711
> 2013-04-23 14:07:45,468 INFO (testng-DataRehashedEventTest) [org.infinispan.remoting.transport.jgroups.JGroupsTransport] ISPN000080: Disconnecting and closing JGroups Channel
> 2013-04-23 14:07:46,469 DEBUG (testng-DataRehashedEventTest) [org.jgroups.protocols.pbcast.GMS] NodeC-58711: sending LEAVE request to NodeA-28008
> 2013-04-23 14:07:46,489 DEBUG (Incoming-2,ISPN,NodeA-28008) [org.jgroups.protocols.pbcast.GMS] NodeA-28008: installing [NodeA-28008|4] [NodeA-28008, NodeB-46156, NodeC-58711]
> 2013-04-23 14:07:46,491 DEBUG (asyncTransportThread-0,NodeA) [org.infinispan.topology.ClusterTopologyManagerImpl] Starting cluster-wide rebalance for cache ___defaultcache, topology = CacheTopology{id=8, currentCH=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeA-28008, NodeB-46156]}, pendingCH=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeA-28008, NodeB-46156, NodeC-58711]}}
> 2013-04-23 14:07:49,493 ERROR (testng-DataRehashedEventTest) [org.infinispan.test.fwk.UnitTestTestNGListener] Test testJoinAndLeave(org.infinispan.statetransfer.DataRehashedEventTest) failed.
> java.lang.AssertionError: expected [2] but found [6]
> at org.testng.Assert.fail(Assert.java:94)
> at org.testng.Assert.failNotEquals(Assert.java:494)
> at org.testng.Assert.assertEquals(Assert.java:123)
> at org.testng.Assert.assertEquals(Assert.java:370)
> at org.testng.Assert.assertEquals(Assert.java:380)
> at org.infinispan.statetransfer.DataRehashedEventTest.testJoinAndLeave(DataRehashedEventTest.java:114)
> {noformat}
> The initial cluster has 3 nodes: A, B, C. C is killed, but somehow remains in the ClusterCacheStatus on the coordinator.
> Then C re-appears in the JGroups view (possibly a JGroups issue). The problem in Infinispan is that the coordinator now sees C as a joiner, and it rebalances the cache to include C in the consistent hash again.
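> A minimal sketch of the reconciliation this implies, assuming a hypothetical ClusterStatus class (the real logic lives in ClusterTopologyManagerImpl/ClusterCacheStatus, whose API differs):
> {noformat}
> import java.util.ArrayList;
> import java.util.List;
>
> final class ClusterStatus {
>    private final List<String> cacheMembers = new ArrayList<>();
>
>    /** Coordinator-side hook, called whenever JGroups installs a new view. */
>    synchronized void handleNewView(List<String> viewMembers) {
>       // Drop leavers first, so a node that left cannot linger in the cache
>       // status and later be treated as a member again.
>       cacheMembers.retainAll(viewMembers);
>       // Deliberately do NOT add new view members here: merely re-appearing
>       // in a view must not re-insert a node into the consistent hash.
>    }
>
>    /** Only an explicit join request may add a node and trigger a rebalance. */
>    synchronized void handleJoinRequest(String node) {
>       if (!cacheMembers.contains(node)) {
>          cacheMembers.add(node);
>          // start a rebalance for the new member here
>       }
>    }
> }
> {noformat}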
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-3192) Concurrent TreeCache.move() calls with the same destination lose data
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3192?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3192:
--------------------------------
Labels: (was: testsuite_stability)
> Concurrent TreeCache.move() calls with the same destination lose data
> ---------------------------------------------------------------------
>
> Key: ISPN-3192
> URL: https://issues.jboss.org/browse/ISPN-3192
> Project: Infinispan
> Issue Type: Bug
> Components: Tree API
> Affects Versions: 5.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 6.0.0.Final
>
>
> The move() method reads the contents of the destination node into the transaction/invocation context before locking it.
> If multiple transactions move nodes under the same destination in parallel, each one writes back the stale child list it read plus its own node, so some of the moved nodes might be lost. This sometimes happens in NodeMoveAPIPessimisticTest, causing random failures.
> Note that even if the move() method locks the destination node, it will still be possible for the user to read the destination node in the same transaction and cause data loss. The move() method documentation should warn about this.
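> A self-contained toy model of the lost update (plain Java, not the TreeCache internals), contrasting the reported snapshot-then-lock ordering with lock-then-read:
> {noformat}
> import java.util.HashMap;
> import java.util.HashSet;
> import java.util.Map;
> import java.util.Set;
> import java.util.concurrent.locks.ReentrantLock;
>
> final class ToyTree {
>    // Maps a parent path to its set of child names.
>    private final Map<String, Set<String>> children = new HashMap<>();
>    private final ReentrantLock destLock = new ReentrantLock();
>
>    // Broken ordering, analogous to the reported move(): snapshot first, lock later.
>    void moveBroken(String child, String dest) {
>       Set<String> snapshot = new HashSet<>(children.getOrDefault(dest, new HashSet<>()));
>       destLock.lock();
>       try {
>          snapshot.add(child);
>          children.put(dest, snapshot); // overwrites children added by concurrent movers
>       } finally {
>          destLock.unlock();
>       }
>    }
>
>    // Fixed ordering: lock the destination first, then read and modify it.
>    void moveFixed(String child, String dest) {
>       destLock.lock();
>       try {
>          children.computeIfAbsent(dest, k -> new HashSet<>()).add(child);
>       } finally {
>          destLock.unlock();
>       }
>    }
> }
> {noformat}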
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-3184) The DELTA_WRITE flag should force a remote get during state transfer
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3184?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3184:
--------------------------------
Fix Version/s: 6.0.0.Final
> The DELTA_WRITE flag should force a remote get during state transfer
> --------------------------------------------------------------------
>
> Key: ISPN-3184
> URL: https://issues.jboss.org/browse/ISPN-3184
> Project: Infinispan
> Issue Type: Bug
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 6.0.0.Final
>
>
> AtomicHashMap and FineGrainedAtomicHashMap, as well as custom DeltaAware implementations, use PutKeyValueCommands with the DELTA_WRITE flag to execute incremental updates. These commands need the previous value of the entry in order to work.
> If a node is joining and it receives a PutKeyValueCommand with the DELTA_WRITE flag before it has received the value of the affected key, it should do a remote get to retrieve the previous value and apply the change on top of that value, just like we do for conditional commands. Not doing so leads to data loss.
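> A toy sketch of the proposed behavior (illustrative names only; the real change belongs in the distribution interceptor's state-transfer handling):
> {noformat}
> import java.util.HashMap;
> import java.util.Map;
> import java.util.function.Function;
>
> final class JoiningNode {
>    private final Map<String, String> dataContainer = new HashMap<>();
>
>    /** True while state transfer has not yet delivered this key's value. */
>    private boolean stateNotYetReceived(String key) {
>       return !dataContainer.containsKey(key);
>    }
>
>    /** Stand-in for fetching the current value from an existing owner via RPC. */
>    private String remoteGet(String key) {
>       return "previous-value-of-" + key;
>    }
>
>    /** Apply a DELTA_WRITE-style update; the delta function needs the previous value. */
>    void applyDeltaWrite(String key, Function<String, String> delta) {
>       String prev = dataContainer.get(key);
>       if (prev == null && stateNotYetReceived(key)) {
>          // Without this remote get, the delta would be applied on top of
>          // null and the entry's pre-existing data would be silently lost.
>          prev = remoteGet(key);
>       }
>       dataContainer.put(key, delta.apply(prev));
>    }
> }
> {noformat}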
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-3192) Concurrent TreeCache.move() calls with the same destination lose data
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3192?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3192:
--------------------------------
Fix Version/s: 6.0.0.Final
(was: 5.3.0.Final)
> Concurrent TreeCache.move() calls with the same destination lose data
> ---------------------------------------------------------------------
>
> Key: ISPN-3192
> URL: https://issues.jboss.org/browse/ISPN-3192
> Project: Infinispan
> Issue Type: Bug
> Components: Tree API
> Affects Versions: 5.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 6.0.0.Final
>
>
> The move() method reads the contents of the destination node into the transaction/invocation context before locking it.
> If multiple transactions move nodes under the same destination in parallel, each one writes back the stale child list it read plus its own node, so some of the moved nodes might be lost. This sometimes happens in NodeMoveAPIPessimisticTest, causing random failures.
> Note that even if the move() method locks the destination node, it will still be possible for the user to read the destination node in the same transaction and cause data loss. The move() method documentation should warn about this.
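> To illustrate the caveat in the last paragraph, a sketch of the access pattern the documentation should warn about (the Fqn values and surrounding setup are illustrative):
> {noformat}
> import javax.transaction.TransactionManager;
> import org.infinispan.tree.Fqn;
> import org.infinispan.tree.TreeCache;
>
> final class MoveCaveat {
>    static void lostUpdate(TreeCache<String, String> cache, TransactionManager tm) throws Exception {
>       Fqn source = Fqn.fromString("/a/source");
>       Fqn dest = Fqn.fromString("/dest");
>       tm.begin();
>       cache.getNode(dest);      // destination's children are now cached in this tx
>       // ... a concurrent transaction moves another node under /dest and commits ...
>       cache.move(source, dest); // operates on the stale snapshot read above
>       tm.commit();              // may overwrite the concurrent move's change
>    }
> }
> {noformat}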
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-2913) putForExternalRead leaves locks
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2913?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2913:
--------------------------------
Assignee: Pedro Ruivo (was: Adrian Nistor)
> putForExternalRead leaves locks
> -------------------------------
>
> Key: ISPN-2913
> URL: https://issues.jboss.org/browse/ISPN-2913
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency
> Affects Versions: 5.2.1.Final
> Reporter: Sebastian Tusk
> Assignee: Pedro Ruivo
> Fix For: 6.0.0.Final
>
> Attachments: SebastianTusk_ISPN-2913.patch
>
>
> In TxDistributionInterceptor.remoteGetAndStoreInL1 locks are acquired. Without a transaction these locks are never released. The cache setup is Dist, Async, L1, 2 Nodes, 1 Owner, Optimistic Locking.
> In AbstractTxLockingInterceptor.visitGetKeyValueCommand locks are released explicitly when outside a transaction. I fixed this problem by doing the same in OptimisticLockingInterceptor.visitPutKeyValueCommand. It is very likely that this doesn't fix all problems; for instance, OptimisticLockingInterceptor.visitPutMapCommand or the PessimisticLockingInterceptor probably need the same treatment.
> Cache Config:
> {noformat}
> <namedCache name="entity">
>    <jmxStatistics enabled="true" />
>
>    <clustering mode="dist">
>       <stateTransfer fetchInMemoryState="false" timeout="20000" />
>       <async />
>       <l1 enabled="true" />
>       <hash numOwners="1" />
>    </clustering>
>
>    <locking isolationLevel="READ_COMMITTED"
>             lockAcquisitionTimeout="15000" useLockStriping="false" />
>
>    <eviction maxEntries="10000" strategy="LRU" />
>    <expiration maxIdle="100000" wakeUpInterval="5000" />
>    <storeAsBinary storeKeysAsBinary="true" storeValuesAsBinary="false" enabled="false" />
>
>    <transaction transactionMode="TRANSACTIONAL" autoCommit="false" lockingMode="OPTIMISTIC" />
> </namedCache>
> {noformat}
> Fixed OptimisticLockingInterceptor.visitPutKeyValueCommand:
> {noformat}
> @Override
> public Object visitPutKeyValueCommand(InvocationContext ctx, PutKeyValueCommand command) throws Throwable {
>    try {
>       if (command.isConditional()) markKeyAsRead(ctx, command);
>       return invokeNextInterceptor(ctx, command);
>    } catch (Throwable te) {
>       throw cleanLocksAndRethrow(ctx, te);
>    } finally {
>       // with putForExternalRead the value might be put into L1 without a transaction;
>       // we need to release any locks for these cases
>       if (!ctx.isInTxScope()) lockManager.unlockAll(ctx);
>    }
> }
> {noformat}
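> For reference, a minimal sketch of the access pattern that triggers the leak under this configuration (the configuration file name is illustrative):
> {noformat}
> import org.infinispan.Cache;
> import org.infinispan.manager.DefaultCacheManager;
>
> public class PferLeak {
>    public static void main(String[] args) throws Exception {
>       DefaultCacheManager cm = new DefaultCacheManager("dist-async-l1.xml");
>       Cache<String, String> entity = cm.getCache("entity");
>       // putForExternalRead runs outside any transaction; per the report, the
>       // remote get that stores the value in L1 acquires a lock that this
>       // non-transactional invocation never releases.
>       entity.putForExternalRead("key", "value");
>       cm.stop();
>    }
> }
> {noformat}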
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira