[JBoss JIRA] (ISPN-5021) Nodes that finish the rebalance later can see outdated values
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-5021?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-5021:
--------------------------------
Fix Version/s: 7.2.0.Beta1
(was: 7.2.0.Alpha1)
> Nodes that finish the rebalance later can see outdated values
> -------------------------------------------------------------
>
> Key: ISPN-5021
> URL: https://issues.jboss.org/browse/ISPN-5021
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.0.2.Final
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Priority: Critical
> Fix For: 7.2.0.Beta1
>
>
> Copied from [ISPN-4444|https://issues.jboss.org/browse/ISPN-4444?focusedCommentId=1302...]
> If the CH_UPDATE command is delayed on the old owner, the new owners might update the key without the old owner knowing, and a locality check on the old owner won't help.
> One thing that struck me when reading about the Raft algorithm was that it installs configuration changes symmetrically, in 3 phases. We might need to do the same for our rebalance: start the rebalance with read_ch=old, write_ch=old+new; once the new owners have all the data, install read_ch=new, write_ch=old+new; and finally install read_ch=new, write_ch=new. For this to work, old cache entries are removed during the 2nd topology update, and further writes should be ignored.
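The three-phase sequence proposed in the description can be sketched as a small enum. This is only an illustration under stated assumptions: `RebalancePhase`, `readOwners` and `writeOwners` are hypothetical names, not Infinispan classes or methods, and owners are represented as plain strings.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; none of these names are Infinispan APIs.
enum RebalancePhase {
    READ_OLD_WRITE_ALL,  // phase 1: read_ch=old, write_ch=old+new
    READ_NEW_WRITE_ALL,  // phase 2: read_ch=new, write_ch=old+new
    READ_NEW_WRITE_NEW;  // phase 3: read_ch=new, write_ch=new

    // Owners that may serve reads in this phase.
    Set<String> readOwners(List<String> oldOwners, List<String> newOwners) {
        return new LinkedHashSet<>(this == READ_OLD_WRITE_ALL ? oldOwners : newOwners);
    }

    // Owners that must apply writes in this phase.
    Set<String> writeOwners(List<String> oldOwners, List<String> newOwners) {
        Set<String> owners = new LinkedHashSet<>();
        if (this != READ_NEW_WRITE_NEW) {
            owners.addAll(oldOwners); // old owners keep receiving writes until the last phase
        }
        owners.addAll(newOwners);
        return owners;
    }
}
```

With old owners [A, B] and new owners [B, C], reads move from {A, B} to {B, C} at the second phase, while A only stops receiving writes at the third.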
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
9 years, 11 months
[JBoss JIRA] (ISPN-5019) After coordinator change, cache topologies should be installed in parallel
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-5019?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-5019:
--------------------------------
Fix Version/s: 7.2.0.Beta1
(was: 7.2.0.Alpha1)
> After coordinator change, cache topologies should be installed in parallel
> --------------------------------------------------------------------------
>
> Key: ISPN-5019
> URL: https://issues.jboss.org/browse/ISPN-5019
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.0.2.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.2.0.Beta1
>
>
> When the coordinator crashes, the new coordinator has to recover the cache topologies from all the nodes in the cluster and install updated topologies for all the caches. This is done on a single thread, and it can take a long time when there are a lot of caches.
> We should accelerate this by doing the topology installation on separate threads. However, we have to be careful with the async transport pool, because {{executeOnClusterAsync}} actually needs to spawn a new thread in the same pool.
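A minimal sketch of installing topologies for several caches in parallel follows. It assumes a dedicated executor that is separate from the async transport pool, which is one possible way to sidestep the concern above and not necessarily the approach taken in the fix. `ParallelTopologyInstaller` and the `installTopology` callback are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical illustration; not an Infinispan class.
final class ParallelTopologyInstaller {
    // Runs the per-cache install step on its own pool and waits for all caches,
    // so one slow cache does not serialize the others.
    static void installAll(List<String> cacheNames, Consumer<String> installTopology, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String name : cacheNames) {
                pending.add(pool.submit(() -> installTopology.accept(name)));
            }
            for (Future<?> f : pending) {
                f.get(); // surfaces the first failure, if any
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while installing topologies", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("topology installation failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the work runs on its own pool, an install step that itself needs a thread from another pool does not compete with the tasks that are waiting on it.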
[JBoss JIRA] (ISPN-5042) Remote gets caused by writes could be replicated only to the primary owner
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-5042?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-5042:
--------------------------------
Fix Version/s: 7.2.0.Beta1
(was: 7.2.0.Alpha1)
> Remote gets caused by writes could be replicated only to the primary owner
> --------------------------------------------------------------------------
>
> Key: ISPN-5042
> URL: https://issues.jboss.org/browse/ISPN-5042
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core, State Transfer
> Affects Versions: 7.1.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Minor
> Labels: 7.0
> Fix For: 7.2.0.Beta1
>
>
> For write operations that need the previous value, a write CH-only owner that doesn't have a key locally will attempt to retrieve the key from the read CH-owners.
> Sending the remote get command to all the previous owners will create extra load on the cluster during state transfer, so it should be more efficient to send the remote get only to the primary owner. Even though the latency of some write operations will be higher, the average latency should be better.
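The difference between the two strategies can be sketched as a target-selection helper. This is an illustration only: `RemoteGetTargets` is a hypothetical name, and the code assumes the first element of the owner list is the primary owner.

```java
import java.util.List;

// Hypothetical illustration; not an Infinispan class.
final class RemoteGetTargets {
    // Assumption: readOwners.get(0) is the primary owner in the read CH.
    static List<String> targets(List<String> readOwners, boolean primaryOnly) {
        if (readOwners.isEmpty()) {
            return List.of();
        }
        return primaryOnly ? List.of(readOwners.get(0)) : List.copyOf(readOwners);
    }
}
```

With three read owners, the primary-only variant sends one remote get instead of three, at the cost of waiting for that single node rather than for the fastest responder.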
[JBoss JIRA] (ISPN-5073) Improve "Number of Entries" stats
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-5073?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-5073:
--------------------------------
Fix Version/s: 7.2.0.Beta1
(was: 7.2.0.Alpha1)
> Improve "Number of Entries" stats
> ---------------------------------
>
> Key: ISPN-5073
> URL: https://issues.jboss.org/browse/ISPN-5073
> Project: Infinispan
> Issue Type: Task
> Components: Core, JMX, reporting and management
> Affects Versions: 7.1.0.Alpha1
> Reporter: Tristan Tarrant
> Assignee: Vladimir Blagojevic
> Fix For: 7.2.0.Beta1
>
>
> Currently the getNumberOfEntries method in CacheMgmtInterceptor returns the size of the data container, which doesn't take into account expired entries or cache stores.
> To avoid compatibility issues, modify the description to reflect its behaviour and add proper statistics, possibly with different flag combinations (SKIP_REMOTE, SKIP_CACHE_LOAD).
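The gap between the raw container size and a count that accounts for expiration and a store can be sketched as follows. This is a simplified model, not the CacheMgmtInterceptor code: `EntryStats` is a hypothetical name, in-memory entries are modelled as a map from key to absolute expiry time, and keys in the store are assumed to be live.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not an Infinispan class.
final class EntryStats {
    // What a plain size() reports: every in-memory entry, expired or not.
    static int rawSize(Map<String, Long> expiryByKey) {
        return expiryByKey.size();
    }

    // Counts in-memory entries that have not expired plus keys that live only in the store.
    // Assumption: every key in storeKeys is live.
    static int liveSize(Map<String, Long> expiryByKey, Set<String> storeKeys, long nowMillis) {
        Set<String> live = new HashSet<>(storeKeys);
        for (Map.Entry<String, Long> e : expiryByKey.entrySet()) {
            if (e.getValue() > nowMillis) {
                live.add(e.getKey());
            }
        }
        return live.size();
    }
}
```

The two numbers can differ in either direction: expired entries inflate the raw size, while entries present only in the store are missing from it.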
[JBoss JIRA] (ISPN-5046) PartitionHandling: split during commit can leave the cache inconsistent after merge
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-5046?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-5046:
--------------------------------
Fix Version/s: 7.2.0.Beta1
(was: 7.2.0.Alpha1)
> PartitionHandling: split during commit can leave the cache inconsistent after merge
> -----------------------------------------------------------------------------------
>
> Key: ISPN-5046
> URL: https://issues.jboss.org/browse/ISPN-5046
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.0.2.Final, 7.1.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.2.0.Beta1
>
>
> Say we have a cluster ABCD; a transaction T was started on A, with B as the primary owner and C the backup owner. B and C both acknowledge the prepare, and the network splits into AB and CD right before A sends the commit command. Eventually A suspects C and D, but the commit still succeeds on B before C and D are suspected. And SuspectExceptions are ignored for commit commands, so the user won't see any error.
> However, C will eventually suspect A and B. When the CD cache topology is installed, it will roll back transaction T. After the merge, both partitions are in degraded mode, so we assume that they both have the latest data and the key is never updated on C.
> From C's point of view, this is very similar to ISPN-3421. The fix should also be similar: we could delay the transaction rollback on C until we get a confirmation from B that T was not committed there. Since B is inaccessible, C will eventually get a SuspectException and the CD cache topology, at which point the cache is in degraded mode and C can wait for a merge. On merge, C should check the status of the transaction on B again, and either commit or roll back based on what B did.
> We also need to suspend the cleanup of completed transactions while the cache is in degraded mode, otherwise C might not find T on B after the merge.
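The decision C would make for the prepared transaction T, as proposed above, can be sketched as a small function. This is only an illustration of the proposal: `PreparedTxResolver`, `Status` and `Decision` are hypothetical names, not Infinispan types.

```java
// Hypothetical illustration; not an Infinispan class.
final class PreparedTxResolver {
    // What C learns about T from the primary owner B.
    enum Status { COMMITTED, NOT_COMMITTED, UNKNOWN }

    // What C does with its prepared copy of T.
    enum Decision { COMMIT, ROLLBACK, WAIT_FOR_MERGE }

    static Decision resolve(Status statusOnPrimary) {
        switch (statusOnPrimary) {
            case COMMITTED:
                return Decision.COMMIT;         // B applied T, so C must apply it too
            case NOT_COMMITTED:
                return Decision.ROLLBACK;       // B confirmed T was not committed
            default:
                return Decision.WAIT_FOR_MERGE; // B is unreachable: keep T and ask again after the merge
        }
    }
}
```

The UNKNOWN branch is why completed transactions must not be cleaned up on B while the cache is degraded: C can only resolve T if B still remembers its outcome.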
[JBoss JIRA] (ISPN-5076) Pessimistic transactions can lose their locks when the primary owner changes
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-5076?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-5076:
--------------------------------
Fix Version/s: 7.2.0.Beta1
(was: 7.2.0.Alpha1)
> Pessimistic transactions can lose their locks when the primary owner changes
> ----------------------------------------------------------------------------
>
> Key: ISPN-5076
> URL: https://issues.jboss.org/browse/ISPN-5076
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.0.2.Final, 7.1.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: 7.0
> Fix For: 7.2.0.Beta1
>
>
> In a pessimistic cache, if a transaction {{T1}} has a {{put(k, v)}} operation and the primary owner of the key is the originator, the lock is acquired on the originator but it is not replicated to the backup(s).
> If one of the backup owners becomes the primary owner, it will allow another transaction {{T2}} to lock (and update) key {{k}} before it receives the one-phase prepare command from the originator of {{T1}}.
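The scenario can be reproduced with a toy model that keeps one lock table per node. This is a simulation under stated assumptions, not Infinispan code: `LockLossDemo` and `Node` are hypothetical names, and replicating the lock to the backup is shown only as a contrast, not as the fix chosen for this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not an Infinispan class.
final class LockLossDemo {
    // One lock table per node: key -> owning transaction.
    static final class Node {
        final Map<String, String> locks = new HashMap<>();

        boolean tryLock(String key, String tx) {
            String owner = locks.putIfAbsent(key, tx);
            return owner == null || owner.equals(tx);
        }
    }

    // Returns true if T2 can lock k on the new primary while T1 still holds it on the old one.
    static boolean t2AcquiresAfterPrimaryChange(boolean replicateLockToBackup) {
        Node originator = new Node(); // originator of T1 and old primary owner of k
        Node backup = new Node();     // becomes primary owner after the topology change
        originator.tryLock("k", "T1");
        if (replicateLockToBackup) {
            backup.tryLock("k", "T1");
        }
        // The topology change makes the backup the primary; T2 now asks it for the lock.
        return backup.tryLock("k", "T2");
    }
}
```

Without replication the new primary has no record of T1's lock, so T2 acquires it; with the lock present on the backup, T2 is refused.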
[JBoss JIRA] (ISPN-5127) LocalEntryRetrieverWithStoreAsBinaryTest.testFilterWithStoreAsBinaryPartialKeys random failures
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-5127?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-5127:
--------------------------------
Fix Version/s: 7.2.0.Beta1
(was: 7.2.0.Alpha1)
> LocalEntryRetrieverWithStoreAsBinaryTest.testFilterWithStoreAsBinaryPartialKeys random failures
> -----------------------------------------------------------------------------------------------
>
> Key: ISPN-5127
> URL: https://issues.jboss.org/browse/ISPN-5127
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.1.0.Alpha1, 7.0.3.Final
> Reporter: Dan Berindei
> Assignee: William Burns
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 7.2.0.Beta1
>
>
> Sometimes the filtered retriever doesn't return any entries:
> {noformat}
> 15:16:26,328 ERROR (testng-LocalEntryRetrieverWithStoreAsBinaryTest:) [UnitTestTestNGListener] Test testFilterWithStoreAsBinaryPartialKeys(org.infinispan.iteration.LocalEntryRetrieverWithStoreAsBinaryTest) failed.
> java.util.NoSuchElementException
> at org.infinispan.iteration.impl.LocalEntryRetriever$Itr.next(LocalEntryRetriever.java:486)
> at org.infinispan.iteration.impl.LocalEntryRetriever$Itr.next(LocalEntryRetriever.java:428)
> at org.infinispan.iteration.LocalEntryRetrieverWithStoreAsBinaryTest.testFilterWithStoreAsBinaryPartialKeys(LocalEntryRetrieverWithStoreAsBinaryTest.java:93)
> {noformat}
> http://ci.infinispan.org/viewLog.html?buildId=14964
> The test should also use custom key/value types, as {{String}} keys/values are not marshalled when {{storeAsBinary}} is enabled (see {{MarshalledValue.isTypeExcluded()}}).
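A custom key type for such a test could look like the sketch below. `TestKey` is a hypothetical name, and the sketch assumes that a plain Serializable class is not among the types excluded from marshalling.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical key type; equals/hashCode matter because lookups compare keys by value.
final class TestKey implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String name;

    TestKey(String name) {
        this.name = Objects.requireNonNull(name, "name");
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof TestKey && ((TestKey) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    @Override
    public String toString() {
        return "TestKey{" + name + "}";
    }
}
```

A matching value type would follow the same pattern, so that both keys and values go through the marshalling path the test is meant to exercise.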