[JBoss JIRA] (ISPN-5883) Node can apply new topology after sending status response
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5883?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5883:
-------------------------------
Status: Pull Request Sent (was: Reopened)
Git Pull Request: https://github.com/infinispan/infinispan/pull/3782, https://github.com/infinispan/infinispan/pull/3877, https://github.com/infinispan/infinispan/pull/4321 (was: https://github.com/infinispan/infinispan/pull/3782, https://github.com/infinispan/infinispan/pull/3877)
> Node can apply new topology after sending status response
> ---------------------------------------------------------
>
> Key: ISPN-5883
> URL: https://issues.jboss.org/browse/ISPN-5883
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 8.0.1.Final, 7.2.5.Final, 8.1.0.Alpha2
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 8.2.0.Beta1, 8.1.4.Final
>
>
> {{LocalTopologyManagerImpl}} is responsible for sending the {{ClusterTopologyControlCommand(GET_STATUS)}} response, and when it sends the response it doesn't check the current view id against the new coordinator's view id. If the old coordinator already sent a topology update before the merge, that topology update might be processed after sending the status response. The new coordinator will send a topology update with a topology id of {{max(status response topology ids) + 1}}. The node will then process the topology update from the old coordinator, but it will ignore the topology update from the new coordinator with the same topology id.
> This is especially common in the partition handling tests, e.g. {{BasePessimisticTxPartitionAndMergeTest}} subclasses, because the test "injects" the JGroups view on each node serially, and the 4th node often sends the status response before it gets the new view.
> {noformat}
> 22:16:37,776 DEBUG (remote-thread-NodeD-p26-t6:[]) [LocalTopologyManagerImpl] Sending cluster status response for view 10
> // Topology from NodeC
> 22:16:37,778 DEBUG (transport-thread-NodeD-p28-t2:[]) [LocalTopologyManagerImpl] Updating local topology for cache pes-cache: CacheTopology{id=8, rebalanceId=3, currentCH=DefaultConsistentHash{ns=60, owners = (4)[NodeA-37631: 15+15, NodeB-47846: 15+15, NodeC-46467: 15+15, NodeD-30486: 15+15]}, pendingCH=null, unionCH=null, actualMembers=[NodeC-46467, NodeD-30486]}
> // Later, topology from NodeA
> 22:16:37,827 DEBUG (transport-thread-NodeD-p28-t1:[]) [LocalTopologyManagerImpl] Ignoring late consistent hash update for cache pes-cache, current topology is 8: CacheTopology{id=8, rebalanceId=3, currentCH=DefaultConsistentHash{ns=60, owners = (4)[NodeA-37631: 15+15, NodeB-47846: 15+15, NodeC-46467: 15+15, NodeD-30486: 15+15]}, pendingCH=null, unionCH=null, actualMembers=[NodeA-37631, NodeB-47846, NodeC-46467, NodeD-30486]}
> {noformat}
> As a solution, we can delay sending the status response until we have the same view as the coordinator (or a later one). We already check that the sender is the current coordinator before applying a topology update, so this will guarantee that we don't apply other topology updates from the old coordinator. Since the status request is only sent after the new view was installed, this will not introduce any delays in the vast majority of cases.
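The proposed fix can be illustrated with a minimal sketch. This is not the actual {{LocalTopologyManagerImpl}} code: the class {{ViewGate}} and its methods {{viewInstalled}} and {{awaitView}} are hypothetical names, and view ids are modelled as plain ints. The idea is only that the thread handling the status request blocks until the locally installed view id is at least the coordinator's view id, and that the wait returns immediately when the view is already installed.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: delays a status response until the local view catches up. */
final class ViewGate {
    private int currentViewId = -1;

    /** Called when a new JGroups view is installed on this node. */
    synchronized void viewInstalled(int viewId) {
        if (viewId > currentViewId) {
            currentViewId = viewId;
            notifyAll(); // wake up any status request waiting for this view
        }
    }

    /**
     * Blocks until the local view id is at least requiredViewId.
     * Returns false if the timeout expires or the thread is interrupted.
     */
    synchronized boolean awaitView(int requiredViewId, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (currentViewId < requiredViewId) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            try {
                TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    synchronized int currentViewId() {
        return currentViewId;
    }
}
```

In this sketch a status request handler would call {{awaitView(coordinatorViewId, timeout)}} before building the response. In the common case the view is already installed, so the loop body never runs and no delay is added.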
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 11 months
[JBoss JIRA] (ISPN-5883) Node can apply new topology after sending status response
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5883?page=com.atlassian.jira.plugin.... ]
Dan Berindei reopened ISPN-5883:
--------------------------------
The locking used in {{LocalTopologyManagerImpl}} to guarantee that topology updates from the old coordinator are not applied is incorrect: the view id is checked only *before* applying the topology, and there is no synchronization to prevent a status request from executing between the check and the update.
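One way to picture the missing synchronization is the sketch below. It is an assumption-laden simplification, not the code from the pull requests: {{TopologyGuard}}, {{handleStatusRequest}} and {{applyTopology}} are hypothetical names, and the sender check is reduced to comparing view ids. The point is that the view id check and the topology update happen under the same lock as the status response, so a status request cannot slip in between them.

```java
/** Hypothetical sketch: one lock covers the status response and the check-then-apply step. */
final class TopologyGuard {
    private final Object lock = new Object();
    private int viewId;      // latest view id this node has acknowledged
    private int topologyId;  // id of the last applied topology

    TopologyGuard(int initialViewId, int initialTopologyId) {
        this.viewId = initialViewId;
        this.topologyId = initialTopologyId;
    }

    /** Records the coordinator's view id and reports the current topology id atomically. */
    int handleStatusRequest(int coordinatorViewId) {
        synchronized (lock) {
            if (coordinatorViewId > viewId) {
                viewId = coordinatorViewId;
            }
            return topologyId;
        }
    }

    /** Applies the update only if the sender's view is still current; check and apply share the lock. */
    boolean applyTopology(int senderViewId, int newTopologyId) {
        synchronized (lock) {
            if (senderViewId < viewId || newTopologyId <= topologyId) {
                return false; // update from an old coordinator, or a late/duplicate topology
            }
            topologyId = newTopologyId;
            return true;
        }
    }

    int topologyId() {
        synchronized (lock) {
            return topologyId;
        }
    }
}
```

With the numbers from the log in the issue description (and assuming, for illustration only, that the old coordinator was still in view 9), a status request for view 10 reports topology 7, the old coordinator's topology 8 is then rejected, and the new coordinator's topology 8 is applied.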
[JBoss JIRA] (ISPN-5972) Number of entries not working correctly in cache statistics in mgmt console
by Martin Gencur (JIRA)
[ https://issues.jboss.org/browse/ISPN-5972?page=com.atlassian.jira.plugin.... ]
Martin Gencur commented on ISPN-5972:
-------------------------------------
Pedro/Vladimir, is the "# Entries" statistic supposed to give the number of entries in the cache plus the cache stores? I believe it currently shows only those in the cache itself. Is there any option in the console to get the total number (cache + cache stores)?
> Number of entries not working correctly in cache statistics in mgmt console
> ---------------------------------------------------------------------------
>
> Key: ISPN-5972
> URL: https://issues.jboss.org/browse/ISPN-5972
> Project: Infinispan
> Issue Type: Bug
> Components: Console
> Affects Versions: 8.1.0.Beta1
> Reporter: Jiří Holuša
> Assignee: Pedro Ruivo
> Fix For: 9.0.0.Alpha2
>
>
> Page: Caches -> select cache container -> select cache.
> Configuration of the cache: replicated cache, 2 nodes in the domain
> In the "Cache content" tab, there is a field "# Entries" which should probably show the number of entries in the cache. When I put 100 entries in the cache, this field shows 200. Given that it's a replicated cache, I think it shows numberOfNodes*numberOfEntries, because when I put 100 entries with 3 nodes in the domain, "# Entries" shows 300.
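The arithmetic in the report can be reproduced with a small sketch. It does not use the Infinispan or console API: {{EntryCounts}} and its methods are hypothetical, and each node is modelled as a plain set of keys. In a replicated cache every node holds every entry, so summing the per-node sizes gives numberOfNodes*numberOfEntries, while counting distinct keys gives the number the reporter expected.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: per-node sum versus distinct entries in a replicated cache. */
final class EntryCounts {

    /** Builds a model of a replicated cache: every node holds all entries. */
    static List<Set<String>> replicated(int nodeCount, int entryCount) {
        List<Set<String>> nodes = new ArrayList<>();
        for (int n = 0; n < nodeCount; n++) {
            Set<String> keys = new HashSet<>();
            for (int i = 0; i < entryCount; i++) {
                keys.add("key-" + i);
            }
            nodes.add(keys);
        }
        return nodes;
    }

    /** Adds up the local sizes, which counts each replicated entry once per node. */
    static int sumOfNodeSizes(List<Set<String>> nodes) {
        int total = 0;
        for (Set<String> node : nodes) {
            total += node.size();
        }
        return total;
    }

    /** Counts each key once, regardless of how many nodes hold it. */
    static int distinctEntries(List<Set<String>> nodes) {
        Set<String> all = new HashSet<>();
        for (Set<String> node : nodes) {
            all.addAll(node);
        }
        return all.size();
    }
}
```

For 100 entries, the summed value is 200 on 2 nodes and 300 on 3 nodes, matching the values seen in the console, while the distinct count stays at 100.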
[JBoss JIRA] (ISPN-6577) Indexing properties should be stored in a distinct subresource to ease configuration inheritance
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-6577?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-6577:
--------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/4304
> Indexing properties should be stored in a distinct subresource to ease configuration inheritance
> ------------------------------------------------------------------------------------------------
>
> Key: ISPN-6577
> URL: https://issues.jboss.org/browse/ISPN-6577
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 8.2.1.Final, 9.0.0.Alpha1
> Reporter: Gustavo Fernandes
> Assignee: Tristan Tarrant
> Priority: Critical
> Fix For: 9.0.0.Alpha2
>
>
> Given the following config:
> {code:xml}
> <replicated-cache-configuration name="indexed-cache" mode="SYNC" start="EAGER" remote-timeout="20000"/>
> <replicated-cache name="booksCache" configuration="indexed-cache">
>     <indexing index="LOCAL">
>         <property name="default.metadata_cachename">indexMetadataBooksCache</property>
>         <property name="default.data_cachename">indexDataBooksCache</property>
>         <property name="default.locking_cachename">indexLockingBooksCache</property>
>         <property name="default.directory_provider">infinispan</property>
>         <property name="default.indexmanager">org.infinispan.query.indexmanager.InfinispanIndexManager</property>
>         <property name="lucene_version">LUCENE_CURRENT</property>
>     </indexing>
> </replicated-cache>
> {code}
> The booksCache starts in non-indexed mode, even though it explicitly configures indexing.
[JBoss JIRA] (ISPN-6577) Indexing properties should be stored in a distinct subresource to ease configuration inheritance
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-6577?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-6577:
--------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
Integrated in master. Thanks [~NadirX]!
[JBoss JIRA] (ISPN-6577) Indexing properties should be stored in a distinct subresource to ease configuration inheritance
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-6577?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-6577:
--------------------------------
Status: Open (was: New)