[JBoss JIRA] (ISPN-11717) Deprecate ConsistentHashFactory customization
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11717?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11717:
-----------------------------------
Sprint: DataGrid Sprint #43, DataGrid Sprint #44, DataGrid Sprint #45 (was: DataGrid Sprint #43, DataGrid Sprint #44)
> Deprecate ConsistentHashFactory customization
> ---------------------------------------------
>
> Key: ISPN-11717
> URL: https://issues.redhat.com/browse/ISPN-11717
> Project: Infinispan
> Issue Type: Task
> Components: Configuration, Core
> Affects Versions: 11.0.0.Dev04
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 11.0.0.Final
>
>
> There is no good reason to use a {{ConsistentHashFactory}} implementation other than the default selected by {{StateTransferManagerImpl#pickConsistentHashFactory}}.
> The configuration attribute made sense when the default was {{DefaultConsistentHashFactory}}, but {{SyncConsistentHashFactory}} is much better nowadays, and with the current API it is practically impossible to write an implementation that works in more than one cache mode.
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 7 months
[JBoss JIRA] (ISPN-5938) ClusterListenerReplInitialStateTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent fails randomly
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-5938?page=com.atlassian.jira.plugin... ]
Tristan Tarrant updated ISPN-5938:
----------------------------------
Fix Version/s: 11.0.0.Final
(was: 11.0.0.CR1)
> ClusterListenerReplInitialStateTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent fails randomly
> -------------------------------------------------------------------------------------------------
>
> Key: ISPN-5938
> URL: https://issues.redhat.com/browse/ISPN-5938
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite
> Affects Versions: 11.0.0.Alpha1
> Reporter: Roman Macor
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 11.0.0.Final
>
>
> ClusterListenerReplInitialStateTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent fails randomly with:
> Stacktrace
> java.util.concurrent.TimeoutException
> at java.util.concurrent.FutureTask.get(FutureTask.java:205)
> at org.infinispan.notifications.cachelistener.cluster.ClusterListenerReplTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent(ClusterListenerReplTest.java:123)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
[JBoss JIRA] (ISPN-11519) Cache should not start if it cluster listener replication fails
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11519?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11519:
-----------------------------------
Fix Version/s: 11.0.0.Final
(was: 11.0.0.CR1)
> Cache should not start if it cluster listener replication fails
> ---------------------------------------------------------------
>
> Key: ISPN-11519
> URL: https://issues.redhat.com/browse/ISPN-11519
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 10.1.5.Final, 11.0.0.Dev03
> Reporter: Dan Berindei
> Assignee: Will Burns
> Priority: Major
> Fix For: 11.0.0.Final
>
>
> {{StateConsumerImpl.fetchClusterListeners}} catches any exceptions during the fetch and local installation of cluster listeners from other nodes, and only logs a warning message:
> {noformat}
> 18:04:14,069 WARN (jgroups-5,Test-NodeD:[]) [StateConsumerImpl] ISPN000284: Problem encountered while installing cluster listener
> {noformat}
> If a cache starts without installing all the cluster listeners locally, the listeners will miss events for keys that end up with the joiner as the primary owner, which would be pretty hard to debug. We should instead fail fast and prevent the cache from starting if the cluster listeners cannot be fetched and installed locally.
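The difference between the two behaviours can be sketched in plain Java. This is a minimal illustration, not Infinispan code: the class {{ListenerInstallSketch}}, the exception {{CacheStartException}}, and both method names are hypothetical, and the fetch step is modelled as a simple {{Callable}}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical sketch: none of these names are Infinispan APIs.
final class ListenerInstallSketch {

    /** Hypothetical exception used to abort startup when listeners cannot be installed. */
    static final class CacheStartException extends RuntimeException {
        CacheStartException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Behaviour described in the issue: the failure is swallowed and only a warning is recorded.
    static boolean startLenient(Callable<List<String>> fetchListeners, List<String> warnings) {
        try {
            fetchListeners.call();
        } catch (Exception e) {
            warnings.add("Problem encountered while installing cluster listener: " + e.getMessage());
        }
        // The cache is reported as started even though listeners may be missing.
        return true;
    }

    // Proposed behaviour: the failure propagates, so the cache never starts half-configured.
    static void startFailFast(Callable<List<String>> fetchListeners) {
        try {
            fetchListeners.call();
        } catch (Exception e) {
            throw new CacheStartException("Cluster listeners could not be fetched and installed", e);
        }
    }

    public static void main(String[] args) {
        Callable<List<String>> failing = () -> {
            throw new IllegalStateException("remote node unreachable");
        };
        List<String> warnings = new ArrayList<>();
        System.out.println("lenient start succeeded: " + startLenient(failing, warnings));
        System.out.println("warnings recorded: " + warnings.size());
        try {
            startFailFast(failing);
        } catch (CacheStartException e) {
            System.out.println("fail-fast start aborted: " + e.getMessage());
        }
    }
}
```

With the lenient variant the only trace of the problem is the warning, while the fail-fast variant surfaces it to whoever starts the cache.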
[JBoss JIRA] (ISPN-11605) Remove TopologyUpdateStableCommand
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11605?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11605:
-----------------------------------
Fix Version/s: 11.0.0.Final
(was: 11.0.0.CR1)
> Remove TopologyUpdateStableCommand
> ----------------------------------
>
> Key: ISPN-11605
> URL: https://issues.redhat.com/browse/ISPN-11605
> Project: Infinispan
> Issue Type: Task
> Components: Core
> Affects Versions: 11.0.0.Dev03
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 11.0.0.Final
>
>
> Having a separate command for stable topologies means nodes may receive a topology update that should be stable without also receiving the confirmation that the topology is stable, complicating cluster recovery.
> As an example, when the coordinator node shuts down, it first installs a new cache topology without itself for all of the running caches.
> If the number of remaining owners is {{< numOwners}}, no rebalance is needed, and the old coordinator immediately sends a stable topology update as well.
> But any (stable) topology updates from the old coordinator not yet processed by the time of the {{CacheStatusRequestCommand}} from the new coordinator will be ignored.
> If the cluster had only 2 nodes, as in {{StatefulSetRollingUpgradeTest}} and {{StatefulSetRollingUpgradeIT}}, the stable topology update is vital to keep the cache in {{AVAILABLE}} mode; without it, the cache goes into {{DEGRADED}} mode and no new nodes can join.
> {noformat}
> 19:48:14,671 TRACE (jgroups-4,Test-NodeE:[]) [JGroupsTransport] Test-NodeE received command from Test-NodeD: ConsistentHashUpdateCommand{cacheName='testCache', origin=null, currentCH=DefaultConsistentHash{ns=256, owners = (1)[Test-NodeE: 256+0]}, pendingCH=null, phase=NO_REBALANCE, actualMembers=[Test-NodeE], persistentUUIDs=[72b17309-62d1-4928-abf6-88a8606ef342], availabilityMode=AVAILABLE, rebalanceId=5, topologyId=27, viewId=7}
> 19:48:14,672 TRACE (jgroups-5,Test-NodeE:[]) [JGroupsTransport] Test-NodeE received command from Test-NodeD: StableTopologyUpdateCommand{cacheName='testCache', origin=null, currentCH=DefaultConsistentHash{ns=256, owners = (1)[Test-NodeE: 256+0]}, pendingCH=null, actualMembers=[Test-NodeE], persistentUUIDs=[72b17309-62d1-4928-abf6-88a8606ef342], rebalanceId=5, topologyId=27, viewId=7}
> 19:48:14,675 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[]) [LocalTopologyManagerImpl] Acquired cache status testCache
> 19:48:14,675 DEBUG (non-blocking-thread-Test-NodeE-p9541-t5:[]) [LocalTopologyManagerImpl] Updating local topology for cache testCache: CacheTopology{id=27, phase=NO_REBALANCE, rebalanceId=5, currentCH=DefaultConsistentHash{ns=256, owners = (1)[Test-NodeE: 256+0]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeE], persistentUUIDs=[72b17309-62d1-4928-abf6-88a8606ef342]}
> 19:48:14,686 INFO (jgroups-5,Test-NodeE:[]) [CLUSTER] ISPN000094: Received new cluster view for channel org.infinispan.statetransfer.Test: [Test-NodeE|8] (1) [Test-NodeE]
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[]) [LocalTopologyManagerImpl] Released cache status testCache
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[Merge-8]) [PreferConsistencyStrategy] Max stable partition topology: CacheTopology{id=25, phase=NO_REBALANCE, rebalanceId=5, currentCH=DefaultConsistentHash{ns=256, owners = (2)[Test-NodeD: 129+127, Test-NodeE: 127+129]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeD, Test-NodeE], persistentUUIDs=[099401f9-c0b6-460d-8dc8-5051699e8287, 72b17309-62d1-4928-abf6-88a8606ef342]}
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[Merge-8]) [PreferConsistencyStrategy] Max active partition topology: CacheTopology{id=27, phase=NO_REBALANCE, rebalanceId=5, currentCH=DefaultConsistentHash{ns=256, owners = (1)[Test-NodeE: 256+0]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeE], persistentUUIDs=[72b17309-62d1-4928-abf6-88a8606ef342]}
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t3:[]) [LocalTopologyManagerImpl] Acquired cache status testCache
> 19:48:14,690 DEBUG (non-blocking-thread-Test-NodeE-p9541-t3:[]) [LocalTopologyManagerImpl] Ignoring topology 27 for cache testCache from old coordinator Test-NodeD
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t3:[]) [LocalTopologyManagerImpl] Released cache status testCache
> 19:48:14,704 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[Merge-8]) [ClusterCacheStatus] Cache testCache availability changed: AVAILABLE -> DEGRADED_MODE
> 19:48:25,194 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.statetransfer.StatefulSetRollingUpgradeTest.testStateTransferRestart[nodes=2]
> org.infinispan.commons.CacheException: Initial state transfer timed out for cache testCache on Test-NodeF
> at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:240) ~[classes/:?]
> at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:1049) ~[classes/:?]
> at org.infinispan.cache.impl.AbstractDelegatingCache.start(AbstractDelegatingCache.java:513) ~[classes/:?]
> at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:692) ~[classes/:?]
> at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:631) ~[classes/:?]
> at org.infinispan.manager.DefaultCacheManager.internalGetCache(DefaultCacheManager.java:516) ~[classes/:?]
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:497) ~[classes/:?]
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:490) ~[classes/:?]
> at org.infinispan.test.MultipleCacheManagersTest.getCaches(MultipleCacheManagersTest.java:322) ~[test-classes/:?]
> at org.infinispan.test.MultipleCacheManagersTest.waitForClusterToForm(MultipleCacheManagersTest.java:329) ~[test-classes/:?]
> at org.infinispan.statetransfer.StatefulSetRollingUpgradeTest.testStateTransferRestart(StatefulSetRollingUpgradeTest.java:78) ~[test-classes/:?]
> {noformat}
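The race described above can be modelled with a small state machine. This is only a sketch under stated assumptions: {{StableTopologySketch}} and its methods are hypothetical, the topology ids 25 and 27 are taken from the log, and the combined update with a {{stable}} flag is just one way a single command could carry both pieces of information. The issue itself does not say what shape the fix takes.

```java
// Hypothetical model of a node's view of the cache topology; not Infinispan code.
final class StableTopologySketch {
    int currentTopologyId;
    int stableTopologyId;
    // Once a new coordinator has asked for the cache status, updates from the
    // old coordinator are ignored (as in the "Ignoring topology 27" log line).
    boolean acceptingFromOldCoordinator = true;

    StableTopologySketch(int initialTopologyId) {
        this.currentTopologyId = initialTopologyId;
        this.stableTopologyId = initialTopologyId;
    }

    // Separate commands: the topology and its stability arrive independently.
    void onTopologyUpdate(int topologyId) {
        if (acceptingFromOldCoordinator) {
            currentTopologyId = topologyId;
        }
    }

    void onStableTopologyUpdate(int topologyId) {
        if (acceptingFromOldCoordinator) {
            stableTopologyId = topologyId;
        }
    }

    // Single command (assumption): the stability flag travels with the topology itself.
    void onCombinedUpdate(int topologyId, boolean stable) {
        if (acceptingFromOldCoordinator) {
            currentTopologyId = topologyId;
            if (stable) {
                stableTopologyId = topologyId;
            }
        }
    }

    void onNewCoordinator() {
        acceptingFromOldCoordinator = false;
    }

    public static void main(String[] args) {
        // Separate commands: the stable update loses the race with the new coordinator.
        StableTopologySketch separate = new StableTopologySketch(25);
        separate.onTopologyUpdate(27);
        separate.onNewCoordinator();
        separate.onStableTopologyUpdate(27);
        System.out.println("separate: current=" + separate.currentTopologyId
                + ", stable=" + separate.stableTopologyId);

        // Combined command: either both values are applied or neither is.
        StableTopologySketch combined = new StableTopologySketch(25);
        combined.onCombinedUpdate(27, true);
        combined.onNewCoordinator();
        System.out.println("combined: current=" + combined.currentTopologyId
                + ", stable=" + combined.stableTopologyId);
    }
}
```

In the separate-command run the node ends up with an active topology newer than its stable topology, which mirrors the "Max stable partition topology" and "Max active partition topology" lines in the log.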
[JBoss JIRA] (ISPN-11248) Add test for ServerTask that utilizes protostream stored entries
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11248?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11248:
-----------------------------------
Fix Version/s: 11.0.0.Final
(was: 11.0.0.CR1)
> Add test for ServerTask that utilizes protostream stored entries
> ----------------------------------------------------------------
>
> Key: ISPN-11248
> URL: https://issues.redhat.com/browse/ISPN-11248
> Project: Infinispan
> Issue Type: Task
> Reporter: Will Burns
> Priority: Major
> Fix For: 11.0.0.Final
>
>
> We currently have a single ServerTask that just prints out "Hello <name>". We should add a new test that covers a very likely use case: storing entries in protostream and running a task that deserializes them properly (this could be done automatically by the encoding layer of the cache, if it is working).
> We may also want to look into sharing this with a custom loader, as we have users doing this today and it is quite clunky. That would let us see how the usability is in 11.0.
[JBoss JIRA] (ISPN-11567) Scattered caches with a single node expire entries immediately
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11567?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11567:
-----------------------------------
Fix Version/s: 11.0.0.Final
(was: 11.0.0.CR1)
> Scattered caches with a single node expire entries immediately
> --------------------------------------------------------------
>
> Key: ISPN-11567
> URL: https://issues.redhat.com/browse/ISPN-11567
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 10.1.5.Final, 11.0.0.Dev03
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 11.0.0.Final
>
>
> {{ClusterExpirationManager.checkExpiredMaxIdle()}} sends a {{TouchCommand}} to the other owners and expires the entry locally if the touch was unsuccessful.
> {{ScatteredTouchResponseCollector}} is stateless, and reports that the entry has not been touched if it doesn't receive any {{true}} response. However, this is only correct if at least one backup entry exists on another node, and that is not always the case: e.g. between the backup node leaving the cluster and another node becoming a backup, or when the cluster has a single node.
> Since {{ClusterExpirationManager.checkExpiredMaxIdle()}} is called on every read, before the entry is expired on the local node, a transient entry in a scattered cache with a single node will expire on the first read, immediately after being inserted.
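The single-node case can be reproduced with a few lines of plain Java. This is a sketch, not the real collector: {{TouchCollectorSketch}} and its methods are hypothetical, backup responses are modelled as a list of booleans, and treating an empty response list as "touched" is only one possible correction, not necessarily the fix applied in Infinispan.

```java
import java.util.List;

// Hypothetical model of combining touch responses from backup nodes; not Infinispan code.
final class TouchCollectorSketch {

    // Behaviour described in the issue: "touched" only if some response is true.
    static boolean touchedStateless(List<Boolean> backupResponses) {
        return backupResponses.contains(Boolean.TRUE);
    }

    // One possible correction (assumption): with no backups there is no response
    // that could contradict the local touch, so the entry counts as touched.
    static boolean touchedWithEmptyCheck(List<Boolean> backupResponses) {
        return backupResponses.isEmpty() || backupResponses.contains(Boolean.TRUE);
    }

    // The caller expires the entry locally when the touch was unsuccessful.
    static boolean expiresOnRead(boolean touched) {
        return !touched;
    }

    public static void main(String[] args) {
        List<Boolean> singleNode = List.of(); // no other node holds a backup
        System.out.println("stateless collector expires entry: "
                + expiresOnRead(touchedStateless(singleNode)));
        System.out.println("collector with empty check expires entry: "
                + expiresOnRead(touchedWithEmptyCheck(singleNode)));
    }
}
```

Both variants agree whenever at least one backup responds. They differ only when the response list is empty, which is exactly the single-node and backup-in-transition cases named in the description.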