[JBoss JIRA] (ISPN-11304) Allow scaling up without state transfer
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11304?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11304:
-----------------------------------
Fix Version/s: 11.0.0.CR1
(was: 11.0.0.Dev05)
> Allow scaling up without state transfer
> ---------------------------------------
>
> Key: ISPN-11304
> URL: https://issues.redhat.com/browse/ISPN-11304
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Affects Versions: 10.1.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 11.0.0.CR1
>
>
> We should allow a cache to scale up without performing any state transfer, but without losing the data.
> To simplify things, the initial version will support a single owner, and will assume that only one node is being added at a time.
> The cache must be accessible remotely, but since information about the location of the keys is not accessible from the client, the client is expected to ignore ownership and use a round-robin access strategy.
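The round-robin access strategy mentioned above can be illustrated with a minimal sketch. {{RoundRobinSelector}} is a hypothetical helper written for this illustration, not part of the Infinispan or Hot Rod client API, and the server names are placeholders; it only shows a client rotating through the known servers without consulting key ownership.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not an Infinispan class.
final class RoundRobinSelector {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSelector(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = List.copyOf(servers);
    }

    // Returns the next server in rotation, ignoring which node owns the key.
    String nextServer() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```

With this approach every request may land on a node that does not own the key, so the server side has to locate the data; the sketch says nothing about how that lookup would work.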
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 7 months
[JBoss JIRA] (ISPN-11381) AdvancedCache.getGroup(key) may return incomplete results during state transfer
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11381?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11381:
-----------------------------------
Fix Version/s: 11.0.0.CR1
(was: 11.0.0.Dev05)
> AdvancedCache.getGroup(key) may return incomplete results during state transfer
> -------------------------------------------------------------------------------
>
> Key: ISPN-11381
> URL: https://issues.redhat.com/browse/ISPN-11381
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.4.18.Final, 10.1.2.Final, 11.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 11.0.0.CR1
>
>
> {{AdvancedCache.getGroup(groupKey)}} returns all the keys that belong to a group.
> If the originator is not an owner, the command is forwarded to the primary owner. If the originator is a primary or a backup owner for the group key, the command is executed locally.
> During state transfer, a node may be a write owner for the group key but not a read owner. Currently {{GroupingInterceptor}} uses {{LocalizedCacheTopology.isWriteOwner(groupKey)}} instead of {{isReadOwner(groupKey)}}, so it executes the command on the originator instead of forwarding it to the primary owner.
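The difference between the two ownership checks can be shown with a toy model. {{ToyKeyOwnership}} and its node names are assumptions made for this sketch; it is not Infinispan's {{LocalizedCacheTopology}}, and it models a single key only. It shows why a read should be routed by read ownership: a joiner that is already a write owner may not have received the data yet.

```java
import java.util.Set;

// Toy model of one key's owners during state transfer; hypothetical, for illustration only.
final class ToyKeyOwnership {
    private final Set<String> readOwners;
    private final Set<String> writeOwners;

    ToyKeyOwnership(Set<String> readOwners, Set<String> writeOwners) {
        this.readOwners = Set.copyOf(readOwners);
        this.writeOwners = Set.copyOf(writeOwners);
    }

    boolean isReadOwner(String node) {
        return readOwners.contains(node);
    }

    boolean isWriteOwner(String node) {
        return writeOwners.contains(node);
    }

    // Picks the node that should execute a read-only command issued on the originator:
    // locally if the originator is a read owner, otherwise on the primary owner.
    String executeReadOn(String originator, String primaryOwner) {
        return isReadOwner(originator) ? originator : primaryOwner;
    }
}
```

For example, with read owners A and B and write owners A, B and C, a read issued on the joiner C is forwarded to the primary owner, whereas a check based on write ownership would have run it on C.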
[JBoss JIRA] (ISPN-11248) Add test for ServerTask that utilizes protostream stored entries
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11248?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11248:
-----------------------------------
Fix Version/s: 11.0.0.CR1
(was: 11.0.0.Dev05)
> Add test for ServerTask that utilizes protostream stored entries
> ----------------------------------------------------------------
>
> Key: ISPN-11248
> URL: https://issues.redhat.com/browse/ISPN-11248
> Project: Infinispan
> Issue Type: Task
> Reporter: Will Burns
> Priority: Major
> Fix For: 11.0.0.CR1
>
>
> We currently have a single ServerTask that just prints out "Hello <name>". We should add a new test that actually tests a very likely use case of storing entries in protostream and using a task that deserializes them properly (this could be done automatically by the encoding layer of the cache, if it is working).
> We may also want to look into sharing this with a custom loader, as we have users doing this today. It is currently quite clunky, so this would also show how usable it is in 11.0.
[JBoss JIRA] (ISPN-11519) Cache should not start if its cluster listener replication fails
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11519?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11519:
-----------------------------------
Fix Version/s: 11.0.0.CR1
(was: 11.0.0.Dev05)
> Cache should not start if its cluster listener replication fails
> ----------------------------------------------------------------
>
> Key: ISPN-11519
> URL: https://issues.redhat.com/browse/ISPN-11519
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 10.1.5.Final, 11.0.0.Dev03
> Reporter: Dan Berindei
> Assignee: Will Burns
> Priority: Major
> Fix For: 11.0.0.CR1
>
>
> {{StateConsumerImpl.fetchClusterListeners}} catches any exceptions during the fetch and local installation of cluster listeners from other nodes, and only logs a warning message:
> {noformat}
> 18:04:14,069 WARN (jgroups-5,Test-NodeD:[]) [StateConsumerImpl] ISPN000284: Problem encountered while installing cluster listener
> {noformat}
> If a cache starts without installing all the cluster listeners locally, the listeners will miss events for keys that end up with the joiner as the primary owner, which would be pretty hard to debug. We should instead fail fast, and prevent the cache from starting if the cluster listeners cannot be fetched and installed locally.
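The contrast between logging a warning and failing fast can be sketched as follows. {{ListenerInstallSketch}} and both method names are hypothetical and do not reflect the actual code of {{StateConsumerImpl}}; each {{Callable}} stands in for installing one cluster listener.

```java
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical sketch; each Callable stands in for installing one cluster listener.
final class ListenerInstallSketch {

    // Log-and-continue: failures are swallowed, so startup proceeds with listeners missing.
    static int installLenient(List<Callable<Void>> installers) {
        int installed = 0;
        for (Callable<Void> installer : installers) {
            try {
                installer.call();
                installed++;
            } catch (Exception e) {
                System.err.println("WARN: problem encountered while installing cluster listener: " + e);
            }
        }
        return installed;
    }

    // Fail-fast: the first failure aborts startup and reaches the caller with its cause.
    static void installStrict(List<Callable<Void>> installers) {
        for (Callable<Void> installer : installers) {
            try {
                installer.call();
            } catch (Exception e) {
                throw new IllegalStateException("Cannot start cache: cluster listener installation failed", e);
            }
        }
    }
}
```

In the lenient variant the caller only learns how many listeners were installed, if it checks at all; in the strict variant a partially installed set can never be mistaken for a successful start.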
[JBoss JIRA] (ISPN-11605) Remove TopologyUpdateStableCommand
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11605?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11605:
-----------------------------------
Fix Version/s: 11.0.0.CR1
(was: 11.0.0.Dev05)
> Remove TopologyUpdateStableCommand
> ----------------------------------
>
> Key: ISPN-11605
> URL: https://issues.redhat.com/browse/ISPN-11605
> Project: Infinispan
> Issue Type: Task
> Components: Core
> Affects Versions: 11.0.0.Dev03
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 11.0.0.CR1
>
>
> Having a separate command for stable topologies means nodes may receive a topology update that should be stable without also receiving the confirmation that the topology is stable, complicating cluster recovery.
> As an example, when the coordinator node shuts down, it first installs a new cache topology without itself for all of the running caches.
> If the number of remaining owners is {{< numOwners}}, no rebalance is needed, and the old coordinator immediately sends a stable topology update as well.
> But any (stable) topology updates from the old coordinator not yet processed by the time of the {{CacheStatusRequestCommand}} from the new coordinator will be ignored.
> If the cluster had only 2 nodes, as in {{StatefulSetRollingUpgradeTest}} and {{StatefulSetRollingUpgradeIT}}, the stable topology update is vital to keep the cache in {{AVAILABLE}} mode, otherwise it goes into {{DEGRADED}} mode and no new nodes can join.
> {noformat}
> 19:48:14,671 TRACE (jgroups-4,Test-NodeE:[]) [JGroupsTransport] Test-NodeE received command from Test-NodeD: ConsistentHashUpdateCommand{cacheName='testCache', origin=null, currentCH=DefaultConsistentHash{ns=256, owners = (1)[Test-NodeE: 256+0]}, pendingCH=null, phase=NO_REBALANCE, actualMembers=[Test-NodeE], persistentUUIDs=[72b17309-62d1-4928-abf6-88a8606ef342], availabilityMode=AVAILABLE, rebalanceId=5, topologyId=27, viewId=7}
> 19:48:14,672 TRACE (jgroups-5,Test-NodeE:[]) [JGroupsTransport] Test-NodeE received command from Test-NodeD: StableTopologyUpdateCommand{cacheName='testCache', origin=null, currentCH=DefaultConsistentHash{ns=256, owners = (1)[Test-NodeE: 256+0]}, pendingCH=null, actualMembers=[Test-NodeE], persistentUUIDs=[72b17309-62d1-4928-abf6-88a8606ef342], rebalanceId=5, topologyId=27, viewId=7}
> 19:48:14,675 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[]) [LocalTopologyManagerImpl] Acquired cache status testCache
> 19:48:14,675 DEBUG (non-blocking-thread-Test-NodeE-p9541-t5:[]) [LocalTopologyManagerImpl] Updating local topology for cache testCache: CacheTopology{id=27, phase=NO_REBALANCE, rebalanceId=5, currentCH=DefaultConsistentHash{ns=256, owners = (1)[Test-NodeE: 256+0]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeE], persistentUUIDs=[72b17309-62d1-4928-abf6-88a8606ef342]}
> 19:48:14,686 INFO (jgroups-5,Test-NodeE:[]) [CLUSTER] ISPN000094: Received new cluster view for channel org.infinispan.statetransfer.Test: [Test-NodeE|8] (1) [Test-NodeE]
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[]) [LocalTopologyManagerImpl] Released cache status testCache
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[Merge-8]) [PreferConsistencyStrategy] Max stable partition topology: CacheTopology{id=25, phase=NO_REBALANCE, rebalanceId=5, currentCH=DefaultConsistentHash{ns=256, owners = (2)[Test-NodeD: 129+127, Test-NodeE: 127+129]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeD, Test-NodeE], persistentUUIDs=[099401f9-c0b6-460d-8dc8-5051699e8287, 72b17309-62d1-4928-abf6-88a8606ef342]}
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[Merge-8]) [PreferConsistencyStrategy] Max active partition topology: CacheTopology{id=27, phase=NO_REBALANCE, rebalanceId=5, currentCH=DefaultConsistentHash{ns=256, owners = (1)[Test-NodeE: 256+0]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeE], persistentUUIDs=[72b17309-62d1-4928-abf6-88a8606ef342]}
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t3:[]) [LocalTopologyManagerImpl] Acquired cache status testCache
> 19:48:14,690 DEBUG (non-blocking-thread-Test-NodeE-p9541-t3:[]) [LocalTopologyManagerImpl] Ignoring topology 27 for cache testCache from old coordinator Test-NodeD
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t3:[]) [LocalTopologyManagerImpl] Released cache status testCache
> 19:48:14,704 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[Merge-8]) [ClusterCacheStatus] Cache testCache availability changed: AVAILABLE -> DEGRADED_MODE
> 19:48:25,194 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.statetransfer.StatefulSetRollingUpgradeTest.testStateTransferRestart[nodes=2]
> org.infinispan.commons.CacheException: Initial state transfer timed out for cache testCache on Test-NodeF
> at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:240) ~[classes/:?]
> at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:1049) ~[classes/:?]
> at org.infinispan.cache.impl.AbstractDelegatingCache.start(AbstractDelegatingCache.java:513) ~[classes/:?]
> at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:692) ~[classes/:?]
> at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:631) ~[classes/:?]
> at org.infinispan.manager.DefaultCacheManager.internalGetCache(DefaultCacheManager.java:516) ~[classes/:?]
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:497) ~[classes/:?]
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:490) ~[classes/:?]
> at org.infinispan.test.MultipleCacheManagersTest.getCaches(MultipleCacheManagersTest.java:322) ~[test-classes/:?]
> at org.infinispan.test.MultipleCacheManagersTest.waitForClusterToForm(MultipleCacheManagersTest.java:329) ~[test-classes/:?]
> at org.infinispan.statetransfer.StatefulSetRollingUpgradeTest.testStateTransferRestart(StatefulSetRollingUpgradeTest.java:78) ~[test-classes/:?]
> {noformat}
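The race described above can be reproduced in a toy model. {{ToyNode}} and its methods are assumptions made for this sketch and do not mirror {{LocalTopologyManagerImpl}}; the combined update is one possible alternative to a separate stable-topology command, not necessarily the fix that was adopted. The model shows that two separate messages can be split by a status request from a new coordinator, while a single message is applied or ignored as a whole.

```java
// Toy model of a node receiving topology updates; hypothetical, for illustration only.
final class ToyNode {
    private int currentTopologyId;
    private int stableTopologyId;
    private int minAcceptedViewId; // updates sent in older views are ignored

    ToyNode(int topologyId) {
        this.currentTopologyId = topologyId;
        this.stableTopologyId = topologyId;
    }

    // A status request from a new coordinator makes the node ignore older coordinators.
    void onStatusRequest(int newViewId) {
        minAcceptedViewId = newViewId;
    }

    // Separate commands: each one can be accepted or ignored independently.
    boolean onTopologyUpdate(int viewId, int topologyId) {
        if (viewId < minAcceptedViewId) return false;
        currentTopologyId = topologyId;
        return true;
    }

    boolean onStableUpdate(int viewId, int topologyId) {
        if (viewId < minAcceptedViewId) return false;
        stableTopologyId = topologyId;
        return true;
    }

    // Combined command (assumed alternative): topology and stable flag travel together.
    boolean onCombinedUpdate(int viewId, int topologyId, boolean stable) {
        if (viewId < minAcceptedViewId) return false;
        currentTopologyId = topologyId;
        if (stable) stableTopologyId = topologyId;
        return true;
    }

    int currentTopologyId() { return currentTopologyId; }
    int stableTopologyId() { return stableTopologyId; }
}
```

Using the ids from the log: a node at topology 25 accepts the update to topology 27 sent in view 7, then receives a status request for view 8, and the stable update sent in view 7 is ignored, leaving the current topology at 27 and the stable topology at 25.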