[JBoss JIRA] (ISPN-2445) Safe shutdown - transfer data to other nodes
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-2445?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-2445:
----------------------------------
Sprint: Sprint 9.4.0.Beta1 (was: Sprint 9.4.0.Alpha1)
> Safe shutdown - transfer data to other nodes
> --------------------------------------------
>
> Key: ISPN-2445
> URL: https://issues.jboss.org/browse/ISPN-2445
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Reporter: Matej Lazar
> Assignee: Dan Berindei
>
> Distributed cache should provide an configuration option to block shutdown until data from shutting down node is transferred to other nodes.
> This feature is useful if Infinispan is used as persistent storage.
> Without this option, in a cloud environment, where server node is deleted on scale down event, users are forced to use external storage, if they don't want to loose data.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 8 months
[JBoss JIRA] (ISPN-8691) Infinispan rejects to read cache file bigger than 2147483647 (Integer.MAX_VALUE)
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-8691?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-8691:
----------------------------------
Sprint: Sprint 9.4.0.CR1 (was: Sprint 9.4.0.Alpha1)
> Infinispan rejects to read cache file bigger than 2147483647 (Integer.MAX_VALUE)
> --------------------------------------------------------------------------------
>
> Key: ISPN-8691
> URL: https://issues.jboss.org/browse/ISPN-8691
> Project: Infinispan
> Issue Type: Bug
> Components: Loaders and Stores
> Affects Versions: 9.1.1.Final
> Reporter: Dmitry Katsubo
> Assignee: Ryan Emerson
> Priority: Minor
> Fix For: 9.4.0.Final
>
>
> In my scenario the cache file size created by {{SingleFileStore}} is 3.054.196.342 bytes. When this file is tried to be loaded, it fails with the following exception:
> {code}
> Caused by: org.infinispan.persistence.spi.PersistenceException: ISPN000279: Failed to read stored entries from file. Error in file /work/search-service-layer_data/infinispan/cache_test_dk83146/markupCache.dat at offset 4
> at org.infinispan.persistence.file.SingleFileStore.rebuildIndex(SingleFileStore.java:182)
> at org.infinispan.persistence.file.SingleFileStore.start(SingleFileStore.java:127)
> ... 155 more
> {code}
> Cache file content:
> {code}
> 0000000000: 46 43 53 31 80 B1 89 47 │ 00 00 00 00 00 00 00 00 FCS1?+%G
> 0000000010: 00 00 00 00 FF FF FF FF │ FF FF FF FF 02 15 4E 06 yyyyyyyy☻§N♠
> 0000000020: 05 03 04 09 00 00 00 2F │ 6F 72 67 2E 73 70 72 69 ♣♥♦○ /org.spri
> 0000000030: 6E 67 66 72 61 6D 65 77 │ 6F 72 6B 2E 63 61 63 68 ngframework.cach
> 0000000040: 65 2E 69 6E 74 65 72 63 │ 65 70 74 6F 72 2E 53 69 e.interceptor.Si
> 0000000050: 6D 70 6C 65 4B 65 79 4C │ 0A 57 03 6B 6D 93 D8 00 mpleKeyL◙W♥km"O
> 0000000060: 00 00 02 00 00 00 08 68 │ 61 73 68 43 6F 64 65 23 ☻ ◘hashCode#
> 0000000070: 00 00 00 00 06 70 61 72 │ 61 6D 73 16 00 16 15 E6 ♠params▬ ▬§?
> {code}
> The problem is that integer value 0x80B18947 is treated as signed integer in line {{SingleFileStore:181}}, hence in expression
> {code}
> if (fe.size < KEY_POS + fe.keyLen + fe.dataLen + fe.metadataLen) {
> throw log.errorReadingFileStore(file.getPath(), filePos);
> }
> {code}
> {{fe.size}} is negative and equal to -2135848633.
> I have tried to configure the persistence storage so that it gets purged on start:
> {code}
> <persistence passivation="true">
> <file-store path="/var/cache/infinispan" purge="true">
> <write-behind thread-pool-size="5" />
> </file-store>
> </persistence>
> {code}
> however this does not help as storage is first read and then purged (see also ISPN-7186).
> It is expected that {{SingleFileStore}} either does not allow to write such big entries to the cache, or handles them correctly.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 8 months
[JBoss JIRA] (ISPN-9257) ClustertopologyManagerTest.testAbruptLeaveAfterGetStatus2[SCATTERED_SYNC, tx=false] random failures
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-9257?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-9257:
----------------------------------
Sprint: Sprint 9.3.0.Final, Sprint 9.4.0.Beta1 (was: Sprint 9.3.0.Final)
> ClustertopologyManagerTest.testAbruptLeaveAfterGetStatus2[SCATTERED_SYNC, tx=false] random failures
> ---------------------------------------------------------------------------------------------------
>
> Key: ISPN-9257
> URL: https://issues.jboss.org/browse/ISPN-9257
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 9.4.0.Final
>
> Attachments: ISPN-8731_wrong_topology_2018-05-18_ClusterTopologyManagerTest-infinispan-core.log.gz
>
>
> The test kills the coordinator NodeA, then while NodeB is trying to recover the caches it also kills NodeC. It expects NodeB to start a rebalance with 2 nodes and discards it, in order to test that it can process the 1-node rebalance first:
> {noformat}
> 00:34:06,582 DEBUG (transport-thread-test-NodeB-p12-t6:[testCache]) [ClusterTopologyManagerTest] Discarding rebalance command CacheTopology{id=8, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (2)[test-NodeB-49590: 85, test-NodeC-58596: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=true, owners = (2)[test-NodeB-49590: 128, test-NodeC-58596: 128]}, unionCH=null, actualMembers=[test-NodeB-49590, test-NodeC-58596], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888, d47dc4a9-2a95-4bb1-a83b-bb8a27c9999f]}
> 00:34:06,609 DEBUG (transport-thread-test-NodeB-p12-t2:[Topology-testCache]) [LocalTopologyManagerImpl] Updating local topology for cache testCache: CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 128]}, unionCH=null, actualMembers=[test-NodeB-49590], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888]}
> 00:34:06,609 DEBUG (transport-thread-test-NodeB-p12-t2:[Topology-testCache]) [LocalTopologyManagerImpl] Installing fake cache topology CacheTopology{id=8, phase=NO_REBALANCE, rebalanceId=4, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 85]}, pendingCH=null, unionCH=null, actualMembers=[test-NodeB-49590], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888]} for cache testCache
> {noformat}
> Unfortunately {{PreferAvailabilityStrategy}} has changed a bit and the rebalance ids don't always match the expectations of the test, so that the 1-node rebalance is discarded instead:
> {noformat}
> 09:46:10,530 DEBUG (transport-thread-Test-NodeB-p54539-t3:[testCache]) [Test] Discarding rebalance command CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[Test-NodeB-62039: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=true, owners = (1)[Test-NodeB-62039: 256]}, unionCH=null, actualMembers=[Test-NodeB-62039], persistentUUIDs=[0ed7be74-4485-489b-baee-28c461c9e5de]}
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 8 months
[JBoss JIRA] (ISPN-9270) ClusteredLockWith2NodesTest.testTryLockAndKillLocking random failures
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-9270?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-9270:
----------------------------------
Sprint: Sprint 9.3.0.Final, Sprint 9.4.0.Beta1 (was: Sprint 9.3.0.Final)
> ClusteredLockWith2NodesTest.testTryLockAndKillLocking random failures
> ---------------------------------------------------------------------
>
> Key: ISPN-9270
> URL: https://issues.jboss.org/browse/ISPN-9270
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.4.0.Final
>
>
> The test stops the coordinator A of a 2-node cluster \[A, B\], and then verifies that B stays available.
> Unfortunately that doesn't always happen: A tries to install topology \[B\] before shutting down, but B might receive it too late and reject it. If B expects members \[A, B\] and sees only B, the cache will enter degraded mode:
> {noformat}
> 19:04:05,863 INFO (jgroups-8,Test-NodeB-21374:[org.infinispan.LOCKS]) [CLUSTER] [Context=org.infinispan.LOCKS] ISPN100008: Updating cache members list [Test-NodeD-51051], topology id 7
> 19:04:05,863 TRACE (jgroups-4,Test-NodeD-51051:[]) [JGroupsTransport] Test-NodeD-51051 received command from Test-NodeB-21374: CacheTopologyControlCommand{cache=org.infinispan.LOCKS, type=CH_UPDATE, sender=Test-NodeB-21374, joinInfo=null, topologyId=7, rebalanceId=2, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeD-51051: 256]}, pendingCH=null, availabilityMode=AVAILABLE, phase=NO_REBALANCE, actualMembers=[Test-NodeD-51051], throwable=null, viewId=1}
> 19:04:05,880 INFO (jgroups-4,Test-NodeD-51051:[]) [CLUSTER] ISPN000094: Received new cluster view for channel ISPN: [Test-NodeD-51051|2] (1) [Test-NodeD-51051]
> 19:04:05,884 DEBUG (stateTransferExecutor-thread-Test-NodeD-p101-t1:[Merge-2]) [ClusterCacheStatus] Recovered 1 partition(s) for cache org.infinispan.LOCKS: [CacheTopology{id=6, phase=NO_REBALANCE, rebalanceId=2, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeB-21374: 124, Test-NodeD-51051: 132]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeD-51051], persistentUUIDs=[6ac3a341-5af4-4aa2-ae7b-a4183d02d959]}]
> 19:04:05,884 WARN (stateTransferExecutor-thread-Test-NodeD-p101-t1:[Merge-2]) [CLUSTER] [Context=org.infinispan.LOCKS] ISPN000320: After merge (or coordinator change), cache still hasn't recovered a majority of members and must stay in degraded mode. Current members are [Test-NodeD-51051], lost members are [Test-NodeB-21374], stable members are [Test-NodeB-21374, Test-NodeD-51051]
> 19:04:05,885 INFO (stateTransferExecutor-thread-Test-NodeD-p101-t1:[Merge-2]) [CLUSTER] [Context=org.infinispan.LOCKS] ISPN100011: Entering availability mode DEGRADED_MODE, topology id 8
> 19:04:05,941 DEBUG (transport-thread-Test-NodeD-p100-t2:[Topology-org.infinispan.LOCKS]) [LocalTopologyManagerImpl] Ignoring topology 7 for cache org.infinispan.LOCKS from old coordinator Test-NodeB-21374
> 19:04:06,024 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.lock.ClusteredLockWith2NodesTest.testTryLockAndKillLocking
> java.lang.Error: java.util.concurrent.ExecutionException: java.lang.Error: java.util.concurrent.ExecutionException: org.infinispan.lock.exception.ClusteredLockException: org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'ClusteredLockKey{name=Test}' is not available. Not all owners are in this partition
> at org.infinispan.functional.FunctionalTestUtils.await(FunctionalTestUtils.java:47) ~[infinispan-core-9.3.0-SNAPSHOT-tests.jar:9.3.0-SNAPSHOT]
> at org.infinispan.lock.ClusteredLockWith2NodesTest.testTryLockAndKillLocking(ClusteredLockWith2NodesTest.java:49) ~[test-classes/:?]
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 8 months
[JBoss JIRA] (ISPN-9276) FunctionalEncodingTypeTest.testDistReturnViewFromReadWriteEvalOnNonOwner[tx=true] always fails
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-9276?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-9276:
----------------------------------
Sprint: Sprint 9.3.0.Final, Sprint 9.4.0.Beta1 (was: Sprint 9.3.0.Final)
> FunctionalEncodingTypeTest.testDistReturnViewFromReadWriteEvalOnNonOwner[tx=true] always fails
> ----------------------------------------------------------------------------------------------
>
> Key: ISPN-9276
> URL: https://issues.jboss.org/browse/ISPN-9276
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Radim Vansa
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 9.4.0.Final
>
>
> The test is not currently running during the build because of ISPN-9149, but fails when run manually:
> {noformat}
> java.lang.Error: java.util.concurrent.ExecutionException: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from FunctionalEncodingTypeTest[tx=true]-NodeB-35039, see cause for remote stack trace
> at org.infinispan.functional.FunctionalTestUtils.await(FunctionalTestUtils.java:47)
> at org.infinispan.functional.FunctionalMapTest.doReturnViewFromReadWriteEval(FunctionalMapTest.java:595)
> at org.infinispan.functional.FunctionalMapTest.testDistReturnViewFromReadWriteEvalOnNonOwner(FunctionalMapTest.java:584)
> Caused by: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from FunctionalEncodingTypeTest[tx=true]-NodeB-35039, see cause for remote stack trace
> at org.infinispan.remoting.transport.ResponseCollectors.wrapRemoteException(ResponseCollectors.java:27)
> at org.infinispan.remoting.transport.RemoteGetResponseCollector.addResponse(RemoteGetResponseCollector.java:26)
> at org.infinispan.remoting.transport.RemoteGetResponseCollector.addResponse(RemoteGetResponseCollector.java:17)
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:91)
> at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1364)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1267)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:125)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1412)
> at org.jgroups.JChannel.up(JChannel.java:816)
> Caused by: org.infinispan.commons.marshall.NotSerializableException: org.infinispan.functional.impl.EntryViews$EntryBackedReadWriteView
> Caused by: an exception which occurred:
> in object org.infinispan.functional.impl.EntryViews$EntryBackedReadWriteView@72d6a840
> -> toString = EntryBackedReadWriteView{entry=VersionedRepeatableReadEntry(39ed5af7){key=TestKey#MagicKey{778/3FB8CBF3/178@FunctionalEncodingTypeTest[tx=true]-NodeB-35039}, value=TestValue#one, isCreated=true, isChanged=true, isRemoved=false, isExpired=false, skipLookup=true, metadata=MetaParamsInternalMetadata{params=MetaParams{length=1, metas=[MetaEntryVersion=SimpleClusteredVersion{topologyId=0, version=0}]}}}}
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 8 months