[JBoss JIRA] (ISPN-2575) Key Transformer registration is required on all nodes of the cluster, in case of custom keys
by Tristan Tarrant (Jira)
[ https://issues.jboss.org/browse/ISPN-2575?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-2575:
----------------------------------
Fix Version/s: 10.0.0.Alpha1
(was: 9.4.1.Final)
> Key Transformer registration is required on all nodes of the cluster, in case of custom keys
> --------------------------------------------------------------------------------------------
>
> Key: ISPN-2575
> URL: https://issues.jboss.org/browse/ISPN-2575
> Project: Infinispan
> Issue Type: Enhancement
> Components: Embedded Querying
> Affects Versions: 5.2.0.Beta5
> Reporter: Anna Manukyan
> Assignee: Adrian Nistor
> Priority: Minor
> Fix For: 10.0.0.Alpha1
>
> Attachments: ClusteredCacheTest.java
>
>
> The case is the following:
> I have a clustered cache on which I want to perform a search. I'm doing the following:
> I'm initializing SearchManager on node1, I'm registering a custom key transformer for my key using the created searchmanager, but then when I'm trying to put data into the cache on node1 (which is in REPL_SYNC mode with cache on node2), I'm getting the exception:
> java.lang.IllegalArgumentException: Indexing only works with entries keyed on Strings, primitives and classes that have the @Transformable annotation - you passed in a class org.infinispan.query.test.CustomKey3. Alternatively, see org.infinispan.query.SearchManager#registerKeyTransformer
> When I'm initializing the SearchManager using node2 cache and register the keyTransformer on it as well, then everything works perfectly, even though I am not using the second created SearchManager.
> The test which reproduces the issue is attached to the jira. Please see the test case: testSearchKeyTransformer()
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 6 months
[JBoss JIRA] (ISPN-8691) Infinispan rejects to read cache file bigger than 2147483647 (Integer.MAX_VALUE)
by Tristan Tarrant (Jira)
[ https://issues.jboss.org/browse/ISPN-8691?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-8691:
----------------------------------
Sprint: Sprint 9.4.0.CR1, Sprint 9.4.0.CR3, Sprint 10.0.0.Alpha1, Sprint 9.4.0.Final (was: Sprint 9.4.0.CR1, Sprint 9.4.0.CR3, Sprint 9.4.0.Final, Sprint 10.0.0.Alpha0)
> Infinispan rejects to read cache file bigger than 2147483647 (Integer.MAX_VALUE)
> --------------------------------------------------------------------------------
>
> Key: ISPN-8691
> URL: https://issues.jboss.org/browse/ISPN-8691
> Project: Infinispan
> Issue Type: Bug
> Components: Loaders and Stores
> Affects Versions: 9.1.1.Final
> Reporter: Dmitry Katsubo
> Assignee: Ryan Emerson
> Priority: Minor
> Fix For: 9.4.1.Final
>
>
> In my scenario the cache file size created by {{SingleFileStore}} is 3.054.196.342 bytes. When this file is tried to be loaded, it fails with the following exception:
> {code}
> Caused by: org.infinispan.persistence.spi.PersistenceException: ISPN000279: Failed to read stored entries from file. Error in file /work/search-service-layer_data/infinispan/cache_test_dk83146/markupCache.dat at offset 4
> at org.infinispan.persistence.file.SingleFileStore.rebuildIndex(SingleFileStore.java:182)
> at org.infinispan.persistence.file.SingleFileStore.start(SingleFileStore.java:127)
> ... 155 more
> {code}
> Cache file content:
> {code}
> 0000000000: 46 43 53 31 80 B1 89 47 │ 00 00 00 00 00 00 00 00 FCS1?+%G
> 0000000010: 00 00 00 00 FF FF FF FF │ FF FF FF FF 02 15 4E 06 yyyyyyyy☻§N♠
> 0000000020: 05 03 04 09 00 00 00 2F │ 6F 72 67 2E 73 70 72 69 ♣♥♦○ /org.spri
> 0000000030: 6E 67 66 72 61 6D 65 77 │ 6F 72 6B 2E 63 61 63 68 ngframework.cach
> 0000000040: 65 2E 69 6E 74 65 72 63 │ 65 70 74 6F 72 2E 53 69 e.interceptor.Si
> 0000000050: 6D 70 6C 65 4B 65 79 4C │ 0A 57 03 6B 6D 93 D8 00 mpleKeyL◙W♥km"O
> 0000000060: 00 00 02 00 00 00 08 68 │ 61 73 68 43 6F 64 65 23 ☻ ◘hashCode#
> 0000000070: 00 00 00 00 06 70 61 72 │ 61 6D 73 16 00 16 15 E6 ♠params▬ ▬§?
> {code}
> The problem is that integer value 0x80B18947 is treated as signed integer in line {{SingleFileStore:181}}, hence in expression
> {code}
> if (fe.size < KEY_POS + fe.keyLen + fe.dataLen + fe.metadataLen) {
> throw log.errorReadingFileStore(file.getPath(), filePos);
> }
> {code}
> {{fe.size}} is negative and equal to -2135848633.
> I have tried to configure the persistence storage so that it gets purged on start:
> {code}
> <persistence passivation="true">
> <file-store path="/var/cache/infinispan" purge="true">
> <write-behind thread-pool-size="5" />
> </file-store>
> </persistence>
> {code}
> however this does not help as storage is first read and then purged (see also ISPN-7186).
> It is expected that {{SingleFileStore}} either does not allow to write such big entries to the cache, or handles them correctly.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 6 months
[JBoss JIRA] (ISPN-9589) Client Listener notifications can deadlock
by Tristan Tarrant (Jira)
[ https://issues.jboss.org/browse/ISPN-9589?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-9589:
----------------------------------
Sprint: Sprint 10.0.0.Alpha0
> Client Listener notifications can deadlock
> ------------------------------------------
>
> Key: ISPN-9589
> URL: https://issues.jboss.org/browse/ISPN-9589
> Project: Infinispan
> Issue Type: Bug
> Components: Listeners, Server
> Reporter: William Burns
> Priority: Critical
>
> When a node does a write and it is the primary owner it will send the client listener event. Unfortunately this can block waiting to put in the eventQueue. Unfortunately the thread that is executing this method can be the event loop for that listener, which in turn can cause it to block forever. I believe this came about after ISPN-8621.
> I have found a few ways to fix this at the moment:
> # Make the client listener async - this way it is fired on the notification thread
> # Change the listener event to check if the current thread is the event loop and subsequently fire the events directly instead of submitting it to the Executor.
> # Make the various write operations still use the worker thread pool - this was changed in the above mentioned JIRA so I am guessing we don't want to do that
> The former of the fixes seems to be much more performant, but I don't know if I like that fix. The latter still has hiccups since it reduces availability of threads to respond to socket requests.
> This is very easy to reproduce by running the ClientEventStressTest test on master as is since it only has 3 io threads (but the default is CPU * 2 - so in some cases this isn't that much)
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 6 months
[JBoss JIRA] (ISPN-2445) Safe shutdown - transfer data to other nodes
by Tristan Tarrant (Jira)
[ https://issues.jboss.org/browse/ISPN-2445?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-2445:
----------------------------------
Sprint: Sprint 9.4.0.Beta1, Sprint 9.4.0.CR1, Sprint 9.4.0.CR3, Sprint 10.0.0.Alpha1 (was: Sprint 9.4.0.Beta1, Sprint 9.4.0.CR1, Sprint 9.4.0.CR3, Sprint 10.0.0.Alpha0)
> Safe shutdown - transfer data to other nodes
> --------------------------------------------
>
> Key: ISPN-2445
> URL: https://issues.jboss.org/browse/ISPN-2445
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Reporter: Matej Lazar
> Assignee: Dan Berindei
> Priority: Major
>
> Distributed cache should provide an configuration option to block shutdown until data from shutting down node is transferred to other nodes.
> This feature is useful if Infinispan is used as persistent storage.
> Without this option, in a cloud environment, where server node is deleted on scale down event, users are forced to use external storage, if they don't want to loose data.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 6 months
[JBoss JIRA] (ISPN-9588) JGroups fails to install new cluster view after coordinator leave
by Tristan Tarrant (Jira)
[ https://issues.jboss.org/browse/ISPN-9588?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-9588:
----------------------------------
Sprint: Sprint 10.0.0.Alpha0 (was: Sprint 10.0.0.Alpha1)
> JGroups fails to install new cluster view after coordinator leave
> -----------------------------------------------------------------
>
> Key: ISPN-9588
> URL: https://issues.jboss.org/browse/ISPN-9588
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 9.4.0.CR3
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.0.0.Alpha1
>
> Attachments: ISPN-9127_ScatteredCrashInSequenceTest-infinispan-core.log.gz, ISPN-9127_ThreeWaySplitAndMergeTest-infinispan-core.log.gz
>
>
> The core test suite normally shuts down cache managers in the reverse order of their start, so that the coordinator stops last. This should be just an optimization, to reduce the number of coordinator changes, and with it the test suite duration and log size.
> Unfortunately, the few tests that stop the coordinator first are routinely failing to stop the cluster properly, usually when there are 2 nodes left in the cluster the 2nd node doesn't receive the view:
> {noformat}
> 10:08:01,656 DEBUG (jgroups-26,Test-NodeB-20661:[]) [GMS] Test-NodeB-20661: installing view [Test-NodeC-25266|33] (3) [Test-NodeC-25266, Test-NodeA-8936, Test-NodeB-20661]
> 10:08:01,862 TRACE (testng-Test:[]) [GMS] Test-NodeC-25266: sending LEAVE request to Test-NodeA-8936
> 10:08:01,863 DEBUG (jgroups-6,Test-NodeA-8936:[]) [GMS] Test-NodeA-8936: members are (3) Test-NodeC-25266,Test-NodeA-8936,Test-NodeB-20661, coord=Test-NodeA-8936: I'm the new coordinator
> 10:08:01,896 TRACE (jgroups-6,Test-NodeA-8936:[]) [GMS] Test-NodeA-8936: joiners=[], suspected=[], leaving=[Test-NodeC-25266], new view: [Test-NodeA-8936|34] (2) [Test-NodeA-8936, Test-NodeB-20661]
> 10:08:01,896 TRACE (jgroups-6,Test-NodeA-8936:[]) [GMS] Test-NodeA-8936: mcasting view [Test-NodeA-8936|34] (2) [Test-NodeA-8936, Test-NodeB-20661]
> 10:08:01,900 DEBUG (jgroups-6,Test-NodeA-8936:[]) [GMS] Test-NodeA-8936: installing view [Test-NodeA-8936|34] (2) [Test-NodeA-8936, Test-NodeB-20661]
> 10:08:02,245 TRACE (testng-Test:[]) [JGroupsTransport] Test-NodeB-20661 sending request 140 to Test-NodeC-25266: CacheTopologyControlCommand{cache=org.infinispan.CONFIG, type=LEAVE, sender=Test-NodeB-20661, joinInfo=null, topologyId=0, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, phase=null, actualMembers=null, throwable=null, viewId=33}
> 10:12:02,247 DEBUG (testng-Test:[]) [LocalTopologyManagerImpl] Error sending the leave request for cache org.infinispan.CONFIG to coordinator
> org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 140 from 7d5d965d-eacc-fd47-2ab8-5cec95f4c10b
> {noformat}
> Sometimes the new coordinator gets stuck in a merge loop, keeps trying to re-include the stopped node and fails:
> {noformat}
> 10:06:21,484 DEBUG (jgroups-24,Test-NodeC-57941:[]) [GMS] Test-NodeC-57941: installing view [Test-NodeD-37696|8] (3) [Test-NodeD-37696, Test-NodeB-39137, Test-NodeC-57941]
> 10:06:21,486 DEBUG (jgroups-25,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: installing view [Test-NodeD-37696|8] (3) [Test-NodeD-37696, Test-NodeB-39137, Test-NodeC-57941]
> 10:06:22,070 TRACE (testng-Test:[]) [GMS] Test-NodeD-37696: sending LEAVE request to Test-NodeB-39137
> 10:06:22,078 TRACE (jgroups-10,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: mcasting view [Test-NodeB-39137|9] (2) [Test-NodeB-39137, Test-NodeC-57941]
> 10:06:22,080 DEBUG (jgroups-10,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: installing view [Test-NodeB-39137|9] (2) [Test-NodeB-39137, Test-NodeC-57941]
> 10:06:30,887 DEBUG (jgroups-24,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: I will be the merge leader. Starting the merge task. Views: {Test-NodeB-39137=[Test-NodeB-39137|9] (2) [Test-NodeB-39137, Test-NodeC-57941], Test-NodeC-57941=[Test-NodeD-37696|8] (3) [Test-NodeD-37696, Test-NodeB-39137, Test-NodeC-57941]}
> 10:06:30,888 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [STABLE] suspending message garbage collection
> 10:06:30,888 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [STABLE] Test-NodeB-39137: resume task started, max_suspend_time=220000
> 10:06:30,888 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge task Test-NodeB-39137::7 started with 2 participants
> 10:06:30,888 TRACE (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: sending MERGE_REQ to [Test-NodeD-37696, Test-NodeB-39137]
> 10:06:36,041 TRACE (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: collected 1 merge response(s) in 5153 ms
> 10:06:36,041 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:06:36,043 TRACE (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: consolidated view=MergeView::[Test-NodeB-39137|10] (2) [Test-NodeB-39137, Test-NodeC-57941], 1 subgroups: [Test-NodeB-39137|9] (2) [Test-NodeB-39137, Test-NodeC-57941]
> consolidated digest=Test-NodeB-39137: [15 (16)], Test-NodeC-57941: [5 (5)]
> 10:06:36,043 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: installing merge view [Test-NodeB-39137|10] (2 members) in 1 coords
> 10:06:36,043 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge Test-NodeB-39137::7 took 5155 ms
> 10:06:36,043 TRACE (jgroups-24,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: mcasting view MergeView::[Test-NodeB-39137|10] (2) [Test-NodeB-39137, Test-NodeC-57941], 1 subgroups: [Test-NodeB-39137|9] (2) [Test-NodeB-39137, Test-NodeC-57941]
> 10:06:49,877 DEBUG (MergeTask-229,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:07:03,755 DEBUG (MergeTask-342,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:07:17,703 DEBUG (MergeTask-447,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:07:31,547 DEBUG (MergeTask-568,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:07:45,381 DEBUG (MergeTask-688,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:07:59,196 DEBUG (MergeTask-806,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:08:13,031 DEBUG (MergeTask-932,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:08:26,885 DEBUG (MergeTask-1061,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:08:40,697 DEBUG (MergeTask-1197,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:08:54,510 DEBUG (MergeTask-1330,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:09:08,324 DEBUG (MergeTask-1463,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:09:22,135 DEBUG (MergeTask-1596,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:09:35,953 DEBUG (MergeTask-1729,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:09:49,770 DEBUG (MergeTask-1857,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:10:03,609 DEBUG (MergeTask-1986,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:10:17,420 DEBUG (MergeTask-2117,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:10:31,252 DEBUG (MergeTask-2248,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:10:45,099 DEBUG (MergeTask-2383,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:10:58,916 DEBUG (MergeTask-2516,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:11:12,743 DEBUG (MergeTask-2648,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> {noformat}
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 6 months
[JBoss JIRA] (ISPN-9257) ClustertopologyManagerTest.testAbruptLeaveAfterGetStatus2[SCATTERED_SYNC, tx=false] random failures
by Tristan Tarrant (Jira)
[ https://issues.jboss.org/browse/ISPN-9257?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-9257:
----------------------------------
Priority: Minor (was: Critical)
> ClustertopologyManagerTest.testAbruptLeaveAfterGetStatus2[SCATTERED_SYNC, tx=false] random failures
> ---------------------------------------------------------------------------------------------------
>
> Key: ISPN-9257
> URL: https://issues.jboss.org/browse/ISPN-9257
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Minor
> Labels: testsuite_stability
> Fix For: 9.4.1.Final
>
> Attachments: ISPN-8731_wrong_topology_2018-05-18_ClusterTopologyManagerTest-infinispan-core.log.gz
>
>
> The test kills the coordinator NodeA, then while NodeB is trying to recover the caches it also kills NodeC. It expects NodeB to start a rebalance with 2 nodes and discards it, in order to test that it can process the 1-node rebalance first:
> {noformat}
> 00:34:06,582 DEBUG (transport-thread-test-NodeB-p12-t6:[testCache]) [ClusterTopologyManagerTest] Discarding rebalance command CacheTopology{id=8, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (2)[test-NodeB-49590: 85, test-NodeC-58596: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=true, owners = (2)[test-NodeB-49590: 128, test-NodeC-58596: 128]}, unionCH=null, actualMembers=[test-NodeB-49590, test-NodeC-58596], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888, d47dc4a9-2a95-4bb1-a83b-bb8a27c9999f]}
> 00:34:06,609 DEBUG (transport-thread-test-NodeB-p12-t2:[Topology-testCache]) [LocalTopologyManagerImpl] Updating local topology for cache testCache: CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 128]}, unionCH=null, actualMembers=[test-NodeB-49590], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888]}
> 00:34:06,609 DEBUG (transport-thread-test-NodeB-p12-t2:[Topology-testCache]) [LocalTopologyManagerImpl] Installing fake cache topology CacheTopology{id=8, phase=NO_REBALANCE, rebalanceId=4, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 85]}, pendingCH=null, unionCH=null, actualMembers=[test-NodeB-49590], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888]} for cache testCache
> {noformat}
> Unfortunately {{PreferAvailabilityStrategy}} has changed a bit and the rebalance ids don't always match the expectations of the test, so that the 1-node rebalance is discarded instead:
> {noformat}
> 09:46:10,530 DEBUG (transport-thread-Test-NodeB-p54539-t3:[testCache]) [Test] Discarding rebalance command CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[Test-NodeB-62039: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=true, owners = (1)[Test-NodeB-62039: 256]}, unionCH=null, actualMembers=[Test-NodeB-62039], persistentUUIDs=[0ed7be74-4485-489b-baee-28c461c9e5de]}
> {noformat}
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 6 months
[JBoss JIRA] (ISPN-9291) BasePartitionHandlingTest.Partition.installMergeView() doesn't compute the merge digest
by Tristan Tarrant (Jira)
[ https://issues.jboss.org/browse/ISPN-9291?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-9291:
----------------------------------
Priority: Minor (was: Major)
> BasePartitionHandlingTest.Partition.installMergeView() doesn't compute the merge digest
> ---------------------------------------------------------------------------------------
>
> Key: ISPN-9291
> URL: https://issues.jboss.org/browse/ISPN-9291
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Minor
> Labels: testsuite_stability
> Fix For: 9.4.1.Final
>
>
> The partition handling tests use {{BasePartitionHandlingTest.Partition.installMergeView(view1, view2)}} to install the merge view without waiting for {{MERGE3}} to run, making them much faster. Unfortunately, the implementation is incorrect: {{GMS.installView(view)}} only works for regular views, merge views need to be installed with {{GMS.installView(mergeView, digest)}}.
> The result is that the nodes that got isolated from the coordinator request the retransmission of all the {{NAKACK2}} messages (including view updates) since the cluster first started. The isolated nodes cannot install the merge view until they deliver all the older messages (even without knowing whether they're OOB or not). But if {{STABLE}} ran and cleared a range of messages already, the retransmission request cannot be satisfied, so the view updates will never be delivered.
> This is easily reproducible in {{CrashedNodeDuringConflictResolutionTest}} if we add a delay before updating the topology in {{StateConsumerImpl}}. The test installs the merge view manually, but then kills NodeC and expects the cluster to install the new view automatically. NodeD can't install the new view because it's waiting for earlier messages from NodeA:
> {noformat}
> 18:27:13,054 INFO (testng-test:[]) [TestSuiteProgress] Test starting: org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.testPartitionMergePolicy[DIST_SYNC]
> 18:27:13,640 DEBUG (testng-test:[]) [GMS] test-NodeA-39513: installing view MergeView::[test-NodeA-39513|10] (4) [test-NodeA-39513, test-NodeB-9439, test-NodeC-43706, test-NodeD-59078], 2 subgroups: [test-NodeA-39513|8] (2) [test-NodeA-39513, test-NodeB-9439], [test-NodeC-43706|9] (2) [test-NodeC-43706, test-NodeD-59078]
> 18:27:13,674 DEBUG (testng-test:[]) [GMS] test-NodeD-59078: installing view MergeView::[test-NodeA-39513|10] (4) [test-NodeA-39513, test-NodeB-9439, test-NodeC-43706, test-NodeD-59078], 2 subgroups: [test-NodeA-39513|8] (2) [test-NodeA-39513, test-NodeB-9439], [test-NodeC-43706|9] (2) [test-NodeC-43706, test-NodeD-59078]
> 18:27:13,828 TRACE (jgroups-7,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((1): {50}) to test-NodeA-39513
> 18:27:13,966 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((49): {1-49}) to test-NodeA-39513
> 18:27:14,067 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:14,504 DEBUG (testng-test:[]) [DefaultCacheManager] Stopping cache manager ISPN on test-NodeC-43706
> 18:27:18,642 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: joiners=[], suspected=[test-NodeC-43706], leaving=[], new view: [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,643 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: mcasting view [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,646 DEBUG (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: installing view [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,652 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [TCP_NIO2] test-NodeA-39513: sending msg to null, src=test-NodeA-39513, headers are GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=63], TP: [cluster_name=ISPN]
> 18:27:18,656 TRACE (jgroups-20,test-NodeA-39513:[]) [TCP_NIO2] test-NodeA-39513: received [dst: test-NodeA-39513, src: test-NodeB-9439 (3 headers), size=0 bytes, flags=OOB|INTERNAL], headers are GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=100, TP: [cluster_name=ISPN]
> 18:27:20,554 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:20,653 WARN (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: failed to collect all ACKs (expected=2) for view [test-NodeA-39513|11] after 2000ms, missing 1 ACKs from (1) test-NodeD-59078
> 18:27:20,656 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:20,756 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> ...
> 18:28:14,412 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:28:14,513 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:28:14,589 ERROR (testng-test:[]) [TestSuiteProgress] Test failed: org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.testPartitionMergePolicy[DIST_SYNC]
> java.lang.RuntimeException: Cache ___defaultcache timed out waiting for rebalancing to complete on node test-NodeA-39513, current topology is CacheTopology{id=21, phase=CONFLICT_RESOLUTION, rebalanceId=7, currentCH=PartitionerConsistentHash:DefaultConsistentHash{ns=256, owners = (3)[test-NodeD-59078: 256+0, test-NodeA-39513: 0+256, test-NodeB-9439: 0+256]}, pendingCH=null, unionCH=null, actualMembers=[test-NodeD-59078, test-NodeA-39513, test-NodeB-9439], persistentUUIDs=[828108c4-4251-49fc-9481-ff6392bea9fb, 1d4b6f07-b71b-41a1-adfb-abbe68944a9f, 3a1ece05-c282-433e-9eb5-7b3e0f1932aa]}. rebalanceInProgress=true, currentChIsBalanced=true
> at org.infinispan.test.TestingUtil.waitForNoRebalance(TestingUtil.java:392) ~[test-classes/:?]
> at org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.performMerge(CrashedNodeDuringConflictResolutionTest.java:113) ~[test-classes/:?]
> at org.infinispan.conflict.impl.BaseMergePolicyTest.testPartitionMergePolicy(BaseMergePolicyTest.java:137) ~[test-classes/:?]
> {noformat}
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 6 months
[JBoss JIRA] (ISPN-7889) BaseDistributionInterceptor.remoteGet may cause concurrency issues
by Tristan Tarrant (Jira)
[ https://issues.jboss.org/browse/ISPN-7889?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-7889:
----------------------------------
Sprint: Sprint 10.0.0.Alpha0
> BaseDistributionInterceptor.remoteGet may cause concurrency issues
> ------------------------------------------------------------------
>
> Key: ISPN-7889
> URL: https://issues.jboss.org/browse/ISPN-7889
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.1.0.Alpha1
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Major
>
> {{BaseDistributionInterceptor.remoteGet}} or any call that accesses the context in an async future handler that is called multiple times in parallel may lead to concurrent modifications of the context.
> These calls are usually handled using {{CompletableFuture.allOf()}} or using a CF with counter, but if one of the calls results in exceptional completion of the composed future, the processing continues (e.g. with a retry). The other parallel operation handlers are not stopped, though.
> {{BaseDistributionInterceptor.remoteGet}} shouldn't be called in parallel because it does not even synchronize regular successful invocations.
> A problem like this caused failures in {{GetAllCommandStressTest}}, and the issue was addressed for {{GetAllCommand}} in ISPN-7884.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 6 months
[JBoss JIRA] (ISPN-8234) Add support for @Spatial, @Latitude, @Longitude annotations in protobuf schema
by Tristan Tarrant (Jira)
[ https://issues.jboss.org/browse/ISPN-8234?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-8234:
----------------------------------
Sprint: Sprint 9.3.0.Beta1, Sprint 9.3.0.CR1, Sprint 9.4.0.Beta1, Sprint 9.4.0.CR1, Sprint 9.4.0.CR3, Sprint 10.0.0.Alpha1, Sprint 9.4.0.Final (was: Sprint 9.3.0.Beta1, Sprint 9.3.0.CR1, Sprint 9.4.0.Beta1, Sprint 9.4.0.CR1, Sprint 9.4.0.CR3, Sprint 9.4.0.Final, Sprint 10.0.0.Alpha0)
> Add support for @Spatial, @Latitude, @Longitude annotations in protobuf schema
> ------------------------------------------------------------------------------
>
> Key: ISPN-8234
> URL: https://issues.jboss.org/browse/ISPN-8234
> Project: Infinispan
> Issue Type: Feature Request
> Components: Remote Querying
> Reporter: Adrian Nistor
> Assignee: Adrian Nistor
> Priority: Major
>
> These annotations must behave identically to their Hibernate Search counterparts defined in org.hibernate.search.annotations package.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 6 months