[JBoss JIRA] (ISPN-9588) JGroups fails to install new cluster view after coordinator leave
by Pedro Zapata Fernandez (Jira)
[ https://issues.jboss.org/browse/ISPN-9588?page=com.atlassian.jira.plugin.... ]
Pedro Zapata Fernandez updated ISPN-9588:
-----------------------------------------
Sprint: Sprint 10.0.0.Alpha1, Sprint 10.0.0.Alpha2, Sprint 10.0.0.Alpha0, Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35, DataGrid Sprint #36 (was: Sprint 10.0.0.Alpha1, Sprint 10.0.0.Alpha2, Sprint 10.0.0.Alpha0, Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35)
> JGroups fails to install new cluster view after coordinator leave
> -----------------------------------------------------------------
>
> Key: ISPN-9588
> URL: https://issues.jboss.org/browse/ISPN-9588
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 9.4.0.CR3
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 9.4.7.Final, 10.0.0.Beta1
>
> Attachments: ISPN-9127_ScatteredCrashInSequenceTest-infinispan-core.log.gz, ISPN-9127_ThreeWaySplitAndMergeTest-infinispan-core.log.gz, ThreeWaySplitAndMergeTest.clearContent.log.gz
>
>
> The core test suite normally shuts down cache managers in the reverse order of their start, so that the coordinator stops last. This should be just an optimization, to reduce the number of coordinator changes, and with it the test suite duration and log size.
> Unfortunately, the few tests that stop the coordinator first are routinely failing to stop the cluster properly, usually when there are 2 nodes left in the cluster the 2nd node doesn't receive the view:
> {noformat}
> 10:08:01,656 DEBUG (jgroups-26,Test-NodeB-20661:[]) [GMS] Test-NodeB-20661: installing view [Test-NodeC-25266|33] (3) [Test-NodeC-25266, Test-NodeA-8936, Test-NodeB-20661]
> 10:08:01,862 TRACE (testng-Test:[]) [GMS] Test-NodeC-25266: sending LEAVE request to Test-NodeA-8936
> 10:08:01,863 DEBUG (jgroups-6,Test-NodeA-8936:[]) [GMS] Test-NodeA-8936: members are (3) Test-NodeC-25266,Test-NodeA-8936,Test-NodeB-20661, coord=Test-NodeA-8936: I'm the new coordinator
> 10:08:01,896 TRACE (jgroups-6,Test-NodeA-8936:[]) [GMS] Test-NodeA-8936: joiners=[], suspected=[], leaving=[Test-NodeC-25266], new view: [Test-NodeA-8936|34] (2) [Test-NodeA-8936, Test-NodeB-20661]
> 10:08:01,896 TRACE (jgroups-6,Test-NodeA-8936:[]) [GMS] Test-NodeA-8936: mcasting view [Test-NodeA-8936|34] (2) [Test-NodeA-8936, Test-NodeB-20661]
> 10:08:01,900 DEBUG (jgroups-6,Test-NodeA-8936:[]) [GMS] Test-NodeA-8936: installing view [Test-NodeA-8936|34] (2) [Test-NodeA-8936, Test-NodeB-20661]
> 10:08:02,245 TRACE (testng-Test:[]) [JGroupsTransport] Test-NodeB-20661 sending request 140 to Test-NodeC-25266: CacheTopologyControlCommand{cache=org.infinispan.CONFIG, type=LEAVE, sender=Test-NodeB-20661, joinInfo=null, topologyId=0, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, phase=null, actualMembers=null, throwable=null, viewId=33}
> 10:12:02,247 DEBUG (testng-Test:[]) [LocalTopologyManagerImpl] Error sending the leave request for cache org.infinispan.CONFIG to coordinator
> org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 140 from 7d5d965d-eacc-fd47-2ab8-5cec95f4c10b
> {noformat}
> Sometimes the new coordinator gets stuck in a merge loop, keeps trying to re-include the stopped node and fails:
> {noformat}
> 10:06:21,484 DEBUG (jgroups-24,Test-NodeC-57941:[]) [GMS] Test-NodeC-57941: installing view [Test-NodeD-37696|8] (3) [Test-NodeD-37696, Test-NodeB-39137, Test-NodeC-57941]
> 10:06:21,486 DEBUG (jgroups-25,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: installing view [Test-NodeD-37696|8] (3) [Test-NodeD-37696, Test-NodeB-39137, Test-NodeC-57941]
> 10:06:22,070 TRACE (testng-Test:[]) [GMS] Test-NodeD-37696: sending LEAVE request to Test-NodeB-39137
> 10:06:22,078 TRACE (jgroups-10,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: mcasting view [Test-NodeB-39137|9] (2) [Test-NodeB-39137, Test-NodeC-57941]
> 10:06:22,080 DEBUG (jgroups-10,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: installing view [Test-NodeB-39137|9] (2) [Test-NodeB-39137, Test-NodeC-57941]
> 10:06:30,887 DEBUG (jgroups-24,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: I will be the merge leader. Starting the merge task. Views: {Test-NodeB-39137=[Test-NodeB-39137|9] (2) [Test-NodeB-39137, Test-NodeC-57941], Test-NodeC-57941=[Test-NodeD-37696|8] (3) [Test-NodeD-37696, Test-NodeB-39137, Test-NodeC-57941]}
> 10:06:30,888 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [STABLE] suspending message garbage collection
> 10:06:30,888 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [STABLE] Test-NodeB-39137: resume task started, max_suspend_time=220000
> 10:06:30,888 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge task Test-NodeB-39137::7 started with 2 participants
> 10:06:30,888 TRACE (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: sending MERGE_REQ to [Test-NodeD-37696, Test-NodeB-39137]
> 10:06:36,041 TRACE (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: collected 1 merge response(s) in 5153 ms
> 10:06:36,041 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:06:36,043 TRACE (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: consolidated view=MergeView::[Test-NodeB-39137|10] (2) [Test-NodeB-39137, Test-NodeC-57941], 1 subgroups: [Test-NodeB-39137|9] (2) [Test-NodeB-39137, Test-NodeC-57941]
> consolidated digest=Test-NodeB-39137: [15 (16)], Test-NodeC-57941: [5 (5)]
> 10:06:36,043 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: installing merge view [Test-NodeB-39137|10] (2 members) in 1 coords
> 10:06:36,043 DEBUG (MergeTask-108,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge Test-NodeB-39137::7 took 5155 ms
> 10:06:36,043 TRACE (jgroups-24,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: mcasting view MergeView::[Test-NodeB-39137|10] (2) [Test-NodeB-39137, Test-NodeC-57941], 1 subgroups: [Test-NodeB-39137|9] (2) [Test-NodeB-39137, Test-NodeC-57941]
> 10:06:49,877 DEBUG (MergeTask-229,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:07:03,755 DEBUG (MergeTask-342,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:07:17,703 DEBUG (MergeTask-447,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:07:31,547 DEBUG (MergeTask-568,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:07:45,381 DEBUG (MergeTask-688,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:07:59,196 DEBUG (MergeTask-806,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:08:13,031 DEBUG (MergeTask-932,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:08:26,885 DEBUG (MergeTask-1061,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:08:40,697 DEBUG (MergeTask-1197,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:08:54,510 DEBUG (MergeTask-1330,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:09:08,324 DEBUG (MergeTask-1463,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:09:22,135 DEBUG (MergeTask-1596,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:09:35,953 DEBUG (MergeTask-1729,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:09:49,770 DEBUG (MergeTask-1857,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:10:03,609 DEBUG (MergeTask-1986,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:10:17,420 DEBUG (MergeTask-2117,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:10:31,252 DEBUG (MergeTask-2248,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:10:45,099 DEBUG (MergeTask-2383,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:10:58,916 DEBUG (MergeTask-2516,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> 10:11:12,743 DEBUG (MergeTask-2648,Test-NodeB-39137:[]) [GMS] Test-NodeB-39137: merge leader Test-NodeB-39137 did not get responses from all 2 partition coordinators; missing responses from 1 members, removing them from the merge
> {noformat}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 8 months
[JBoss JIRA] (ISPN-9703) CI should test that distribution build works
by Pedro Zapata Fernandez (Jira)
[ https://issues.jboss.org/browse/ISPN-9703?page=com.atlassian.jira.plugin.... ]
Pedro Zapata Fernandez updated ISPN-9703:
-----------------------------------------
Sprint: Sprint 10.0.0.Alpha1, Sprint 10.0.0.Alpha2, Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35, DataGrid Sprint #36 (was: Sprint 10.0.0.Alpha1, Sprint 10.0.0.Alpha2, Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35)
> CI should test that distribution build works
> --------------------------------------------
>
> Key: ISPN-9703
> URL: https://issues.jboss.org/browse/ISPN-9703
> Project: Infinispan
> Issue Type: Enhancement
> Affects Versions: 9.4.1.Final
> Reporter: Ryan Emerson
> Assignee: Ryan Emerson
> Priority: Major
> Fix For: 10.0.0.Alpha3
>
>
> Currently the CI does not test whether the `-Pdistribution` profile successfully builds or not. As we want master to always be releasable, we should test this on PRs.
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 8 months
[JBoss JIRA] (ISPN-9541) Module initialization is not thread-safe
by Pedro Zapata Fernandez (Jira)
[ https://issues.jboss.org/browse/ISPN-9541?page=com.atlassian.jira.plugin.... ]
Pedro Zapata Fernandez updated ISPN-9541:
-----------------------------------------
Sprint: Sprint 10.0.0.Alpha1, Sprint 10.0.0.Alpha2, Sprint 9.4.0.Final, Sprint 10.0.0.Alpha0, Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35, DataGrid Sprint #36 (was: Sprint 10.0.0.Alpha1, Sprint 10.0.0.Alpha2, Sprint 9.4.0.Final, Sprint 10.0.0.Alpha0, Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35)
> Module initialization is not thread-safe
> ----------------------------------------
>
> Key: ISPN-9541
> URL: https://issues.jboss.org/browse/ISPN-9541
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Server
> Affects Versions: 9.4.0.CR3
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.0.0.CR3
>
>
> In my ISPN-9127 fix I created a {{BasicComponentRegistry}} interface that represents a mostly-read-only collection of components. It has {{replaceComponent()}} method and a {{rewire()}} method for testing purposes, but it turns out modules were also relying on the ability to replace existing components in order to work.
> Replacing global components is normally safe during the {{ModuleLifecycle.cacheManagerStarting()}}, because none of the components are started yet, so when a component starts later we can still start its dependencies first. But because some modules starts some global components, e.g. by calling {{manager.getCache(name)}}, that assumption breaks.
> The {{infinispan-server-event-logger}} module is a bit more sneaky: it doesn't replace a component, instead it replaces the actual implementation of the event logger in the {{EventLogManager}} component. Events that happen before the module's {{cacheManagerStarting()}} or after {{cacheManagerStopping()}} will be silently dropped from the persistent event log.
> I am investigating making the module a factory of factories. Instead of having a monolitic {{cacheManagerStarting()}} method, it could define a set of components that it can create, and a set of components that should be started before the cache manager is "running". We probably need a way to depend on other modules as well, maybe reusing the {{@Inject}} and {{@ComponentName}} annotations.
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 8 months
[JBoss JIRA] (ISPN-7889) BaseDistributionInterceptor.remoteGet may cause concurrency issues
by Pedro Zapata Fernandez (Jira)
[ https://issues.jboss.org/browse/ISPN-7889?page=com.atlassian.jira.plugin.... ]
Pedro Zapata Fernandez updated ISPN-7889:
-----------------------------------------
Sprint: Sprint 10.0.0.Alpha1, Sprint 10.0.0.Alpha2, Sprint 10.0.0.Alpha0, Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35, DataGrid Sprint #36 (was: Sprint 10.0.0.Alpha1, Sprint 10.0.0.Alpha2, Sprint 10.0.0.Alpha0, Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35)
> BaseDistributionInterceptor.remoteGet may cause concurrency issues
> ------------------------------------------------------------------
>
> Key: ISPN-7889
> URL: https://issues.jboss.org/browse/ISPN-7889
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.1.0.Alpha1
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.0.0.Alpha3, 9.4.6.Final
>
>
> {{BaseDistributionInterceptor.remoteGet}} or any call that accesses the context in an async future handler that is called multiple times in parallel may lead to concurrent modifications of the context.
> These calls are usually handled using {{CompletableFuture.allOf()}} or using a CF with counter, but if one of the calls results in exceptional completion of the composed future, the processing continues (e.g. with a retry). The other parallel operation handlers are not stopped, though.
> {{BaseDistributionInterceptor.remoteGet}} shouldn't be called in parallel because it does not even synchronize regular successful invocations.
> A problem like this caused failures in {{GetAllCommandStressTest}}, and the issue was addressed for {{GetAllCommand}} in ISPN-7884.
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 8 months
[JBoss JIRA] (ISPN-9291) BasePartitionHandlingTest.Partition.installMergeView() doesn't compute the merge digest
by Pedro Zapata Fernandez (Jira)
[ https://issues.jboss.org/browse/ISPN-9291?page=com.atlassian.jira.plugin.... ]
Pedro Zapata Fernandez updated ISPN-9291:
-----------------------------------------
Sprint: Sprint 9.4.0.Beta1, Sprint 9.4.0.CR1, Sprint 9.4.0.CR3, Sprint 10.0.0.Alpha1, Sprint 10.0.0.Alpha2, Sprint 9.4.0.Final, Sprint 10.0.0.Alpha0, Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35, DataGrid Sprint #36 (was: Sprint 9.4.0.Beta1, Sprint 9.4.0.CR1, Sprint 9.4.0.CR3, Sprint 10.0.0.Alpha1, Sprint 10.0.0.Alpha2, Sprint 9.4.0.Final, Sprint 10.0.0.Alpha0, Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35)
> BasePartitionHandlingTest.Partition.installMergeView() doesn't compute the merge digest
> ---------------------------------------------------------------------------------------
>
> Key: ISPN-9291
> URL: https://issues.jboss.org/browse/ISPN-9291
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Minor
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4, 9.4.16.Final
>
>
> The partition handling tests use {{BasePartitionHandlingTest.Partition.installMergeView(view1, view2)}} to install the merge view without waiting for {{MERGE3}} to run, making them much faster. Unfortunately, the implementation is incorrect: {{GMS.installView(view)}} only works for regular views, merge views need to be installed with {{GMS.installView(mergeView, digest)}}.
> The result is that the nodes that got isolated from the coordinator request the retransmission of all the {{NAKACK2}} messages (including view updates) since the cluster first started. The isolated nodes cannot install the merge view until they deliver all the older messages (even without knowing whether they're OOB or not). But if {{STABLE}} ran and cleared a range of messages already, the retransmission request cannot be satisfied, so the view updates will never be delivered.
> This is easily reproducible in {{CrashedNodeDuringConflictResolutionTest}} if we add a delay before updating the topology in {{StateConsumerImpl}}. The test installs the merge view manually, but then kills NodeC and expects the cluster to install the new view automatically. NodeD can't install the new view because it's waiting for earlier messages from NodeA:
> {noformat}
> 18:27:13,054 INFO (testng-test:[]) [TestSuiteProgress] Test starting: org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.testPartitionMergePolicy[DIST_SYNC]
> 18:27:13,640 DEBUG (testng-test:[]) [GMS] test-NodeA-39513: installing view MergeView::[test-NodeA-39513|10] (4) [test-NodeA-39513, test-NodeB-9439, test-NodeC-43706, test-NodeD-59078], 2 subgroups: [test-NodeA-39513|8] (2) [test-NodeA-39513, test-NodeB-9439], [test-NodeC-43706|9] (2) [test-NodeC-43706, test-NodeD-59078]
> 18:27:13,674 DEBUG (testng-test:[]) [GMS] test-NodeD-59078: installing view MergeView::[test-NodeA-39513|10] (4) [test-NodeA-39513, test-NodeB-9439, test-NodeC-43706, test-NodeD-59078], 2 subgroups: [test-NodeA-39513|8] (2) [test-NodeA-39513, test-NodeB-9439], [test-NodeC-43706|9] (2) [test-NodeC-43706, test-NodeD-59078]
> 18:27:13,828 TRACE (jgroups-7,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((1): {50}) to test-NodeA-39513
> 18:27:13,966 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((49): {1-49}) to test-NodeA-39513
> 18:27:14,067 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:14,504 DEBUG (testng-test:[]) [DefaultCacheManager] Stopping cache manager ISPN on test-NodeC-43706
> 18:27:18,642 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: joiners=[], suspected=[test-NodeC-43706], leaving=[], new view: [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,643 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: mcasting view [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,646 DEBUG (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: installing view [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,652 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [TCP_NIO2] test-NodeA-39513: sending msg to null, src=test-NodeA-39513, headers are GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=63], TP: [cluster_name=ISPN]
> 18:27:18,656 TRACE (jgroups-20,test-NodeA-39513:[]) [TCP_NIO2] test-NodeA-39513: received [dst: test-NodeA-39513, src: test-NodeB-9439 (3 headers), size=0 bytes, flags=OOB|INTERNAL], headers are GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=100, TP: [cluster_name=ISPN]
> 18:27:20,554 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:20,653 WARN (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: failed to collect all ACKs (expected=2) for view [test-NodeA-39513|11] after 2000ms, missing 1 ACKs from (1) test-NodeD-59078
> 18:27:20,656 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:20,756 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> ...
> 18:28:14,412 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:28:14,513 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:28:14,589 ERROR (testng-test:[]) [TestSuiteProgress] Test failed: org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.testPartitionMergePolicy[DIST_SYNC]
> java.lang.RuntimeException: Cache ___defaultcache timed out waiting for rebalancing to complete on node test-NodeA-39513, current topology is CacheTopology{id=21, phase=CONFLICT_RESOLUTION, rebalanceId=7, currentCH=PartitionerConsistentHash:DefaultConsistentHash{ns=256, owners = (3)[test-NodeD-59078: 256+0, test-NodeA-39513: 0+256, test-NodeB-9439: 0+256]}, pendingCH=null, unionCH=null, actualMembers=[test-NodeD-59078, test-NodeA-39513, test-NodeB-9439], persistentUUIDs=[828108c4-4251-49fc-9481-ff6392bea9fb, 1d4b6f07-b71b-41a1-adfb-abbe68944a9f, 3a1ece05-c282-433e-9eb5-7b3e0f1932aa]}. rebalanceInProgress=true, currentChIsBalanced=true
> at org.infinispan.test.TestingUtil.waitForNoRebalance(TestingUtil.java:392) ~[test-classes/:?]
> at org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.performMerge(CrashedNodeDuringConflictResolutionTest.java:113) ~[test-classes/:?]
> at org.infinispan.conflict.impl.BaseMergePolicyTest.testPartitionMergePolicy(BaseMergePolicyTest.java:137) ~[test-classes/:?]
> {noformat}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 8 months
[JBoss JIRA] (ISPN-9847) Extend configuration to allow inline JGroups configuration and inheritance
by Pedro Zapata Fernandez (Jira)
[ https://issues.jboss.org/browse/ISPN-9847?page=com.atlassian.jira.plugin.... ]
Pedro Zapata Fernandez updated ISPN-9847:
-----------------------------------------
Sprint: Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35, DataGrid Sprint #36 (was: Sprint 10.0.0.Beta1, DataGrid Sprint #31, DataGrid Sprint #32, DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35)
> Extend configuration to allow inline JGroups configuration and inheritance
> --------------------------------------------------------------------------
>
> Key: ISPN-9847
> URL: https://issues.jboss.org/browse/ISPN-9847
> Project: Infinispan
> Issue Type: Enhancement
> Components: Configuration, Core
> Affects Versions: 10.0.0.Alpha2
> Reporter: Tristan Tarrant
> Assignee: Tristan Tarrant
> Priority: Major
> Fix For: 10.0.0.Beta1, 10.0.0.Final
>
>
> Allowing inline JGroups configurations to increase usability.
> Also support the following extra features:
> * stack inheritance with protocol replacement/override
> * easier xsite configuration
> {code:xml}
> <jgroups transport="org.infinispan.remoting.transport.jgroups.JGroupsTransport">
> <!-- Load external JGroups stacks -->
> <stack-file name="udp" path="stacks/udp.xml"/>
> <stack-file name="tcp" path="stacks/tcp.xml"/>
> <!-- Inline definition -->
> <stack name="mping">
> <TCP bind_port="7800" port_range="30" recv_buf_size="20000000" send_buf_size="640000"
> sock_conn_timeout="300" bundler_type="no-bundler"
> thread_pool.min_threads="0" thread_pool.max_threads="25" thread_pool.keep_alive_time="5000"/>
> <MPING bind_addr="127.0.0.1" break_on_coord_rsp="true"
> mcast_addr="${jgroups.mping.mcast_addr:228.2.4.6}"
> mcast_port="${jgroups.mping.mcast_port:43366}"
> ip_ttl="${jgroups.udp.ip_ttl:2}"/>
> <MERGE3/>
> <FD_SOCK/>
> <FD_ALL timeout="3000"
> interval="1000"
> timeout_check_interval="1000"
> />
> <VERIFY_SUSPECT timeout="1000"/>
> <pbcast.NAKACK2
> use_mcast_xmit="false"
> xmit_interval="100"
> xmit_table_num_rows="50"
> xmit_table_msgs_per_row="1024"
> xmit_table_max_compaction_time="30000"/>
> <UNICAST3
> xmit_interval="100"
> xmit_table_num_rows="50"
> xmit_table_msgs_per_row="1024"
> xmit_table_max_compaction_time="30000"
> />
> <RSVP />
> <pbcast.STABLE stability_delay="200"
> desired_avg_gossip="2000"
> max_bytes="1M"
> />
> <pbcast.GMS print_local_addr="false"
> join_timeout="${jgroups.join_timeout:2000}"/>
> <MFC max_credits="2M" min_threshold="0.40"/>
> <FRAG3/>
> </stack>
> <!-- Use the "tcp" stack but override some protocol attributes -->
> <stack name="mytcp" extends="tcp">
> <MERGE3 max_interval="20000"/>
> </stack>
> <!-- Use the "tcp" stack but replace the discovery -->
> <stack name="tcpgossip" extends="tcp">
> <TCPGOSSIP initial_hosts="${jgroups.tunnel.gossip_router_hosts:localhost[12001]}" stack.combine="replace:TCP_NIO2"/>
> </stack>
> <!-- Add a relay configuration using a previously declared stack to talk to the remote site -->
> <stack name="xsite" extends="udp">
> <relay site="LON">
> <remote-site name="NYC" stack="tcpgossip"/>
> </relay>
> </stack>
> </jgroups>
> {code}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 8 months
[JBoss JIRA] (ISPN-10435) Support for serving static content
by Pedro Zapata Fernandez (Jira)
[ https://issues.jboss.org/browse/ISPN-10435?page=com.atlassian.jira.plugin... ]
Pedro Zapata Fernandez updated ISPN-10435:
------------------------------------------
Sprint: DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35, DataGrid Sprint #36 (was: DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35)
> Support for serving static content
> ----------------------------------
>
> Key: ISPN-10435
> URL: https://issues.jboss.org/browse/ISPN-10435
> Project: Infinispan
> Issue Type: Feature Request
> Components: REST
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Priority: Major
> Labels: console-ng
> Fix For: 10.0.0.CR2
>
>
> The REST endpoint should be able to serve the console from a static folder, respecting usual http headers sent by browsers to facilitate caching such as if-modified-since
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 8 months
[JBoss JIRA] (ISPN-10571) CORS usability issues
by Pedro Zapata Fernandez (Jira)
[ https://issues.jboss.org/browse/ISPN-10571?page=com.atlassian.jira.plugin... ]
Pedro Zapata Fernandez updated ISPN-10571:
------------------------------------------
Sprint: DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35, DataGrid Sprint #36 (was: DataGrid Sprint #33, DataGrid Sprint #34, DataGrid Sprint #35)
> CORS usability issues
> ---------------------
>
> Key: ISPN-10571
> URL: https://issues.jboss.org/browse/ISPN-10571
> Project: Infinispan
> Issue Type: Bug
> Components: REST
> Affects Versions: 10.0.0.CR1
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Priority: Major
> Labels: console-ng
> Fix For: 10.0.0.CR2
>
>
> * OPTIONS is currently only handled by CORS for the pre-flight. Regular OPTIONS requests throw an error.
> * By default, localhost is not allowed
> * Specifying origin with '*' (all origins allowed) is not working
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 8 months
[JBoss JIRA] (ISPN-10580) Cluster resource
by Pedro Zapata Fernandez (Jira)
[ https://issues.jboss.org/browse/ISPN-10580?page=com.atlassian.jira.plugin... ]
Pedro Zapata Fernandez updated ISPN-10580:
------------------------------------------
Sprint: DataGrid Sprint #34, DataGrid Sprint #35, DataGrid Sprint #36 (was: DataGrid Sprint #34, DataGrid Sprint #35)
> Cluster resource
> ----------------
>
> Key: ISPN-10580
> URL: https://issues.jboss.org/browse/ISPN-10580
> Project: Infinispan
> Issue Type: Feature Request
> Components: REST
> Reporter: Gustavo Fernandes
> Assignee: Katia Aresti
> Priority: Major
> Labels: console-ng
>
> A resource to expose cluster-wide info and operations:
> * Shutdown the whole cluster
> * Health of all cache managers from the server
> * List of nodes
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 8 months
[JBoss JIRA] (ISPN-10552) Ignored caches should be per server
by Pedro Zapata Fernandez (Jira)
[ https://issues.jboss.org/browse/ISPN-10552?page=com.atlassian.jira.plugin... ]
Pedro Zapata Fernandez updated ISPN-10552:
------------------------------------------
Sprint: DataGrid Sprint #34, DataGrid Sprint #35, DataGrid Sprint #36 (was: DataGrid Sprint #34, DataGrid Sprint #35)
> Ignored caches should be per server
> -----------------------------------
>
> Key: ISPN-10552
> URL: https://issues.jboss.org/browse/ISPN-10552
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Priority: Major
> Labels: console-ng, rest
> Fix For: 10.0.0.CR3
>
>
> The new server has ignored caches as a configuration for each endopoint, it should not be a configuration but a runtime persisted state that affects all the cluster
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 8 months