[JBoss JIRA] (ISPN-6239) InitialClusterSizeTest.testInitialClusterSizeFail random failures
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-6239?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-6239:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/4030
> InitialClusterSizeTest.testInitialClusterSizeFail random failures
> -----------------------------------------------------------------
>
> Key: ISPN-6239
> URL: https://issues.jboss.org/browse/ISPN-6239
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 8.2.0.Beta2
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_failure
> Fix For: 8.2.0.CR1
>
>
> The test starts 3 nodes concurrently, but configures Infinispan to wait for a cluster of 4 nodes, and expects the nodes to fail to start within {{initialClusterTimeout}} + 1 second.
> However, because of a bug in {{TEST_PING}}, each of the first 2 nodes sees the other as the coordinator and sends it a {{JOIN}} request, and it takes 3 seconds to recover and start the cluster properly.
> The bug in {{TEST_PING}} is actually a hack introduced for {{ISPN-5106}}. The problem was that the first node (A) to start would install a view with itself as the single node, but the second node to start (B) would start immediately, and the discovery request from B would reach B's {{TEST_PING}} before it saw the view. That way, B could choose itself as the coordinator based on the order of A's and B's UUIDs, and the cluster would start as 2 partitions. Since most of our tests actually remove {{MERGE3}} from the protocol stack, the partitions would never merge and the test would fail with a timeout.
> I fixed this in {{TEST_PING}} by assuming that the sender of the first discovery response is the coordinator when there is a single response. This worked because all but a few tests start their managers sequentially; however, it sometimes introduces this 3-second delay when nodes start in parallel.
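A minimal sketch of the two selection rules described above may make the failure mode easier to follow. It does not use JGroups or the real {{TEST_PING}} code: the class, the {{Response}} type, and both method names are hypothetical, and it assumes that a node which has not yet seen its own view does not claim to be coordinator in its discovery response.

```java
import java.util.Collections;
import java.util.List;
import java.util.UUID;

// Hypothetical model of the discovery heuristics; not the TEST_PING implementation.
final class CoordinatorChoiceSketch {

    /** A discovery response: who sent it and whether the sender already claims to be coordinator. */
    static final class Response {
        final UUID sender;
        final boolean claimsCoordinator;

        Response(UUID sender, boolean claimsCoordinator) {
            this.sender = sender;
            this.claimsCoordinator = claimsCoordinator;
        }
    }

    /** Plain rule: trust an explicit coordinator claim, otherwise the lowest UUID wins. */
    static UUID chooseByUuidOrder(UUID self, List<Response> responses) {
        UUID best = self;
        for (Response r : responses) {
            if (r.claimsCoordinator) {
                return r.sender;
            }
            if (r.sender.compareTo(best) < 0) {
                best = r.sender;
            }
        }
        return best;
    }

    /** Workaround rule: a single response is assumed to come from the coordinator. */
    static UUID chooseWithSingleResponseRule(UUID self, List<Response> responses) {
        if (responses.size() == 1) {
            return responses.get(0).sender;
        }
        return chooseByUuidOrder(self, responses);
    }

    public static void main(String[] args) {
        UUID a = new UUID(0, 2);
        UUID b = new UUID(0, 1);

        // Sequential start, A has not seen its view yet: by UUID order B picks itself.
        UUID plain = chooseByUuidOrder(b, Collections.singletonList(new Response(a, false)));
        System.out.println("UUID order, B picks itself: " + plain.equals(b));

        // Parallel start with the workaround: each node picks the other one.
        UUID seenByA = chooseWithSingleResponseRule(a, Collections.singletonList(new Response(b, false)));
        UUID seenByB = chooseWithSingleResponseRule(b, Collections.singletonList(new Response(a, false)));
        System.out.println("A picks B: " + seenByA.equals(b));
        System.out.println("B picks A: " + seenByB.equals(a));
    }
}
```

In this model the plain rule reproduces the 2-partition start when B's UUID sorts lower than A's, while the single-response rule makes two nodes that start in parallel each pick the other, which corresponds to the crossed {{JOIN}} requests.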
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
[JBoss JIRA] (ISPN-6247) Speed up DistWriteSkewTest and its subclasses
by Dan Berindei (JIRA)
Dan Berindei created ISPN-6247:
----------------------------------
Summary: Speed up DistWriteSkewTest and its subclasses
Key: ISPN-6247
URL: https://issues.jboss.org/browse/ISPN-6247
Project: Infinispan
Issue Type: Task
Components: Test Suite - Core
Affects Versions: 8.2.0.Beta2
Reporter: Dan Berindei
Assignee: Dan Berindei
Fix For: 8.2.0.CR1
{{DistWriteSkewTest}} is annotated with {{@CleanupAfterMethod}}, so the cluster is recreated for each one of the test methods. Since there are a lot of test methods inherited from {{AbstractClusteredWriteSkewTest}}, the repeated cluster startup and shutdown add a lot of overhead.
Since the test doesn't start or stop nodes, it should be possible to start the cluster only once for the whole class, possibly using different key names if leftover keys turn out to be a problem.
The annotation is also used for {{ReplWriteSkewTest}}, {{DistL1WriteSkewTest}}, and {{DistTotalOrderWriteSkewTest}}. It should be possible to reuse the cluster for all of them.
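As a rough illustration of the "different key names" idea, the sketch below derives each key from the name of the test method that uses it. It is not based on the Infinispan test framework: a plain {{ConcurrentHashMap}} stands in for a cache that lives for the whole test class, and {{key}} and {{startsClean}} are hypothetical helper names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper; the map stands in for a cache shared by all test methods.
final class PerMethodKeysSketch {

    /** Prefix the key with the test method name so leftovers from other methods cannot collide. */
    static String key(String testMethod, String baseKey) {
        return testMethod + "-" + baseKey;
    }

    /** Returns true if no earlier test method left a value behind under this key. */
    static boolean startsClean(Map<String, String> cache, String key) {
        // putIfAbsent returns null only when the key had no previous mapping.
        return cache.putIfAbsent(key, "v1") == null;
    }

    public static void main(String[] args) {
        Map<String, String> sharedCache = new ConcurrentHashMap<>();

        // Same literal key in two methods: the second one sees the leftover entry.
        System.out.println(startsClean(sharedCache, "k"));
        System.out.println(startsClean(sharedCache, "k"));

        // Per-method keys: each method starts from an empty slot.
        System.out.println(startsClean(sharedCache, key("testA", "k")));
        System.out.println(startsClean(sharedCache, key("testB", "k")));
    }
}
```

Whether this is enough depends on what the write skew tests assert; tests that check cache size or iterate over all entries would still see leftovers and might need an explicit clear instead.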