[
https://issues.jboss.org/browse/ISPN-1995?page=com.atlassian.jira.plugin....
]
RH Bugzilla Integration commented on ISPN-1995:
-----------------------------------------------
Michal Linhard <mlinhard(a)redhat.com> made a comment on [bug
809631|https://bugzilla.redhat.com/show_bug.cgi?id=809631]
OK. I've rerun the test in hyperion and reproduced the problem again:
http://www.qa.jboss.com/~mlinhard/hyperion/run85-resi-dist-16/report/stat...
this time with some log analysis:
http://www.qa.jboss.com/~mlinhard/hyperion/run85-resi-dist-16/report/loga...
Config: 16 nodes, DIST mode, numOwners 3, crashing 2 nodes
The steps of the resilience test are as follows:
1. Start complete cluster node0001 - node0016, wait till it forms (View 6 around
02:44:32,594)
2. Wait 5 min
3. Kill node0002, node0003, wait till survivor cluster forms (View 7 around 02:50:54,200)
4. Wait 5 min
5. Restore node0002, node0003, wait till complete cluster forms again (View 9 around
02:56:10,164)
These views are created:
http://www.qa.jboss.com/~mlinhard/hyperion/run85-resi-dist-16/report/loga...
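As a rough cross-check of the timeline, the view timestamps quoted in the steps above can be subtracted to see how long each phase took. This is only a sketch: the timestamp format (HH:MM:SS,mmm, all on the same day) is assumed from the values quoted here, and the names view_times and elapsed_seconds are made up for illustration.

```python
from datetime import datetime

# Approximate view installation times quoted in the test steps.
view_times = {
    "View 6": "02:44:32,594",
    "View 7": "02:50:54,200",
    "View 9": "02:56:10,164",
}

def elapsed_seconds(start, end):
    """Seconds between two HH:MM:SS,mmm timestamps on the same day."""
    fmt = "%H:%M:%S,%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()

# Complete cluster formed -> survivor cluster formed after the kill.
kill_phase = elapsed_seconds(view_times["View 6"], view_times["View 7"])
# Survivor cluster formed -> complete cluster formed again after the restore.
restore_phase = elapsed_seconds(view_times["View 7"], view_times["View 9"])

print(f"View 6 -> View 7: {kill_phase:.3f} s")
print(f"View 7 -> View 9: {restore_phase:.3f} s")
```

Both intervals come out somewhat longer than the 5 min waits, as expected when the time needed to form the new view is included.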
There are two anomalies during the test:
Two clients (333 and 459), when the nodes are killed, remove them from the topology, add them
back and remove them again within a few seconds. (That's why we're seeing 502 node adds and
removes in the client logs even though there are only 500 clients.)
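One way to spot such clients in a log analysis is to look for a remove, add, remove sequence for the same node within a short window. The sketch below assumes the client logs have already been parsed into (client_id, seconds, action, node) tuples. That tuple layout, the 10 second window and the sample events are all hypothetical and not taken from the actual logs.

```python
from collections import defaultdict

def find_flapping(events, window=10.0):
    """Return (client, node) pairs showing remove, add, remove within `window` seconds."""
    # Group topology events per (client, node), ordered by time.
    history = defaultdict(list)
    for client, ts, action, node in sorted(events, key=lambda e: e[1]):
        history[(client, node)].append((ts, action))

    flapping = set()
    for key, seq in history.items():
        # Slide over every run of three consecutive events.
        for i in range(len(seq) - 2):
            actions = [action for _, action in seq[i:i + 3]]
            duration = seq[i + 2][0] - seq[i][0]
            if actions == ["remove", "add", "remove"] and duration <= window:
                flapping.add(key)
    return flapping

# Made-up sample: client 333 flaps on node0002, client 1 removes it only once.
sample = [
    (333, 0.0, "remove", "node0002"),
    (333, 1.5, "add", "node0002"),
    (333, 3.0, "remove", "node0002"),
    (1, 0.2, "remove", "node0002"),
]
print(find_flapping(sample))
```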
After the two nodes were restored, the clients started to obtain the topology information,
first about node0003 (starting around 02:55:57,302) and then about node0002 (starting
around 02:56:03,581).
However, in 165 cases the clients did not obtain the information about node0002 being
added. That is 33% of the clients,
which corresponds to the load (throughput) of node0002 being approx. 33% lower than that of
the other nodes (in the throughput chart).
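The link between the 165 uninformed clients and the throughput drop can be checked with a back-of-the-envelope model. It assumes, purely for illustration, that every client generates the same load and spreads it evenly over the servers in its own topology view. The real client routing in DIST mode is key-based, so this is only an approximation. The figures 500, 165 and 16 come from the report above; the variable names are made up.

```python
clients_total = 500
clients_missing_node2 = 165   # clients that never learned about node0002
nodes_total = 16

clients_full_view = clients_total - clients_missing_node2

# Share of the clients that do not know about node0002.
missing_fraction = clients_missing_node2 / clients_total

# Load per node, in units of "one client's total traffic".
# node0002 is only reached by clients with the full 16-node view.
load_node2 = clients_full_view / nodes_total
# Every other node additionally gets a 1/15 share from the uninformed clients.
load_other = load_node2 + clients_missing_node2 / (nodes_total - 1)

relative_load = load_node2 / load_other
print(f"clients missing node0002: {missing_fraction:.0%}")
print(f"node0002 load relative to other nodes: {relative_load:.1%}")
```

Under these assumptions node0002 ends up with roughly two thirds of the load of any other node, which is in line with the drop seen in the throughput chart.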
Uneven request balancing after node restore
-------------------------------------------
Key: ISPN-1995
URL:
https://issues.jboss.org/browse/ISPN-1995
Project: Infinispan
Issue Type: Bug
Components: Cache Server
Affects Versions: 5.1.4.CR1
Reporter: Tristan Tarrant
Assignee: Galder Zamarreño
Fix For: 5.1.x, 5.2.0.ALPHA1, 5.2.0.FINAL
After a node crashes and rejoins the cluster, it does not receive client load at the same
level as the other nodes.
This issue does not affect data integrity and distribution in the cluster.