[infinispan-issues] [JBoss JIRA] (ISPN-1995) Uneven request balancing after node restore

Friday, 4 May 2012

    [
https://issues.jboss.org/browse/ISPN-1995?page=com.atlassian.jira.plugin....
] 

RH Bugzilla Integration commented on ISPN-1995:
-----------------------------------------------

Michal Linhard <mlinhard(a)redhat.com> made a comment on [bug
809631|https://bugzilla.redhat.com/show_bug.cgi?id=809631]

OK. I've rerun the test in hyperion and reproduced the issue again:
http://www.qa.jboss.com/~mlinhard/hyperion/run85-resi-dist-16/report/stat...

this time with some log analysis:
http://www.qa.jboss.com/~mlinhard/hyperion/run85-resi-dist-16/report/loga...

Config: 16 nodes, DIST mode, numOwners 3, crashing 2 nodes

The steps of the resilience test are as follows:

1. Start complete cluster node0001 - node0016, wait till it forms (View 6 around
02:44:32,594)
2. Wait 5 min
3. Kill node0002, node0003, wait till survivor cluster forms (View 7 around 02:50:54,200)
4. Wait 5 min
5. Restore node0002, node0003, wait till complete cluster forms again (View 9 around
02:56:10,164)

These views are created:
http://www.qa.jboss.com/~mlinhard/hyperion/run85-resi-dist-16/report/loga...

There are two anomalies during the test:

When the nodes are killed, 2 clients (333 and 459) remove them from the topology, add them
back, and remove them again within a few seconds. (That's why we're seeing 502 node adds and
removes in the client logs even though there are only 500 clients.)

After the two nodes were restored, the clients start to obtain the topology information,
first about node0003 (starting around 02:55:57,302) and then about node0002 (starting
around 02:56:03,581).

However, in 165 cases the clients don't obtain the information about node0002 being
added, which is 33% of the 500 clients. This corresponds to the load (throughput) of
node0002 being circa 33% lower than that of the other nodes (in the throughput chart).
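
As a rough sanity check of those numbers, here is a minimal sketch in Python. It assumes
(this is not stated in the report) that every client spreads its requests evenly over the
nodes it currently knows about, which is a simplification of how requests are really
routed in DIST mode. The function name and parameters are hypothetical and exist only for
this illustration; the inputs 500, 165 and 16 are the figures quoted above.

```python
def relative_load(total_clients, clients_missing_node, cluster_size):
    """Load on the restored node relative to a fully known node.

    Assumption: each client splits its requests evenly over the nodes in
    its own topology view. Clients that never learned about the restored
    node split their requests over the remaining cluster_size - 1 nodes.
    """
    informed = total_clients - clients_missing_node
    # Only informed clients send anything to the restored node.
    load_restored = informed / cluster_size
    # Every other node is served by informed and uninformed clients alike.
    load_other = informed / cluster_size + clients_missing_node / (cluster_size - 1)
    return load_restored / load_other


# Figures from the report: 500 clients, 165 without node0002, 16 nodes.
missing_share = 165 / 500
ratio = relative_load(500, 165, 16)
print(f"clients missing node0002: {missing_share:.0%}")
print(f"node0002 load relative to other nodes: {ratio:.1%}")
```

Under that assumption node0002 ends up with roughly two thirds of the load of the other
nodes, which is in the same range as the drop visible in the throughput chart.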

...
 Uneven request balancing after node restore
 -------------------------------------------

                 Key: ISPN-1995
                 URL: https://issues.jboss.org/browse/ISPN-1995
             Project: Infinispan
          Issue Type: Bug
          Components: Cache Server
    Affects Versions: 5.1.4.CR1
            Reporter: Tristan Tarrant
            Assignee: Galder Zamarreño
             Fix For: 5.1.x, 5.2.0.ALPHA1, 5.2.0.FINAL

 After a node crashes and rejoins the cluster, it does not receive client load at the same
level as the other nodes.
 This issue does not affect data integrity and distribution in the cluster. 
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.jboss.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
