[
https://issues.jboss.org/browse/ISPN-2750?page=com.atlassian.jira.plugin....
]
Michal Linhard commented on ISPN-2750:
--------------------------------------
I've traced resilience test runs 8-7-8 and 16-15-16 with 10 clients, but the charts
don't show the same pattern as in the 32-31-32 test,
or at least I can't be sure with such small absolute values (under 20 ops/sec):
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0039-resi-0...
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0040-resi-1...
The original chart that showed the problem:
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0036-resi-3...
had higher throughput values (around 250 ops/sec per node).
In all cases I can see topology info updated in each client:
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0039-resi-0...
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0040-resi-1...
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0036-resi-3...
The categories "INFO New topology received Full Before" and "INFO New
topology received Full After" each have an entry for every client thread.
In the 32-31-32 run, where the problem manifests, all threads received the same topology:
id=62 before the crash and id=67 after the rejoin.
Hmm, just checked the entry distribution in the failing test:
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0036-resi-3...
vs the traced runs:
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0039-resi-0...
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0040-resi-1...
So it seems the hotrod servers really are following the cache topology distribution,
and it's the cache topology itself that's weird.
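
One way to check this claim numerically is to compare each node's share of requests with its share of owned entries. The following is a minimal Python sketch, assuming the per-node counts have been read off the charts by hand; the function name, node names and numbers are hypothetical and are not taken from the runs above.

```python
def share_gaps(entries_per_node, requests_per_node):
    """Return, per node, the request share minus the entry share (as fractions)."""
    total_entries = sum(entries_per_node.values())
    total_requests = sum(requests_per_node.values())
    gaps = {}
    for node, entries in entries_per_node.items():
        entry_share = entries / total_entries
        # A node missing from the request counts is treated as receiving none.
        request_share = requests_per_node.get(node, 0) / total_requests
        gaps[node] = request_share - entry_share
    return gaps


# Hypothetical counts, for illustration only.
entries = {"node01": 3000, "node02": 1000}
requests = {"node01": 750, "node02": 250}
gaps = share_gaps(entries, requests)
```

Gaps close to zero on every node would mean the requests follow the entry distribution, which would point at the cache topology rather than at the hotrod request routing.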
the trace logs for 8-7-8 and 16-15-16 runs can be found here:
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0039-resi-0...
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0039-resi-0...
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0040-resi-1...
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0040-resi-1...
Uneven request balancing via hotrod
-----------------------------------
Key: ISPN-2750
URL: https://issues.jboss.org/browse/ISPN-2750
Project: Infinispan
Issue Type: Bug
Components: Server
Affects Versions: 5.2.0.CR2
Reporter: Michal Linhard
Assignee: Dan Berindei
Fix For: 5.2.0.Final
The load sent to the servers in the cluster isn't balanced.
Observed in 32-node resilience tests:
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0035-resi-3...
http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0036-resi-3...
This differs from ISPN-2632 in that the load is unbalanced from the beginning of the
test.
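
To put a number on "isn't balanced", the per-node throughput can be summarised with a max/min ratio and a coefficient of variation. This is only a sketch in Python, assuming per-node ops/sec values are available as a plain mapping; the function name and the sample values are hypothetical and do not come from the test runs.

```python
from statistics import mean, pstdev


def imbalance(ops_per_node):
    """Summarise how unevenly load is spread across nodes."""
    values = list(ops_per_node.values())
    lowest = min(values)
    return {
        # Ratio of the busiest node to the idlest one; infinite if a node got nothing.
        "max_over_min": max(values) / lowest if lowest > 0 else float("inf"),
        # Population standard deviation relative to the mean throughput.
        "coeff_of_variation": pstdev(values) / mean(values),
    }


# Hypothetical throughput values in ops/sec, for illustration only.
sample = {"node01": 300.0, "node02": 200.0}
result = imbalance(sample)
```

A perfectly balanced cluster would give a ratio of 1 and a coefficient of variation of 0; larger values indicate a more uneven spread.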