[ https://issues.jboss.org/browse/MODCLUSTER-322?page=com.atlassian.jira.pl... ]
Paul Ferraro commented on MODCLUSTER-322:
-----------------------------------------
According to the spec, OperatingSystemMXBean.getSystemLoadAverage():
"Returns the system load average for the last minute. The system load average is the sum of
the number of runnable entities queued to the available processors and the number of
runnable entities running on the available processors averaged over a period of
time."
So, it would appear that mod_cluster's AverageSystemLoadMetric needs to divide this
number by the value returned by OperatingSystemMXBean.getAvailableProcessors().
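A minimal sketch of the scaling suggested here, not mod_cluster's actual fix. The class
name NormalizedSystemLoad and the decision to cap the result at 1.0 are assumptions made
for illustration; only getSystemLoadAverage() and getAvailableProcessors() come from the
JDK.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper for illustration; not part of mod_cluster.
final class NormalizedSystemLoad {

    // Scales a raw load average toward the 0-1 range by dividing by the processor count.
    // A negative input means the platform cannot report a load average; it is passed through.
    static double normalize(double loadAverage, int processors) {
        if (loadAverage < 0) {
            return loadAverage;
        }
        // Assumption: more than one runnable entity per processor counts as fully loaded.
        return Math.min(1.0, loadAverage / processors);
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double raw = os.getSystemLoadAverage();
        int processors = os.getAvailableProcessors();
        System.out.println("raw=" + raw + " processors=" + processors
                + " normalized=" + normalize(raw, processors));
    }
}
```

With the 1.81640625 reading quoted in the issue and, say, four processors, normalize
returns 0.4541015625; with a single processor the result is capped at 1.0.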
Using AverageSystemLoadMetric can improperly cause a Load Factor of 0
---------------------------------------------------------------------
Key: MODCLUSTER-322
URL: https://issues.jboss.org/browse/MODCLUSTER-322
Project: mod_cluster
Issue Type: Bug
Affects Versions: MOD_CLUSTER_1_0_10_GA_CP02, 1.2.1.Final
Environment: *JBoss Enterprise Application Platform 5
*Apache httpd
*mod_cluster 1.0.10.GA_CP02 or 1.2.1
Reporter: Aaron Ogburn
Assignee: Paul Ferraro
Fix For: MOD_CLUSTER_1_0_10_GA_CP03, 1.2.2.Beta1
It looks like AverageSystemLoadMetric is not properly implemented. When using it,
mod_cluster may always report a load factor of 0, thus making the JBoss node unreachable
from Apache. We've tested with a simple web app that checks the underlying MXBean
SystemLoadAverage:
Double.class.cast(server.getAttribute(ObjectName.getInstance(ManagementFactory.OPERATING_SYSTEM_MXBEAN_NAME),
"SystemLoadAverage")).doubleValue()
This value is read in much the same way mod_cluster reads it, and the call works fine
outside of mod_cluster; my test app logged the following value:
12:08:26,519 INFO [STDOUT] From MBeanServer: 1.81640625
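For anyone wanting to repeat the check, here is a self-contained sketch of the same
attribute read. The class name SystemLoadProbe and the use of the platform MBeanServer are
assumptions; the test app above ran inside JBoss and may have obtained its server
differently.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical probe for illustration; not the reporter's test app.
final class SystemLoadProbe {

    // Reads the SystemLoadAverage attribute of the operating system MXBean via JMX.
    static double readSystemLoadAverage() {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            ObjectName name =
                    ObjectName.getInstance(ManagementFactory.OPERATING_SYSTEM_MXBEAN_NAME);
            return Double.class.cast(server.getAttribute(name, "SystemLoadAverage"))
                    .doubleValue();
        } catch (JMException e) {
            throw new IllegalStateException("Could not read SystemLoadAverage", e);
        }
    }

    public static void main(String[] args) {
        // The value is negative on platforms that cannot report a load average.
        System.out.println("From MBeanServer: " + readSystemLoadAverage());
    }
}
```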
So the issue does not appear to lie in the underlying JDK/MXBean call but in how
mod_cluster handles the value it returns. The root cause appears to be that a value above
1 is being returned, whereas mod_cluster seems to expect each metric to report a load in a
0-1 range.
The way the load is determined allows AverageSystemLoadMetric to improperly exceed its
weight. For example, if it were weighted at 2 and another metric were weighted at 1 (say
RequestCountLoadMetric with a capacity of 1000), then AverageSystemLoadMetric should only
be able to account for 67% of the load. But here AverageSystemLoadMetric can outrun its
weight and account for 100% of the load. Say AverageSystemLoadMetric reports the
1.81640625 value seen above while the node handles 100 requests/second, putting
RequestCountLoadMetric at 0.1 load. DynamicLoadBalanceFactorProvider would then calculate
the load factor like so:
int load = (int) Math.round(100 * totalWeightedLoad / totalWeight);
int load = (int) Math.round(100 * (1.81640625 * 2 + 0.1) / 3);
load = 124 (124.4 before rounding)
And that gets capped at 100, so AverageSystemLoadMetric ends up representing essentially
all of the load. But if a weight-2 metric is at its maximum (1.0) and a weight-1 metric is
at 0.1, then their total load should be only (2 * 1.0 + 0.1) / 3 = 70%.
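The arithmetic can be checked with a small sketch. The class and method names here are
hypothetical and this is not DynamicLoadBalanceFactorProvider's code; it only reproduces
the formula quoted above, plus an assumed clamp of each metric to the 0-1 range for
comparison.

```java
// Hypothetical illustration of the weighted-load formula quoted in this issue.
final class LoadFactorSketch {

    // Weighted average of the metric loads as a rounded percentage, without any cap.
    static int rawPercent(double[] loads, int[] weights) {
        double totalWeightedLoad = 0;
        int totalWeight = 0;
        for (int i = 0; i < loads.length; i++) {
            totalWeightedLoad += loads[i] * weights[i];
            totalWeight += weights[i];
        }
        return (int) Math.round(100 * totalWeightedLoad / totalWeight);
    }

    // Same value, capped at 100 as described in the issue.
    static int cappedPercent(double[] loads, int[] weights) {
        return Math.min(100, rawPercent(loads, weights));
    }

    // Assumed fix for comparison: restrict a single metric's load to the 0-1 range.
    static double clamp(double load) {
        return Math.max(0.0, Math.min(1.0, load));
    }

    public static void main(String[] args) {
        int[] weights = {2, 1};
        double[] unscaled = {1.81640625, 0.1};
        double[] clamped = {clamp(1.81640625), 0.1};
        System.out.println(rawPercent(unscaled, weights));    // 124
        System.out.println(cappedPercent(unscaled, weights)); // 100
        System.out.println(cappedPercent(clamped, weights));  // 70
    }
}
```

Clamping the metric before weighting keeps its contribution within its 2/3 share, which
is the 70% figure expected above.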
So is mod_cluster assuming that the SystemLoadAverage will always be between 0 and 1?
Does it look like mod_cluster is not properly scaling this metric? Do we know a definite
maximum to expect from OperatingSystemMXBean.getSystemLoadAverage() so that this metric
can be scaled in line with the other percentile-based ones? Or should a user-definable
maximum capacity be added to this metric, as it is for others?