[mod_cluster-issues] [JBoss JIRA] (MODCLUSTER-322) Using AverageSystemLoadMetric can improperly cause a Load Factor of 0

Tue Jul 17 11:57:06 EDT 2012

    [ https://issues.jboss.org/browse/MODCLUSTER-322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706406#comment-12706406 ] 

Paul Ferraro commented on MODCLUSTER-322:
-----------------------------------------

According to the spec, OperatingSystemMXBean.getSystemLoadAverage() returns:
"Returns the system load average for the last minute. The system load average is the sum of the number of runnable entities queued to the available processors and the number of runnable entities running on the available processors averaged over a period of time."

So, it would appear that mod_cluster's AverageSystemLoadMetric needs to divide this number by the value returned by OperatingSystemMXBean.getAvailableProcessors().
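
A minimal sketch of that normalization, using only the standard java.lang.management API. The class name NormalizedSystemLoad and the normalize() helper are hypothetical and are not taken from mod_cluster's source. Note that the per-processor value can still exceed 1 when more entities are runnable than there are processors, so dividing alone does not guarantee a 0-1 range.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class NormalizedSystemLoad {

    /**
     * Divides a raw load average by the processor count.
     * getSystemLoadAverage() returns a negative value when the load
     * average is unavailable; that case is reported here as -1.
     */
    static double normalize(double loadAverage, int processors) {
        if (loadAverage < 0) {
            return -1.0;
        }
        // Guard against a zero processor count (hypothetical defensive check).
        return loadAverage / Math.max(1, processors);
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double raw = os.getSystemLoadAverage();
        int cpus = os.getAvailableProcessors();
        System.out.println("raw=" + raw + " cpus=" + cpus
                + " normalized=" + normalize(raw, cpus));
    }
}
```

With the 1.81640625 reading from the issue description and, say, 4 processors (an assumed count, the report does not state one), normalize() yields roughly 0.45.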

> Using AverageSystemLoadMetric can improperly cause a Load Factor of 0
> ---------------------------------------------------------------------
>
>                 Key: MODCLUSTER-322
>                 URL: https://issues.jboss.org/browse/MODCLUSTER-322
>             Project: mod_cluster
>          Issue Type: Bug
>    Affects Versions: MOD_CLUSTER_1_0_10_GA_CP02, 1.2.1.Final
>         Environment: *JBoss Enterprise Application Platform 5
> *Apache httpd
> *mod_cluster 1.0.10.GA_CP02 or 1.2.1
>            Reporter: Aaron Ogburn
>            Assignee: Paul Ferraro
>             Fix For: MOD_CLUSTER_1_0_10_GA_CP03, 1.2.2.Beta1
>
>
> It looks like AverageSystemLoadMetric is not properly implemented.  When using it, mod_cluster may always report a load factor of 0, thus making the JBoss node unreachable from Apache.  We've tested with a simple web app that checks the underlying MXBean SystemLoadAverage:
> Double.class.cast(server.getAttribute(ObjectName.getInstance(ManagementFactory.OPERATING_SYSTEM_MXBEAN_NAME), "SystemLoadAverage")).doubleValue()
> This info is grabbed pretty much the same way mod_cluster does, but these calls appeared to work just fine outside of mod_cluster as it returns the following values from my test app:
> 12:08:26,519 INFO  [STDOUT] From MBeanServer: 1.81640625
> So the issue does not appear to lie with the underlying JDK/MXBean call but with how mod_cluster handles the data grabbed from it.  The root cause appears to be that a value above 1 is being returned, while mod_cluster seems to expect metrics to return a value in a 0-1 (percentage-based) range.
> The way the load is determined allows the AverageSystemLoadMetric to improperly exceed its weight. For example, if it were weighted as 2 and another metric was weighted at 1 (say RequestCountLoadMetric with a capacity of 1000), then AverageSystemLoadMetric should only be able to account for 67% of the load.  But here we can see AverageSystemLoadMetric can outrun its weight and really account for 100% of the load.  So let's say AverageSystemLoadMetric is the above seen 1.81640625 value with 100 requests/second, putting RequestCountLoadMetric at 0.1 load, so DynamicLoadBalanceFactorProvider would calculate the load factor like so:
>         int load = (int) Math.round(100 * totalWeightedLoad / totalWeight);
>         // with the values above: (int) Math.round(100 * (1.81640625 * 2 + 0.1) / 3)
>         // load = 124 (124.4 before rounding)
> And that gets capped at 100, so AverageSystemLoadMetric comes to represent really all of the load.  But if a 2 weight metric is at its max and a 1 weight metric is at 0.1 (10% of its capacity), then their total load should just be ~70%.
> So is mod_cluster assuming that the SystemLoadAverage will always be between 0 and 1?  Does it look like mod_cluster is not properly scaling this metric?  Do we know a definite max return to expect from OperatingSystemMXBean.getSystemLoadAverage() so that this metric can be scaled more in line with the other percentage-based ones?  Or should a user-definable max capacity be implemented into this metric as it is with others?
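
The arithmetic in the description can be reproduced with a short sketch. Only the Math.round(100 * totalWeightedLoad / totalWeight) formula comes from the report; the class name LoadFactorExample, the weightedLoad() helper, and the optional clamp of each metric to the 0-1 range are hypothetical and shown purely for comparison, not as mod_cluster's actual behavior or fix.

```java
public class LoadFactorExample {

    /**
     * Weighted average of per-metric loads, scaled to a percentage using the
     * formula quoted in the issue. When clamp is true, each metric's load is
     * first limited to [0, 1] (an assumption made for illustration).
     */
    static int weightedLoad(double[] loads, int[] weights, boolean clamp) {
        double totalWeightedLoad = 0;
        int totalWeight = 0;
        for (int i = 0; i < loads.length; i++) {
            double load = clamp ? Math.min(1.0, Math.max(0.0, loads[i])) : loads[i];
            totalWeightedLoad += load * weights[i];
            totalWeight += weights[i];
        }
        return (int) Math.round(100 * totalWeightedLoad / totalWeight);
    }

    public static void main(String[] args) {
        // System load average of 1.81640625 (weight 2), request load of 0.1 (weight 1).
        double[] loads = {1.81640625, 0.1};
        int[] weights = {2, 1};
        System.out.println(weightedLoad(loads, weights, false)); // prints 124
        System.out.println(weightedLoad(loads, weights, true));  // prints 70
    }
}
```

The unclamped result exceeds 100, which matches the overshoot described above; with each metric limited to its 0-1 range, the same inputs give the ~70% the reporter expected.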
