[mod_cluster-issues] [JBoss JIRA] (MODCLUSTER-322) Using AverageSystemLoadMetric can improperly cause a Load Factor of 0

Aaron Ogburn (JIRA) jira-events at lists.jboss.org
Wed Jun 26 16:06:21 EDT 2013


     [ https://issues.jboss.org/browse/MODCLUSTER-322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Ogburn reopened MODCLUSTER-322:
-------------------------------------



Reopening this as it looks like we made a mistake fixing this for EAP 5/mod_cluster 1.0.10.GA_CP.

It looks like the fix was committed to trunk, to the 1.0.x branch, and directly to the 1.0.10.GA_CP03 tag.  When we tagged 1.0.10.GA_CP04, we tagged it from the MOD_CLUSTER_1_0_10_GA_CP branch.  That branch didn't have the fix, so CP04 and EAP 5.2.0 lost it.  The fix needs to be added to the MOD_CLUSTER_1_0_10_GA_CP branch for future releases.  I've put in the following pull request for that:

https://github.com/modcluster/mod_cluster/pull/34/
                
> Using AverageSystemLoadMetric can improperly cause a Load Factor of 0
> ---------------------------------------------------------------------
>
>                 Key: MODCLUSTER-322
>                 URL: https://issues.jboss.org/browse/MODCLUSTER-322
>             Project: mod_cluster
>          Issue Type: Bug
>    Affects Versions: MOD_CLUSTER_1_0_10_GA_CP02, 1.2.1.Final
>         Environment: *JBoss Enterprise Application Platform 5
> *Apache httpd
> *mod_cluster 1.0.10.GA_CP02 or 1.2.1
>            Reporter: Aaron Ogburn
>            Assignee: Paul Ferraro
>             Fix For: MOD_CLUSTER_1_0_10_GA_CP03, 1.2.2.Final
>
>
> It looks like AverageSystemLoadMetric is not properly implemented.  When using it, mod_cluster may always report a load factor of 0, thus making the JBoss node unreachable from Apache.  We've tested with a simple web app that checks the underlying MXBean SystemLoadAverage:
> Double.class.cast(server.getAttribute(ObjectName.getInstance(ManagementFactory.OPERATING_SYSTEM_MXBEAN_NAME), "SystemLoadAverage")).doubleValue()
> This value is obtained in pretty much the same way mod_cluster obtains it, but the call appeared to work just fine outside of mod_cluster; my test app logged the following:
> 12:08:26,519 INFO  [STDOUT] From MBeanServer: 1.81640625
> So the issue does not appear to be with the underlying JDK/MXBean call but with how mod_cluster handles the data it gets from it.  The root cause appears to be that a value above 1 is being returned, while mod_cluster expects metrics to return a value in the 0-1 range.
> The way the load is determined allows AverageSystemLoadMetric to improperly exceed its weight.  For example, if it were weighted at 2 and another metric were weighted at 1 (say RequestCountLoadMetric with a capacity of 1000), then AverageSystemLoadMetric should only be able to account for 67% of the load.  But here AverageSystemLoadMetric can outrun its weight and account for 100% of the load.  Say AverageSystemLoadMetric is at the 1.81640625 value seen above with 100 requests/second, putting RequestCountLoadMetric at .1 load; DynamicLoadBalanceFactorProvider would then calculate the load factor like so:
>         int load = (int) Math.round(100 * totalWeightedLoad / totalWeight);
>         //       = (int) Math.round(100 * (1.81640625 * 2 + 0.1) / 3)
>         //       = (int) Math.round(124.43) = 124
> That gets capped at 100, so AverageSystemLoadMetric ends up representing all of the load.  But if a weight-2 metric is at its max and a weight-1 metric is at .1 (10%), then their total load should be just ~70%.
> So is mod_cluster assuming that the SystemLoadAverage will always be between 0 and 1?  Does it look like mod_cluster is not properly scaling this metric?  Do we know a definite max return to expect from OperatingSystemMXBean.getSystemLoadAverage() so that this metric can be scaled more in line with the other percentage-based ones?  Or should a user-definable max capacity be implemented for this metric as it is for others?
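
The arithmetic in the quoted description can be reproduced with a small standalone sketch. This is not mod_cluster's code: the class name LoadSketch and the methods combinedLoad and scale are hypothetical, and scale only illustrates the "user-definable max capacity" idea raised in the description (divide by a capacity, clamp to the 0-1 range). It is one possible approach under those assumptions, not the committed fix.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class LoadSketch {

    // Reads SystemLoadAverage through the platform MBeanServer, like the test app quoted above.
    // The JDK returns a negative value when the load average is not available.
    static double readSystemLoadAverage() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = ObjectName.getInstance(ManagementFactory.OPERATING_SYSTEM_MXBEAN_NAME);
        return Double.class.cast(server.getAttribute(name, "SystemLoadAverage")).doubleValue();
    }

    // Hypothetical scaling: divide the raw value by a user-chosen capacity and clamp to [0, 1].
    static double scale(double rawLoad, double capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        if (rawLoad < 0) {
            return 0.0; // load average unavailable; treated as no load in this sketch
        }
        return Math.min(rawLoad / capacity, 1.0);
    }

    // Weighted average of per-metric loads as a percentage, following the quoted formula.
    static int combinedLoad(double[] loads, int[] weights) {
        double totalWeightedLoad = 0.0;
        int totalWeight = 0;
        for (int i = 0; i < loads.length; i++) {
            totalWeightedLoad += loads[i] * weights[i];
            totalWeight += weights[i];
        }
        return (int) Math.round(100 * totalWeightedLoad / totalWeight);
    }

    public static void main(String[] args) throws Exception {
        int[] weights = {2, 1};
        double raw = 1.81640625; // value logged by the test app in the description

        // Unscaled: the system load metric exceeds its share of the total.
        System.out.println("unscaled: " + combinedLoad(new double[] {raw, 0.1}, weights));

        // Scaled with a capacity of 1.0: the metric is held to its weight.
        System.out.println("scaled:   " + combinedLoad(new double[] {scale(raw, 1.0), 0.1}, weights));

        System.out.println("current SystemLoadAverage: " + readSystemLoadAverage());
    }
}
```

With the values from the description, the unscaled call yields 124 and the scaled call yields 70, matching the ~70% expected when the weight-2 metric is at its maximum. What capacity to use (a fixed number, the processor count, or something else) is left open here, as it is in the description.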


