[jboss-jira] [JBoss JIRA] (WFLY-2500) Unable to reliably deduce metric value trends

Lukas Krejci (JIRA) jira-events at lists.jboss.org
Wed Nov 13 07:16:07 EST 2013


Lukas Krejci created WFLY-2500:
----------------------------------

             Summary: Unable to reliably deduce metric value trends
                 Key: WFLY-2500
                 URL: https://issues.jboss.org/browse/WFLY-2500
             Project: WildFly
          Issue Type: Feature Request
      Security Level: Public (Everyone can see)
    Affects Versions: 8.0.0.Beta1
            Reporter: Lukas Krejci


The metrics that WildFly collects are runtime values that are reset with each restart of the server.

This makes it impossible for remote tools that read those values only from the management model to reliably deduce changes in the values across server restarts.

Consider for example the following scenario. We're trying to detect the number of invocations of some EJB method.

1) collection1: invocations = 1000
2) collection2: invocations = 1010
3) server restarts
4) huge spike in number of calls to the EJB method
5) collection3: invocations = 1020

In the above, "collectionN" represents the points in time when the user reads the value of the metric from the management model (by making a fresh connection to it and reading the value). To the user, it would seem that the value slowly increases (by 10 between consecutive collections) when the exact opposite is true - between collection2 and collection3, there was a spike of 1020 invocations.
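To make the problem concrete, here is a minimal sketch of the calculation a remote tool would typically do with only the counter values available. The function name naive_deltas is made up for illustration; it is not part of WildFly or RHQ.

```python
def naive_deltas(readings):
    """Difference between consecutive counter readings.

    Assumes the counter never resets, which is exactly the assumption
    that breaks when the server restarts between two collections.
    """
    return [curr - prev for prev, curr in zip(readings, readings[1:])]


# collection1..collection3 from the scenario above
readings = [1000, 1010, 1020]
print(naive_deltas(readings))  # [10, 10]
```

Both deltas come out as 10, so the restart and the spike after it are invisible to the tool.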

One of the simpler (IMHO) ways of fixing this would be to add a new runtime attribute to the root of the management model - startup-date. It would be populated at server startup with the actual date when the server started. This assumes that there really is no way of resetting the metric values to 0 at runtime - for example, I tried disabling and re-enabling statistics, which DIDN'T reset the value.

Users could then record the value of startup-date with each collection of the runtime metrics. If startup-date has changed since the last collection, the tool knows that the server restarted and can adapt its calculations accordingly.

The example above would then look like (the dates are timestamps):
1) collection1: invocations = 1000, startup-date=1
2) collection2: invocations = 1010, startup-date=1
3) server restarts
4) huge spike in number of calls to the EJB method
5) collection3: invocations = 1020, startup-date=2

It is then apparent that collection3 represents a spike because the user can deduce that the 1020 invocations were counted from 0 starting at startup-date=2.
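A rough sketch of how a monitoring tool might use the proposed attribute follows. The Sample type and the restart_aware_deltas function are hypothetical names used only for illustration, and startup_date stands for the proposed (not yet existing) startup-date attribute; how the values are read from the management model is left out.

```python
from typing import List, NamedTuple


class Sample(NamedTuple):
    invocations: int   # counter value read from the management model
    startup_date: int  # value of the proposed startup-date attribute


def restart_aware_deltas(samples: List[Sample]) -> List[int]:
    """Invocations observed between consecutive collections.

    If startup_date differs between two samples, the server restarted and
    the counter was reset to 0, so the current value itself is the number
    of invocations counted since the restart.
    """
    deltas = []
    for prev, curr in zip(samples, samples[1:]):
        if curr.startup_date != prev.startup_date:
            deltas.append(curr.invocations)
        else:
            deltas.append(curr.invocations - prev.invocations)
    return deltas


# collection1..collection3 from the example above
samples = [Sample(1000, 1), Sample(1010, 1), Sample(1020, 2)]
print(restart_aware_deltas(samples))  # [10, 1020]
```

Invocations that happened after the previous collection but before the restart are still lost, so the value reported across a restart is a lower bound.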

(Note that this affects RHQ, whose monitoring is limited by its inability to deduce such trends.)
