The work to simplify registration of metrics described in this mail was merged[1] in
wildfly-core a few weeks ago.
The idea is that the MRR will check the process type to determine whether the metric
definition must actually be registered or discarded. There were cases where the metrics
would have to be registered regardless of the process type (e.g. mainly for the jmx
subsystem and few other resources) and I added a FORCE_REGISTRATION flag to force the
registration of the metric.
I’m now continuing on that direction for runtime attributes[2] and operations[3].
A so-called runtime attribute is flagged with the STORAGE_RUNTIME flag documented as
"An attribute whose value is only stored in runtime services, and isn't stored in
the persistent configuration.”
I assume that in the definition “runtime services” means “runtime MSC services”. Please
correct me if that’s not the case (and don’t bother reading the rest of the mail :)
Looking at wildly-core and wildfly codebase, We have 2 distinct subsets of runtime
attributes:
(a) attributes which require runtime MSC services to be handled
(b) attributes which don’t require runtime MSC services but compute/store their value from
some other runtime.
Type (a) corresponds to the STORAGE_RUNTIME definition but type (b) differs (its value is
stored neither in runtime service, not in the persistent configuration)
The type (a) is the majority of runtime attributes and all subsystem (except JMX) uses
this type.
The type (b) is often located in our core-service resources or on HC/server root
resources.
I have a branch where I flagged all the type (b) with the same FORCE_REGISTRATION flag
than the metrics and it does the job.
But I find this flag poorly named. It tells the MRR what to do with the attribute/metric
but it does not describe the attribute/metric in a meaningful way.
How is a subsystem writer supposed to know when to use this flag or not?
I’d like to change it and rename it to something like RUNTIME_SERVICE_NOT_REQUIRED. A
subsystem writer could use this flag to specify that a runtime attribute does not require
runtime service to handle the attribute.
Runtime attribute of type (a) would be flagged with STORAGE_RUNTIME
Runtime attribute of type (b) would be flagged with STORAGE_RUNTIME and
RUNTIME_SERVICE_NOT_REQUIRED
The MRR could then decide whether to register the attributes depending on the process type
and the presence of these flags.
Another possibility would be to add a 3rd STORAGE flag such as
STORAGE_RUNTIME_BUT_NOT_MSC_RUNTIME (I suck at naming things!) for the type (b).
I don’t like this as attributes of type (b) will most likely always be implemented by core
developers (i.e. people who knows what they do) and I don’t want to add to our API a type
of STORAGE that subsystem developers will never have to use.
What do you think about changing the name of the FORCE_REGISTRATION and is
RUNTIME_SERVICE_NOT_REQUIRED a good renaming? (I don’t like having “negative” flag
though…)
The FORCE_REGISTRATION flag was introduced in a 3.0.0.Alpha1.
Would it be possible to rename it before 3.0.0 is released or should I add another flag +
@Deprecated, etc.?
jeff
[1]
https://github.com/wildfly/wildfly-core/pull/1541
[2]
https://issues.jboss.org/browse/WFCORE-1588
[3]
https://issues.jboss.org/browse/WFCORE-1586
On 25 Apr 2016, at 17:15, Jeff Mesnil <jmesnil(a)redhat.com>
wrote:
TL;DR - I propose to simplify subsystem development by moving some of the validation
logic from the resource definitions to the management resource registration. The goal is
to provide a static representation of the resources and let the MMR dynamically pick the
“meaningful” parts.
Last week an user complains that the messaging-activemq subsystem’s statistics were not
updated in domain mode.
It turned out that he was reading the metrics on the DC
(/profile=full/subsytem=messaging-activemq/…) instead of reading on the server
(/host=master/server=server-one/subsystem=messaging-activemq/…)
It is a bug[1] in the messaging-activemq because its resources register their metrics
without checking whether that makes sense. The correct check is to look at
context.isRuntimeOnlyRegistrationValid() to check whether a resource can register its
metrics (the same check applies also for runtime attributes/operations).
I looked at other subsystems and undertow has the same issue.
This check does not work well with the PersistentResourceDefinition that is used
throughout the messaging-activemq and undertow subsystems. This API works best when the
definition of the resources uses a bunch of static instances for the resource itself, its
attributes, metrics, etc. These static instances are also used by the companion
PersistentResourceXMLDescription to provide a static XML representation of the subsystem.
If I have to pass this context.isRuntimeOnlyRegistrationValid() boolean to every
resources in the subsystem, I get rid of the static representations used by the
PersistentResourceDefinition and PersistentResourceXMLDescription and I have to add a lot
of simple but boilerplate code in all my resource definitions.
The datasources subsystem does not exhibit this issue. It works around it by installing a
Service at RUNTIME step to register (resp. unregister) statistics resource definitions
when the service is started (res. stopped). Since services are only installed when runtime
is supported, it ensures that the datasources metrics are available only on server and not
on the DC.
It looks correct but I’m not a big fan of this solution. It makes the subsystem
definition more complex to understand and it also involves boilerplate code that every
subsystem providing runtime should write.
I was wondering if we could simplify the development of the subsystems by moving some of
the logic dealing with that use case in the ManagementResourceRegistration instead.
My proposal would be to have the resource definitions be “complete”. The resource always
defines its attributes/metrics/operations.
When the MMR takes this definition and registers the different parts, it would only
register the “meaningful” one depending on the ProcessType and RunningMode. E.g. the MRR
of the DC or a admin-only server would not register metrics & runtime
attributes/operations while the MMR on a normal server would register everything.
This increase the complexity of the MMR which has to figure out whether to actually
register something or discard it but it makes writing subsystem simpler and more robust.
Brian told me there might some exceptions (e.g. a runtime op that could be invoked on the
DC) but these case could be handled by adding metadata to the resource definitions.
This approach segues quite well with the idea to generate subsystem using annotations.
All the subsystem developers has to do is describe extensively the subsystem resources
(using static objects now, annotations in a future version) and let the MMR decides which
parts of the resources are actually registered.
To sum up, the definition of a resource is static, while its registration is dynamic.
Do you see any issue with that proposal?
As a first step, I’ll start by creating the corresponding WFCORE issue to fix WFLY-6546
issue by ensuring the MMR does not register metric if the runtime registration is not
valid. This should not have any API impact (but the behaviour will certainly change).
jeff
[1]
https://issues.jboss.org/browse/WFLY-6546
--
Jeff Mesnil
JBoss, a division of Red Hat
http://jmesnil.net/
--
Jeff Mesnil
JBoss, a division of Red Hat
http://jmesnil.net/