Brian Stansberry commented on WFCORE-1106:
------------------------------------------
Re:
"A tricky bit with this part:
2) The kernel detects this and finds the registration for the foo=bar resource type, and
sees that the resource provides capability org.wildfly.foo.bar...."
I have implemented the following to account for the fact that child resources may be
"incorporated" in the config of a capability without registering the capability
themselves:
1) The default behavior when checking whether a resource is affected by a reload-required
capability is, if the resource itself hasn't registered any capability, to walk up the
resource tree looking for one that has, continuing until one is found or...
a) the resource being examined has subsystem=??? as the last element in its address.
Subsystem resources are not incorporated in the config of kernel capabilities.
b) the parent of the resource being examined is the root resource or a /host=??? resource.
Capabilities registered by those parents do not incorporate child resources in their
config.
2) The MRR for a resource type incorporated in a parent's capability can explicitly
declare this. Doing this is expected to be unusual, as it's just an optimization to
avoid the searching logic discussed above.
3) An MRR can also explicitly declare that the resources are *not* incorporated in *any*
parent resource's capability. This is useful when parent resources register a
capability, while *at the moment* child resources do not. The child resources are not part
of the parent capability but have no capability of their own. At some point devs will work
out the capability API for the child resources, and then the code will be modified and the
children will register a capability. Until this happens, letting the children declare they
are not part of the parent's capability lets them avoid being affected by an operation
on the parent triggering reload-required.
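The default search in item 1 and the opt-out in item 3 can be sketched roughly as follows. This is an illustration only, not the code in the linked PR: the class IncorporationSearch, its Registration type, and the representation of an address as a list of "key=value" strings are all hypothetical. The explicit opt-in from item 2 is left out, since it only short-circuits the same search.

```java
import java.util.*;

/** Illustrative only: these types are hypothetical, not WildFly Core classes. */
final class IncorporationSearch {

    /** Hypothetical stand-in for what an MRR knows about a resource type. */
    static final class Registration {
        final Set<String> capabilities;
        final boolean notIncorporated; // the explicit opt-out described in item 3

        Registration(Set<String> capabilities, boolean notIncorporated) {
            this.capabilities = capabilities;
            this.notIncorporated = notIncorporated;
        }
    }

    /** Returns the key of an address element such as "subsystem=foo". */
    private static String key(String element) {
        return element.substring(0, element.indexOf('='));
    }

    /**
     * Finds the capabilities whose reload-required state affects the resource
     * at the given address, or an empty set if the search stops without a match.
     */
    static Set<String> findCapabilities(List<String> address,
                                        Map<List<String>, Registration> registrations) {
        List<String> current = address;
        while (!current.isEmpty()) {
            Registration reg = registrations.get(current);
            // A resource that registers its own capability ends the search.
            if (reg != null && !reg.capabilities.isEmpty()) {
                return reg.capabilities;
            }
            // Item 3: explicitly not part of any parent's capability.
            if (reg != null && reg.notIncorporated) {
                return Set.of();
            }
            // Stop condition a): subsystem resources are not incorporated
            // in the config of kernel capabilities.
            if (key(current.get(current.size() - 1)).equals("subsystem")) {
                return Set.of();
            }
            // Stop condition b): the parent is the root or a /host=??? resource.
            List<String> parent = current.subList(0, current.size() - 1);
            boolean parentIsHost = parent.size() == 1 && key(parent.get(0)).equals("host");
            if (parent.isEmpty() || parentIsHost) {
                return Set.of();
            }
            current = parent;
        }
        return Set.of();
    }
}
```

With a capability registered on subsystem=foo, a lookup for [subsystem=foo, child=a] returns the subsystem's capability, while a child whose registration sets notIncorporated gets an empty set and so is unaffected by the parent going reload-required.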
All of this is handled in a separate commit in the linked PR.
Better handling of subsequent changes once the server is placed into
reload-required
------------------------------------------------------------------------------------
Key: WFCORE-1106
URL: https://issues.jboss.org/browse/WFCORE-1106
Project: WildFly Core
Issue Type: Enhancement
Components: Domain Management
Reporter: Brian Stansberry
Assignee: Brian Stansberry
When the handler for a configuration change operation determines that it cannot effect
the change to the current runtime services, it places the process into
"reload-required" state. From the moment this occurs until the reload is
performed, the configuration model is inconsistent with the runtime services.
This can lead to problems when, prior to reload, the user makes further configuration
changes. Those changes can succeed in Stage.MODEL, since the change is valid given the
current state of the configuration model, but then when the handler attempts to update the
runtime the changes fail because the runtime services are in a different state. Some
common scenarios:
1) User removes a resource, triggering reload required. Then they re-add the resource,
which fails with a DuplicateServiceException since the service from the original version
of the resource hasn't been removed yet.
2) User makes some other config change to a resource which can't be effected
immediately, so the server is put into reload-required. The user then adds another
resource that depends on the services from the first resource, and that add fails because
the runtime service from the first resource is not in the expected state.
A naive fix would be, once the process goes into reload-required state, to stop making
any further runtime changes for steps that alter the persistent config. (Runtime
changes for ops that don't touch persistent config would be ok, e.g. reload itself, or
runtime-only ops like popping a message off a JMS queue.)
The problem with the naive approach is that config changes that could take immediate
effect no longer will. This could break existing scripts, or just be annoying in general. For
example, a server is in reload-required state but is still running. Then the user wants to
add a logger category or change the level of an existing one in order to get some
diagnostic info. The logging change would not affect the runtime until the reload is done,
forcing a reload to get the diagnostic data.
Stuart Douglas had an excellent suggestion today of looking into tying this in to
capabilities and requirements. So, for example:
1) An op targeted at resource foo=bar causes the process to go into reload-required.
2) The kernel detects this and finds the registration for the foo=bar resource type, and
sees that the resource provides capability org.wildfly.foo.bar.
3) The kernel records in the capability registry that org.wildfly.foo.bar is now
"reload-required".
4) Thereafter, for any op that changes the model and then adds a runtime step, the
kernel:
a) finds the registration for the resource type associated with that op's target
address
b) finds any capabilities provided by the resource type
c) looks for direct or transitive requirements for those capabilities that are
"reload-required"
d) if found, the runtime step is not executed, and instead the
"server-requires-reload" response-header is added.
The effect here is that the set of ops whose runtime changes are skipped is narrowed to
those associated with capabilities that put the server into reload-required.
Unrelated ops, e.g. the logging changes mentioned above, are unaffected.
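A minimal sketch of the check in steps 3 and 4, assuming a simple in-memory registry. The class ReloadAwareRegistry and its method names are hypothetical, not the WildFly Core capability registry API. One assumption beyond the steps as written: a capability that is itself marked reload-required also causes the runtime step to be skipped, which is what the remove+add scenario needs.

```java
import java.util.*;

/** Hypothetical registry; not the real WildFly Core capability registry. */
final class ReloadAwareRegistry {
    // capability name -> names of the capabilities it directly requires
    private final Map<String, Set<String>> requirements = new HashMap<>();
    private final Set<String> reloadRequired = new HashSet<>();

    void addRequirement(String dependent, String required) {
        requirements.computeIfAbsent(dependent, k -> new HashSet<>()).add(required);
    }

    /** Step 3: record that a capability is now "reload-required". */
    void markReloadRequired(String capability) {
        reloadRequired.add(capability);
    }

    /**
     * Step 4: true if any of the provided capabilities, or anything they
     * directly or transitively require, is reload-required.
     */
    boolean skipRuntimeStep(Collection<String> providedCapabilities) {
        Deque<String> pending = new ArrayDeque<>(providedCapabilities);
        Set<String> seen = new HashSet<>(); // guards against requirement cycles
        while (!pending.isEmpty()) {
            String capability = pending.pop();
            if (!seen.add(capability)) {
                continue; // already examined
            }
            if (reloadRequired.contains(capability)) {
                return true;
            }
            pending.addAll(requirements.getOrDefault(capability, Set.of()));
        }
        return false;
    }
}
```

A caller that gets true back would skip the runtime step and add the "server-requires-reload" response header instead. An op whose capabilities have no path to a reload-required capability, such as the logging change above, gets false and proceeds as usual.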
Some fine points:
1) The restart-required and reload-required states need to be tracked separately. The
information regarding any restart-required capabilities needs to survive a reload.
2) The information that a capability is reload/restart-required needs to survive the
removal of the capability. This allows the remove+add scenario to work. The remove op
removes the capability, but the fact it is still present in the runtime is tracked, so
when the add comes in no runtime changes are made.
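The two fine points could be modelled along these lines. This is a sketch under stated assumptions: RuntimeStateTracker and its methods are hypothetical names, and how the real kernel persists this state across a reload is not addressed here.

```java
import java.util.*;

/** Hypothetical tracker illustrating the two fine points above. */
final class RuntimeStateTracker {
    private final Set<String> registered = new HashSet<>();
    // Tracked separately: only the reload-required set is cleared by a reload.
    private final Set<String> reloadRequired = new HashSet<>();
    private final Set<String> restartRequired = new HashSet<>();

    void register(String capability) {
        registered.add(capability);
    }

    /** Removing a capability deliberately leaves its required-state flags in place. */
    void remove(String capability) {
        registered.remove(capability);
    }

    void markReloadRequired(String capability) {
        reloadRequired.add(capability);
    }

    void markRestartRequired(String capability) {
        restartRequired.add(capability);
    }

    /** False while the runtime for this capability is known to be stale. */
    boolean canUpdateRuntime(String capability) {
        return !reloadRequired.contains(capability)
                && !restartRequired.contains(capability);
    }

    /** A reload clears reload-required state but not restart-required state. */
    void onReload() {
        reloadRequired.clear();
    }
}
```

In the remove+add scenario, the sequence register, markReloadRequired, remove, register leaves canUpdateRuntime returning false for that capability, so the re-add makes no runtime changes until onReload is called.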