Brian Stansberry created WFLY-8850:
--------------------------------------
Summary: Correct runtime-only operations on profile resources
Key: WFLY-8850
URL:
https://issues.jboss.org/browse/WFLY-8850
Project: WildFly
Issue Type: Bug
Components: IIOP, JCA, JMS, JSF, mod_cluster, Naming, Security, Transactions,
Web (Undertow)
Reporter: Brian Stansberry
Assignee: Brian Stansberry
Attachments: runtimeonlyops.txt
OVERVIEW:
WFCORE-389 and WFCORE-2858 are about supporting runtime-only ops on profile resources,
something which we officially don't do (although there are violations of this policy,
as shown below). As part of the decision on whether to complete WFCORE-389 and
WFCORE-2858 for Core 3 / WildFly 11 I have performed an analysis of the existing
runtime-only ops in WildFly, looking for any issues. This JIRA is about correcting those
issues. The intent is to make initial changes that don't alter behavior in any
negative way in order to allow WFCORE-389 and WFCORE-2858 to proceed, but to do them in
such a way that if 389 and 2858 don't proceed there is no harm. For some items there
will be a follow up issue to make further decisions.
I performed 3 searches for operations that declare themselves as runtime-only, looking for
any aspect of their behavior that might be problematic for WFCORE-389. I created a
document with the results, which I'll attach. I put a classifier to the left of each
item as shorthand for the item's status.
CLASSIFIERS:
NP -- non-profile. Op is not used in profile resources, either because it is not
registered in subsystems, is only in deployment=*/subsystem=x resources, or the subsystem
does not register it in an HC. Any NP op is not relevant to WFCORE-389.
WR -- WRite. The op is not read-only.
WR? -- the op is not declared as read-only but seems to only be doing reads
AO -- the op is registered on the profile but execution is a no-op unless the process is
admin only
RU? -- the op doesn't seem to really be runtime-only
!!! -- Miscellaneously problematic ops
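The classifiers above can be read as simple rules over an op's metadata. The following is only a rough sketch of that reading: OpInfo and ClassifierSketch are hypothetical names, not WildFly classes, and the "appears" fields stand in for manual judgment made while reading handler code. The "!!!" classifier is not modeled because it was assigned case by case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the facts gathered about one operation.
final class OpInfo {
    final boolean usedOnProfile;        // registered on a profile resource in an HC
    final boolean declaredReadOnly;     // metadata says read-only
    final boolean appearsReadOnly;      // manual judgment: handler only does reads
    final boolean declaredRuntimeOnly;  // metadata says runtime-only
    final boolean appearsRuntimeOnly;   // manual judgment: handler does not touch config
    final boolean noOpUnlessAdminOnly;  // does nothing unless the process is admin-only

    OpInfo(boolean usedOnProfile, boolean declaredReadOnly, boolean appearsReadOnly,
           boolean declaredRuntimeOnly, boolean appearsRuntimeOnly, boolean noOpUnlessAdminOnly) {
        this.usedOnProfile = usedOnProfile;
        this.declaredReadOnly = declaredReadOnly;
        this.appearsReadOnly = appearsReadOnly;
        this.declaredRuntimeOnly = declaredRuntimeOnly;
        this.appearsRuntimeOnly = appearsRuntimeOnly;
        this.noOpUnlessAdminOnly = noOpUnlessAdminOnly;
    }
}

final class ClassifierSketch {
    // Returns the shorthand classifiers that apply to the given op.
    static List<String> classify(OpInfo op) {
        List<String> tags = new ArrayList<>();
        if (!op.usedOnProfile) {
            tags.add("NP");
        }
        if (!op.declaredReadOnly) {
            // "WR?" when the op is declared as a write but seems to only read.
            tags.add(op.appearsReadOnly ? "WR?" : "WR");
        }
        if (op.noOpUnlessAdminOnly) {
            tags.add("AO");
        }
        if (op.declaredRuntimeOnly && !op.appearsRuntimeOnly) {
            tags.add("RU?");
        }
        return tags;
    }

    public static void main(String[] args) {
        // An op not used on profiles, declared as a write, but only doing reads.
        OpInfo op = new OpInfo(false, false, true, true, true, false);
        System.out.println(classify(op));
    }
}
```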
FIXES:
1) A number of ops have WR?/NP classifiers. The NP means these aren't relevant to
WFCORE-389 but correcting the metadata so they are declared as read-only is a useful minor
task.
2) The "migrate" ops in web, jacorb and messaging. These are registered on the
profile (allowing profile migration) but will fail if the process isn't admin-only.
An admin-only HC has no slaves or servers, so there is no domain rollout of this op, and
hence WFCORE-389 is not relevant. This is all by design; it allows users to migrate the
subsystem in a domain profile. However, there is a question about them declaring
themselves runtime-only, since they modify config. Correcting this is another useful
minor task.
3) The "describe-migration" ops. Same discussion as for "migrate" plus
these don't seem to be write ops, so a minor useful side task is to correct the
metadata to describe them as read-only.
4) ModClusterConfigResourceDefinition registers 4 ops as runtime-only that seem to be
modifying configuration; i.e. they are not runtime-only. These have a tangential
relationship to WFCORE-389 in that they are pre-existing ops that break the
no-runtime-only-on-profile rule that WFCORE-389 is about rescinding. I'm not aware of
any issues reported about them so that's a tiny bit of additional evidence that the
kernel can handle such ops. But, a subtask of this issue is to correct the metadata for
these so they will not be affected by any subsequent changes related to runtime-only ops.
5) JcaCachedConnectionManagerDefinition.CcmOperations has two operations that are not
declared to be read-only that are registered on the profile resource. So these are
pre-existing ops that break the no-runtime-only-on-profile rule that WFCORE-389 is about
rescinding. A twist with these is they seem to actually be read-only and should be
described as such. But if we do that we must implement WFCORE-2858 to avoid breaking
existing behavior.
Nothing will be done about these as part of this work, but I'll file an issue to get
it sorted.
6) The JSF subsystem's "list-active-jsf-impl" op. A read-only, runtime-only
op that does runtime work (scanning modules) in Stage.MODEL on the profile resource. Lots
of rules being broken! What this op does now if invoked against the profile is tell you
what jsf impls are present on the DC. Which is *not* the same thing as telling you what
impls are present on "the domain" since different hosts in the domain can have
different sets of modules. So the op needs a rethink.
a) If we correct the Stage.MODEL problem, we can't do WFCORE-2849. So we need to
choose between the two.
b) If we do WFCORE-2858, this op will now start getting rolled out to the domain
servers resulting in getting data from all servers. This is arguably the correct behavior,
as now the user learns the true situation in the domain, not just on the DC. But if we
decide we don't want that we'll need to add
OperationEntry.Flag.HOST_CONTROLLER_ONLY to the operation definition to prevent that
rollout.
c) If we do roll it out to the servers we can consider having it no longer do runtime
work on the profile; i.e. don't analyze the DC, just the servers. That would remove
the conflict with WFCORE-2849, but would be an incompatible change in behavior. I find it
hard to believe anyone would be using this op in scripts though; not against the profile.
d) We could just stop registering it on the profile, but that's a loss of
functionality.
Choice b) would let WFCORE-2858 go forward and preserve the status quo for this op,
with a), c) and d) still options for the future.
7) The transaction subsystem's "probe" operation. A read-only, runtime-only
op registered on the profile resource but which is functionally a no-op if invoked on the
profile resource. But WFCORE-2858 would mean this now gets rolled out to all servers in
the domain that use the profile, triggering an actual probe on all. So, if we do
WFCORE-2858 we could:
a) Accept this, and let the op roll out. That should be an RFE though, with analysis
that rolling it out would be harmless.
b) Remove the op from the profile. It never did anything useful (just a no-op that
isn't rolled out) so removing it is only a semi-breaking change.
c) Add OperationEntry.Flag.HOST_CONTROLLER_ONLY to the operation definition to prevent
that rollout.
Choice c) would let WFCORE-2858 go forward and preserve the status quo for this op,
with a) and b) still options for the future, so that's what will be done as part of
this work.
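To make the rollout reasoning in items 6) and 7) concrete, here is a minimal sketch of the decision as this issue describes it. It is not the kernel's actual logic: RolloutSketch and the local Flag enum are hypothetical (the enum only borrows constant names from OperationEntry.Flag), and treating HOST_CONTROLLER_ONLY as blocking rollout for every kind of op is an assumption.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical local enum; it only mirrors a few names from OperationEntry.Flag.
enum Flag { READ_ONLY, RUNTIME_ONLY, HOST_CONTROLLER_ONLY }

final class RolloutSketch {
    /**
     * Decides whether an op invoked on a profile resource would be rolled out
     * to the servers using that profile, per the description in this issue.
     *
     * @param flags      flags declared on the operation definition
     * @param wfcore2858 true to model behavior if WFCORE-2858 is implemented
     */
    static boolean rollsOut(Set<Flag> flags, boolean wfcore2858) {
        if (flags.contains(Flag.HOST_CONTROLLER_ONLY)) {
            return false; // assumption: the flag always keeps the op on the HC
        }
        if (!flags.contains(Flag.READ_ONLY)) {
            return true;  // ops not declared read-only are rolled out today
        }
        // Read-only ops are not rolled out today; with WFCORE-2858,
        // read-only ops that are also runtime-only would be.
        return wfcore2858 && flags.contains(Flag.RUNTIME_ONLY);
    }

    public static void main(String[] args) {
        Set<Flag> probe = EnumSet.of(Flag.READ_ONLY, Flag.RUNTIME_ONLY);
        System.out.println("today: " + rollsOut(probe, false));
        System.out.println("with WFCORE-2858: " + rollsOut(probe, true));

        Set<Flag> pinned = EnumSet.of(Flag.READ_ONLY, Flag.RUNTIME_ONLY, Flag.HOST_CONTROLLER_ONLY);
        System.out.println("with WFCORE-2858 and HOST_CONTROLLER_ONLY: " + rollsOut(pinned, true));
    }
}
```

Under this model, adding the flag (choice c) is the only option that keeps the op's behavior identical whether or not WFCORE-2858 goes in.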
8) The messaging-activemq broadcast-group resource has problematic 'start' and
'stop' ops. These are not registered as runtime-only, but they are. They are
registered on the profile resource and are not read-only, so the DC rolls them out to the
domain. So, they are pre-existing ops that break the no-runtime-only-on-profile rule that
WFCORE-389 is about rescinding. We have two options here:
a) remove these ops on the profile as violations of the no-runtime-only-on-profile
rule. This would be a breaking change. But it may be the correct thing to do anyway if it
is unsafe to invoke these on the profile and have that roll out to all servers.
b) Correct the description of these to declare them runtime-only.
Nothing will be done on these as part of this work, but a separate issue will be filed.
9) The messaging-activemq broadcast-group resource also has a problematic
get-connector-pairs-as-json op. This is a read-only op so it currently will not roll out.
It will also fail if executed against the profile resource, as it fails if there is no
activemq server present. So, the options here are:
a) Remove the op from the profile resource. It never worked anyway.
b) Allow it to roll out. This would be new behavior though.
c) Add OperationEntry.Flag.HOST_CONTROLLER_ONLY to the operation definition to prevent
that rollout.
IMHO option c) is kind of silly, leaving a broken op in place, but it's a valid
"emergency" step to prevent roll out inadvertently being turned on while a
decision between a) and b) is made. So that's what will be done as part of this work.
10) The messaging-activemq cluster-connection resource has problematic 'start' and
'stop' ops. These are not registered as runtime-only, but they are. They are
registered on the profile resource and are not read-only, so the DC rolls them out to the
domain. So, they are pre-existing ops that break the no-runtime-only-on-profile rule that
WFCORE-389 is about rescinding. We have two options here:
a) remove these ops on the profile as violations of the no-runtime-only-on-profile
rule. This would be a breaking change. But it may be the correct thing to do anyway if it
is unsafe to invoke these on the profile and have that roll out to all servers.
b) Correct the description of these to declare them runtime-only.
Nothing will be done on these as part of this work, but a separate issue will be filed.
11) The messaging-activemq cluster-connection resource also has a problematic get-nodes
op. This is a read-only op so it currently will not roll out. It will also fail if
executed against the profile resource, as it fails if there is no activemq server present.
So, the options here are:
a) Remove the op from the profile resource. It never worked anyway.
b) Allow it to roll out. This would be new behavior though.
c) Add OperationEntry.Flag.HOST_CONTROLLER_ONLY to the operation definition to prevent
that rollout.
IMHO option c) is kind of silly, leaving a broken op in place, but it's a valid
"emergency" step to prevent roll out inadvertently being turned on while a
decision between a) and b) is made. So that's what will be done as part of this work.
12) A number of ops are using withFlags(OperationEntry.Flag.RUNTIME_ONLY) instead of
setRuntimeOnly(). The effect is the same so this is harmless but a minor useful side task
is to switch to setRuntimeOnly(). That will make it easier to find these ops.
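As an illustration of why item 12) is harmless, the sketch below uses a hypothetical stand-in builder (OpBuilderSketch and OpFlag are not WildFly classes) whose two methods are named after the calls mentioned above. It assumes both calls simply add the same flag to the definition's flag set, which is what "the effect is the same" implies.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical local enum standing in for OperationEntry.Flag.
enum OpFlag { READ_ONLY, RUNTIME_ONLY }

// Hypothetical stand-in for an operation definition builder.
final class OpBuilderSketch {
    private final EnumSet<OpFlag> flags = EnumSet.noneOf(OpFlag.class);

    // Generic form: add any flags.
    OpBuilderSketch withFlags(OpFlag... extra) {
        for (OpFlag f : extra) {
            flags.add(f);
        }
        return this;
    }

    // Dedicated form: easier to search for in a code base.
    OpBuilderSketch setRuntimeOnly() {
        flags.add(OpFlag.RUNTIME_ONLY);
        return this;
    }

    // Returns a copy of the accumulated flags.
    Set<OpFlag> build() {
        return EnumSet.copyOf(flags);
    }

    public static void main(String[] args) {
        Set<OpFlag> viaFlags = new OpBuilderSketch().withFlags(OpFlag.RUNTIME_ONLY).build();
        Set<OpFlag> viaSetter = new OpBuilderSketch().setRuntimeOnly().build();
        System.out.println("same flags: " + viaFlags.equals(viaSetter));
    }
}
```

The practical difference is only discoverability: a search for setRuntimeOnly() finds every runtime-only op, whereas a search for withFlags also returns unrelated registrations.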