tl;dr;
When an operation that has changed the management model is rolling back,
the OperationStepHandlers processing the rollback see the model as it
was at the arbitrary point when the rollback began. I propose changing
this so they see things as they were prior to the first change to the model.
Long version:
When an operation mutates either the management Resource tree, or the
registry of capabilities (together known as the management model), we
clone the management model and thereafter that operation works on the
clone. The clone is invisible to other callers until the operation
successfully commits. When the operation commits, it publishes the clone
and that becomes the official model.
I call this copy-on-write, publish-on-commit.
If the operation rolls back, the changed model is never published and
the clone is just discarded when the operation execution returns.
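To make the copy-on-write, publish-on-commit idea concrete, here is a
minimal sketch. It is not the actual controller code: the model is
reduced to a set of strings, and the class name CopyOnWriteSketch and
its methods are hypothetical names made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for one operation's view of the management model.
// The model is reduced to a set of strings; none of these names are WildFly APIs.
final class CopyOnWriteSketch {
    private final AtomicReference<Set<String>> published; // the official model
    private final Set<String> original;                   // model as seen when the operation started
    private Set<String> clone;                            // created on first write, null until then

    CopyOnWriteSketch(AtomicReference<Set<String>> published) {
        this.published = published;
        this.original = published.get();
    }

    // Copy-on-write: the first mutation clones the model; later ones reuse the clone.
    void add(String entry) {
        if (clone == null) {
            clone = new HashSet<>(original);
        }
        clone.add(entry);
    }

    // The operation reads its own clone if it has one, otherwise the original.
    boolean contains(String entry) {
        return (clone != null ? clone : original).contains(entry);
    }

    // Publish-on-commit: only now do other callers see the changes.
    void commit() {
        if (clone != null) {
            published.set(clone);
        }
    }

    // Rollback: the clone is dropped and the published model is left untouched.
    void discard() {
        clone = null;
    }

    public static void main(String[] args) {
        AtomicReference<Set<String>> model = new AtomicReference<>(new HashSet<>());
        CopyOnWriteSketch op = new CopyOnWriteSketch(model);
        op.add("subsystem=example");
        System.out.println("visible to others before commit: "
                + model.get().contains("subsystem=example"));
        op.commit();
        System.out.println("visible to others after commit: "
                + model.get().contains("subsystem=example"));
    }
}
```

The sketch ignores locking and concurrent operations entirely; it only
shows when a change becomes visible to other callers.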
However, while the operation is rolling back, all the
OperationStepHandlers that are processing the rollback see the modified
clone, not the original model. They are seeing arbitrarily incorrect data.
This hasn't been a problem up to now, as our standard
OperationStepHandlers (OSHs) get a copy of whatever part of the Resource
tree they are going to modify and keep it for use in rollback. They
don't need to re-read the management model to perform rollback.
But this doesn't work for the capability registry. If an OSH removes a
capability and removes a service, and then in rollback tries to use the
capability to figure out how to restore the service, it fails, as the
management model it can see still shows the capability as being removed.
To fix this, I propose discarding the cloned management model as soon as
rollback begins. Thereafter, an OSH processing rollback will see the
model as it was before the first modification. The removed capability
will still be present.
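The difference between the current and the proposed behavior can be
sketched as follows. This is an illustration only, not the code in the
commit: the capability registry is reduced to a set of names, and
RollbackViewSketch, its methods, and the capability name
"org.example.capability" are all hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, heavily simplified view of the capability registry for one operation.
final class RollbackViewSketch {
    private final Set<String> original;           // registry as it was before the first change
    private Set<String> clone;                    // copy-on-write clone, null until first write
    private final boolean discardCloneOnRollback; // true = proposed behavior, false = current

    RollbackViewSketch(Set<String> published, boolean discardCloneOnRollback) {
        this.original = new HashSet<>(published);
        this.discardCloneOnRollback = discardCloneOnRollback;
    }

    // Removing a capability only touches the operation's private clone.
    void removeCapability(String name) {
        if (clone == null) {
            clone = new HashSet<>(original);
        }
        clone.remove(name);
    }

    // Proposed change: drop the clone as soon as rollback begins,
    // instead of waiting until the operation execution returns.
    void beginRollback() {
        if (discardCloneOnRollback) {
            clone = null;
        }
    }

    // What a handler sees if it consults the registry at this point.
    boolean hasCapability(String name) {
        return (clone != null ? clone : original).contains(name);
    }

    public static void main(String[] args) {
        String cap = "org.example.capability"; // made-up capability name
        for (boolean proposed : new boolean[] {false, true}) {
            RollbackViewSketch op =
                    new RollbackViewSketch(Collections.singleton(cap), proposed);
            op.removeCapability(cap);
            op.beginRollback();
            System.out.println((proposed ? "proposed" : "current")
                    + " behavior, capability visible during rollback: "
                    + op.hasCapability(cap));
        }
    }
}
```

With the flag off, a handler running in rollback still sees the
capability as removed; with the flag on, it sees the capability again
and could use it to restore the service.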
I have a commit that does this at [1]. Running the testsuite with it
shows no regressions. This doesn't surprise me, as our standard OSHs up
to now have had no need to re-read the model during rollback.
[1]
https://github.com/bstansberry/wildfly-core/commit/419f350931d5b7e345cf9a...
--
Brian Stansberry
Senior Principal Software Engineer
JBoss by Red Hat