[jboss-as7-dev] Detyped API document feedback

Tue Jan 4 15:17:24 EST 2011

Responding to the simple bit first...

On 1/4/11 11:02 AM, David M. Lloyd wrote:
> This commentary is in reference to version 2 of the detyped API document
> at:  http://community.jboss.org/wiki/AS7DetypedManagementAPI/version/2
>
<snip/>
>
> <trim>
>> * Provide clean handling for operations that inherently impact
>> multiple servers (i.e. updates to the domain.xml configuration model
>> and any host.xml configuration.) Allow the end user to specify how
>> those changes are applied to various server groups and the servers
>> within those groups in a "management operation plan". After the
>> operation is executed, provide detailed information on how the
>> operation was executed on the various controllers and servers
>> involved.
> <snip>
>> * Management operation plans should include support for multiple
>> operations that update persistent configuration that are to be
>> performed as an atomic unit.
>>
>> - TODO: is it necessary to support plans
>> with multiple operations that are not atomic; i.e. where failure of
>> one operation will not cause the others to be rolled back? We
>> currently do, but this adds conceptual complexity and clutters the
>> API a bit.
>
> On the model level, no.  Either the whole update is applied or none of it.
>
> At runtime, I think that any part of a multi-step ("composite") update
> failing is equivalent to the whole update failing.  Also, any update, be
> it simple or composite, should only have one single plan which applies
> to the whole works (in other words, the plan for application and/or
> rollback may only be present on the top-most update entity which was
> submitted to the controller).
>
> This simplifies the whole issue.  I can say, "Run these 3 updates.  If
> they fail (at runtime), just keep it in there and I'll inspect and/or
> fix it by hand.", or "If they fail, roll them all back".  These are the
> only two sensible actions I can think of.
>

Yes, this is exactly what I intend to do. Single or multi-step updates 
must all apply successfully to the model on the DC and each HC, or they 
will be reverted. If they do apply successfully, they can be applied to 
each server. They must all apply to the model successfully on the 
server; if not they will be reverted. Whether failure to apply to the 
runtime triggers rollback on that server is controllable via a single 
boolean param.
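
To make the per-server behavior concrete, here is a minimal sketch of
the two phases, assuming a model and runtime that are just attribute
maps. Every name in it (ServerUpdateSketch, Step, Outcome,
rollbackOnRuntimeFailure) is hypothetical and not part of any AS7 API;
the two predicates stand in for whatever validation and runtime
services would really do.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch; none of these types are AS7 API.
final class ServerUpdateSketch {

    /** One step of a (possibly composite) update: set an attribute to a value. */
    static final class Step {
        final String attribute;
        final String value;

        Step(String attribute, String value) {
            this.attribute = attribute;
            this.value = value;
        }
    }

    enum Outcome { APPLIED, MODEL_REVERTED, RUNTIME_FAILED_KEPT, RUNTIME_FAILED_ROLLED_BACK }

    static Outcome apply(Map<String, String> model,
                         Map<String, String> runtime,
                         List<Step> steps,
                         Predicate<Step> modelAccepts,
                         Predicate<Step> runtimeAccepts,
                         boolean rollbackOnRuntimeFailure) {
        Map<String, String> modelBefore = new HashMap<>(model);
        Map<String, String> runtimeBefore = new HashMap<>(runtime);

        // Phase 1: every step must apply to the model, or the model is reverted.
        for (Step step : steps) {
            if (!modelAccepts.test(step)) {
                restore(model, modelBefore);
                return Outcome.MODEL_REVERTED;
            }
            model.put(step.attribute, step.value);
        }

        // Phase 2: push the steps to the runtime.
        for (Step step : steps) {
            if (!runtimeAccepts.test(step)) {
                if (!rollbackOnRuntimeFailure) {
                    // The model keeps the change; the runtime is left for manual repair.
                    return Outcome.RUNTIME_FAILED_KEPT;
                }
                restore(model, modelBefore);
                restore(runtime, runtimeBefore);
                return Outcome.RUNTIME_FAILED_ROLLED_BACK;
            }
            runtime.put(step.attribute, step.value);
        }
        return Outcome.APPLIED;
    }

    private static void restore(Map<String, String> target, Map<String, String> snapshot) {
        target.clear();
        target.putAll(snapshot);
    }
}
```

Note that in the RUNTIME_FAILED_KEPT case the model and the runtime
disagree afterwards, which is exactly the state the "inspect and/or fix
it by hand" option leaves the user in.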

The TODO above was about whether or not that single param should exist, 
or whether "If they fail, roll them all back" should be the only 
behavior. For now I'm sticking with leaving both "If they fail (at 
runtime), just keep it in there and I'll inspect and/or fix it by hand" 
and "If they fail, roll them all back" as options. But we need to think 
through what "I'll inspect and/or fix it by hand" really means. That is, 
does applying an update to the server's model but having it not 
reflected in the runtime allow any solution other than a server restart?[1]

What happens to other servers if a server needs to roll back is more 
configurable. At this point, I have it set up as follows:

1) At the server group level, users can configure how many servers (or 
what percentage of the total) can fail before all servers in the group 
are reverted.

2) Users can configure whether rollback of one server group triggers 
rollback of all the others. There are more possibilities than this 
simple boolean choice allows, but I prefer to keep it simple unless we 
find an easy, intuitive way to describe more complex options.
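
The two settings above could be evaluated roughly like this. This is
only a sketch: the class and parameter names (RolloutPolicySketch,
maxFailedServers, maxFailurePercentage, rollbackAcrossGroups) are
made up for illustration, and it assumes the threshold is the largest
number of failures still tolerated, so rollback starts once it is
exceeded.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the names are illustrative, not AS7 API.
final class RolloutPolicySketch {

    /** Setting 1, absolute form: revert the group once failures exceed the limit. */
    static boolean groupMustRollBack(int failedServers, int maxFailedServers) {
        return failedServers > maxFailedServers;
    }

    /** Setting 1, percentage form, using integer arithmetic to avoid rounding. */
    static boolean groupMustRollBack(int failedServers, int totalServers, int maxFailurePercentage) {
        // failed / total > pct / 100  is the same as  failed * 100 > pct * total
        return failedServers * 100 > maxFailurePercentage * totalServers;
    }

    /**
     * Setting 2: given which groups must roll back on their own account,
     * return every group that ends up being rolled back.
     */
    static Set<String> groupsToRollBack(Map<String, Boolean> groupFailed, boolean rollbackAcrossGroups) {
        boolean anyFailed = groupFailed.containsValue(Boolean.TRUE);
        Set<String> result = new LinkedHashSet<>();
        for (Map.Entry<String, Boolean> entry : groupFailed.entrySet()) {
            if (entry.getValue() || (rollbackAcrossGroups && anyFailed)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

With rollbackAcrossGroups set to false, only the groups that crossed
their own threshold are reverted; with it set to true, a single failed
group takes all the others with it.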

The Alpha1 domain deployment API also had another level, where what's 
described above is one level of the plan, and then a list of those could 
be grouped into the overall plan, with a param to trigger rollback of 
the whole lot. So, you could have some (single or multi-step) update 
that gets applied to the 2 server groups in your messaging tier. Then 
another that gets applied to the 2 server groups in your web tier. And 
then a flag to trigger rollback of the messaging stuff if the web stuff 
fails. I want to get rid of this level, as it complicates the more 
typical case, and put the responsibility for it on the client. Instead 
I'd prefer to have the result of the messaging tier update include the 
operation needed to revert the changes. The user can then use that 
information to revert the messaging tier change if the web tier change 
fails.
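
From the client's side, that could look something like the following
sketch. The types (Operation, Result, compensatingOperation) and the
attribute names are invented for illustration; the real result format
is not settled here. A tier is reduced to a plain attribute map, and a
boolean simulates failure.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these types are AS7 API.
final class CompensatingOperationSketch {

    static final class Operation {
        final String attribute;
        final String value; // null means "remove the attribute"

        Operation(String attribute, String value) {
            this.attribute = attribute;
            this.value = value;
        }
    }

    static final class Result {
        final boolean success;
        final Operation compensatingOperation; // null when nothing was changed

        Result(boolean success, Operation compensatingOperation) {
            this.success = success;
            this.compensatingOperation = compensatingOperation;
        }
    }

    /** Applies the operation and reports the operation that would undo it. */
    static Result execute(Map<String, String> tier, Operation op, boolean simulateFailure) {
        if (simulateFailure) {
            return new Result(false, null);
        }
        // Both put() and remove() return the previous value, or null if there was none.
        String previous = (op.value == null)
                ? tier.remove(op.attribute)
                : tier.put(op.attribute, op.value);
        return new Result(true, new Operation(op.attribute, previous));
    }

    public static void main(String[] args) {
        Map<String, String> messagingTier = new HashMap<>();
        Map<String, String> webTier = new HashMap<>();
        messagingTier.put("queue-depth", "100");

        Result messaging = execute(messagingTier, new Operation("queue-depth", "500"), false);
        Result web = execute(webTier, new Operation("max-connections", "200"), true);

        // The client, not the plan, decides to undo the messaging change.
        if (messaging.success && !web.success) {
            execute(messagingTier, messaging.compensatingOperation, false);
        }
        System.out.println(messagingTier);
    }
}
```

The cross-tier rollback logic lives entirely in the client's if block,
so the plan itself never needs to know about the relationship between
the two tiers.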

[1] If, as we've discussed, the update handler only uses the update 
parameters and the current runtime state when applying the update (and 
ignores the current values in the model), then some attempt at 
re-applying the update to the runtime is possible.
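
As an illustration of that property, here is a sketch of a handler
that looks only at the update parameters and the current runtime state.
The class and method names are hypothetical, and the runtime is again
reduced to an attribute map. Because the model is never consulted,
calling the handler a second time after a failed first attempt is
safe even though the model already holds the new value.

```java
import java.util.Map;

// Hypothetical sketch; not an AS7 update handler interface.
final class ReapplySketch {

    /**
     * Brings the runtime in line with the requested value.
     * Returns true if the runtime had to be changed, false if the
     * requested value was already in effect.
     */
    static boolean applyToRuntime(Map<String, String> runtime, String attribute, String value) {
        if (value.equals(runtime.get(attribute))) {
            return false;
        }
        runtime.put(attribute, value);
        return true;
    }
}
```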

-- 
Brian Stansberry
Principal Software Engineer
JBoss by Red Hat