[jboss-as7-dev] Detyped API document feedback

Tue Jan 4 19:34:14 EST 2011

----- Original Message -----
> From: "Brian Stansberry" <brian.stansberry at redhat.com>
> To: jboss-as7-dev at lists.jboss.org
> Sent: Tuesday, January 4, 2011 1:17:24 PM
> Subject: Re: [jboss-as7-dev] Detyped API document feedback
> Responding to the simple bit first...
> 
> On 1/4/11 11:02 AM, David M. Lloyd wrote:
> > This commentary is in reference to version 2 of the detyped API
> > document
> > at:
> > http://community.jboss.org/wiki/AS7DetypedManagementAPI/version/2
> >
> <snip/>
> >
> > <trim>
> >> * Provide clean handling for operations that inherently impact
> >> multiple servers (i.e. updates to the domain.xml configuration
> >> model
> >> and any host.xml configuration.) Allow the end user to specify how
> >> those changes are applied to various server groups and the servers
> >> within those groups in a "management operation plan". After the
> >> operation is executed, provide detailed information on how the
> >> operation was executed on the various controllers and servers
> >> involved.
> > <snip>
> >> * Management operation plans should include support for multiple
> >> operations that update persistent configuration that are to be
> >> performed as an atomic unit.
> >>
> >> - TODO: is it necessary to support plans
> >> with multiple operations that are not atomic; i.e. where failure of
> >> one operation will not cause the others to be rolled back? We
> >> currently do, but this adds conceptual complexity and clutters the
> >> API a bit.
> >
> > On the model level, no. Either the whole update is applied or none
> > of it.
> >
> > At runtime, I think that any part of a multi-step ("composite")
> > update
> > failing is equivalent to the whole update failing. Also, any update,
> > be
> > it simple or composite, should only have one single plan which
> > applies
> > to the whole works (in other words, the plan for application and/or
> > rollback may only be present on the top-most update entity which was
> > submitted to the controller).
> >
> > This simplifies the whole issue. I can say, "Run these 3 updates. If
> > they fail (at runtime), just keep it in there and I'll inspect
> > and/or
> > fix it by hand.", or "If they fail, roll them all back". These are
> > the
> > only two sensible actions I can think of.
> >
> 
> Yes, this is exactly what I intend to do. Single or multi-step updates
> must all apply successfully to the model on the DC and each HC, or
> they
> will be reverted. If they do apply successfully, they can be applied
> to
> each server. They must all apply to the model successfully on the
> server; if not they will be reverted. Whether failure to apply to the
> runtime triggers rollback on that server is controllable via a single
> boolean param.
> 
> The TODO above was about whether or not that single param should
> exist,
> or whether "If they fail, roll them all back" should be the only
> behavior. For now I'm sticking with leaving both "If they fail (at
> runtime), just keep it in there and I'll inspect and/or fix it by
> hand"
> and "If they fail, roll them all back" as options. But we need to
> think
> through what "I'll inspect and/or fix it by hand" really means. That
> is,
> does applying an update to the server's model but having it not
> reflected in the runtime allow any solution other than a server
> restart?[1]
> 

I'm a little concerned about the failure scenario and the entire idea of "fix it by hand".

I think we would be better off if a failed update always rolled back.  Apart from simple mistakes like typos, a failure probably means the person misunderstands something at a fundamental level, so "fixing it by hand", whatever that means, may just mean trouble.

I think we would be much better off investing our time in developing meaningful feedback for the user in the event of a failure.  A UI could check for conflicts between different or incompatible configuration changes before submitting them, but the detyped API has no such luxury, so it needs to return a meaningful error that helps the user debug the failure.
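
To make the "always roll back, but explain why" idea concrete, here is a minimal Python sketch. It is not the AS7 implementation: `Step`, `Result`, and `apply_composite` are hypothetical names invented for illustration, and the model is assumed to be a plain dict. It applies the steps of a composite update in order, undoes the already-applied steps in reverse order on the first failure, and reports which step failed and why.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    """One step of a composite update (hypothetical structure)."""
    name: str
    apply: Callable[[dict], None]
    revert: Callable[[dict], None]


@dataclass
class Result:
    """Feedback returned to the caller (hypothetical structure)."""
    success: bool
    failed_step: Optional[str] = None
    reason: Optional[str] = None
    rolled_back: List[str] = field(default_factory=list)


def apply_composite(model: dict, steps: List[Step]) -> Result:
    """Apply all steps, or revert the applied ones and say what went wrong."""
    applied: List[Step] = []
    for step in steps:
        try:
            step.apply(model)
            applied.append(step)
        except Exception as exc:
            rolled_back = []
            # Undo in reverse order so later steps are reverted first.
            for done in reversed(applied):
                done.revert(model)
                rolled_back.append(done.name)
            return Result(False, step.name, str(exc), rolled_back)
    return Result(True)


def set_port(model: dict) -> None:
    model["port"] = 8080


def unset_port(model: dict) -> None:
    model.pop("port", None)


def misspelled_attribute(model: dict) -> None:
    raise ValueError("unknown attribute 'prot'")


model: dict = {}
result = apply_composite(model, [
    Step("set-port", set_port, unset_port),
    Step("set-prot", misspelled_attribute, lambda m: None),
])
```

In this run `result` names `set-prot` as the failed step, carries the error text as the reason, and lists `set-port` as rolled back, while `model` ends up unchanged. The sketch assumes a failing step leaves the model untouched; a real handler would also have to undo partial changes made by the step that failed.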

> What happens to other servers if a server needs to roll back is more
> configurable. At this point, I have it:
> 
> 1) At the server group level, users can configure how many servers (or
> what % of the total) can fail before all servers in the group are
> reverted.
> 
> 2) Users can configure whether rollback of one server group triggers
> rollback of all the others. There are more possibilities than this
> simple boolean choice allows, but I prefer to keep it simple unless we
> find an easy, intuitive way to describe more complex options.
> 
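The two settings above can be sketched in Python as follows. This is only an illustration under assumptions: the function and parameter names are hypothetical, each server's outcome is reduced to a boolean, and "can fail before" is read as "the group is reverted once the number of failures exceeds the limit".

```python
from typing import Dict, List, Optional


def group_must_revert(outcomes: List[bool],
                      max_failed: int = 0,
                      max_failed_pct: Optional[float] = None) -> bool:
    """True if too many servers in one group failed (False = failed server)."""
    failed = outcomes.count(False)
    if max_failed_pct is not None:
        # Compare as integers-times-100 to avoid float division surprises.
        return failed * 100 > max_failed_pct * len(outcomes)
    return failed > max_failed


def groups_to_revert(plan_outcomes: Dict[str, List[bool]],
                     limits: Dict[str, int],
                     rollback_across_groups: bool) -> List[str]:
    """Names of the server groups that end up reverted."""
    reverted = [group for group, outcomes in plan_outcomes.items()
                if group_must_revert(outcomes, limits.get(group, 0))]
    if reverted and rollback_across_groups:
        # The single boolean: one group rolling back takes all others along.
        return list(plan_outcomes)
    return reverted


outcomes = {"main-group": [True, False, True], "other-group": [True, True]}
tolerant = groups_to_revert(outcomes, {"main-group": 1}, False)
strict = groups_to_revert(outcomes, {"main-group": 0}, False)
strict_all = groups_to_revert(outcomes, {"main-group": 0}, True)
```

With one failed server in `main-group`, a limit of 1 reverts nothing, a limit of 0 reverts only `main-group`, and turning on the cross-group flag reverts both groups.
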
> The Alpha1 domain deployment API also had another level, where what's
> described above is one level of the plan, and then a list of those
> could
> be grouped into the overall plan, with a param to trigger rollback of
> the whole lot. So, you could have some (single or multi-step) update
> that gets applied to the 2 server groups in your messaging tier. Then
> another that gets applied to the 2 server groups in your web tier. And
> then a flag to trigger rollback of the messaging stuff if the web
> stuff
> fails. I want to get rid of this level, as it complicates the more
> typical case, and put the responsibility for it on the client. Instead
> I'd prefer to have the result of the messaging tier update include the
> operation needed to revert the changes. The user can then use that
> information to revert the messaging tier change if the web tier change
> fails.
> 
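A client-side version of the messaging tier / web tier example might look like the Python sketch below. Everything named here is an assumption for illustration: `execute` stands in for whatever call submits an operation to the controller, and the `success` and `revert_operation` keys are hypothetical names for "did it work" and "the operation needed to revert the changes".

```python
from typing import Any, Callable, Dict, List

Operation = Dict[str, Any]
OpResult = Dict[str, Any]


def update_tiers(execute: Callable[[Operation], OpResult],
                 messaging_op: Operation,
                 web_op: Operation) -> Dict[str, Any]:
    """Apply the messaging update, then the web update; if the web update
    fails, submit the revert operation returned by the messaging update."""
    messaging = execute(messaging_op)
    if not messaging["success"]:
        return {"messaging": messaging, "web": None,
                "messaging_reverted": False}
    web = execute(web_op)
    reverted = False
    if not web["success"]:
        execute(messaging["revert_operation"])
        reverted = True
    return {"messaging": messaging, "web": web,
            "messaging_reverted": reverted}


# A fake controller, used only to exercise the flow above.
calls: List[str] = []


def fake_execute(op: Operation) -> OpResult:
    calls.append(op["name"])
    if op["name"] == "update-web":
        return {"success": False, "revert_operation": None}
    return {"success": True,
            "revert_operation": {"name": "undo-" + op["name"]}}


outcome = update_tiers(fake_execute,
                       {"name": "update-messaging"},
                       {"name": "update-web"})
```

Here the fake controller fails the web tier update, so the client submits the revert operation it received from the messaging tier result. The extra plan level disappears from the API, at the cost of the client having to hold on to that result.
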
> 
> [1] If, as we've discussed, the update handler only uses the update
> parameters and the current runtime state when applying the update (and
> ignores the current values in the model), then some attempt at
> re-applying the update to the runtime is possible.
> 
> --
> Brian Stansberry
> Principal Software Engineer
> JBoss by Red Hat