[jboss-as7-dev] Detyped and REST configuration API

Tue Nov 30 01:03:21 EST 2010

On 11/29/10 10:18 PM, David M. Lloyd wrote:
> Inline.
>
> On 11/29/2010 05:15 PM, Brian Stansberry wrote:
>> Apologies in advance for the long message. I want to kick off a
>> discussion of the detyped and REST API for domain/host/server
>> configuration.
>
> Exciting!
>
>> Much of the following will end up in a wiki, but I wanted
>> some initial discussion first.
>>
>> Goals:
>>
>> A "detyped" Java version of the configuration API. Detyped in the sense
>> that the entire model can be represented using a few general usage
>> classes on the client classpath,
>> a la the jboss-meta project used in AS 5/6.
>
> I think it would be a good idea to add in, as a goal, "A way to fetch a
> description of the available model and update operations" (maybe by
> address).  In other words, I should be able to not only query what
> subsystems are available and what their properties are, but what the
> layout *means* and what updates are available and why.  This might allow
> us to be able to generate some or all user interfaces dynamically.
>

I'd like to hear from Heiko how useful he thinks this would be to him. 
(Not opposed; I just want to hear from a consumer of our management APIs.)

> (I see you talk about this down below so I'll postpone further
> discussion to that point).
>
>> A REST version of the API. Even less strongly typed, since in the end
>> clients will fundamentally be consuming the model as text (xml or json).
>>
>> Concepts:
>>
>> "Addressable" Model Elements
>>
>> An "addressable" model element is one that can be uniquely identified
>> within its overall model.
>>
>> It either:
>> ** is the only element of its type under its parent element (i.e.
>> maxOccurs=1 in its schema)
>> ** or has a single attribute whose value is unique among all elements of
>> its type under its parent element
>
> In other words, what we usually stuff into "name"?
>

Yes.

> I'd add one more option though: lists.  I should be able to address
> list-like things by absolute index starting from 0 without it needing to
> be an explicit attribute of the model element in question.
>

OK.

>> and it is either
>> ** the child of an addressable model element
>> ** or is the root element of the model
>>
>> An addressable model element can easily be treated as a resource in a
>> REST interface. A particular element can be represented in a URL path
>> either via its element name or via a simple combination of its element
>> name and the single attribute value.
>
> In other words, something like:
>
> http://localhost:8080/management/domain/profile/foo/subsystem/org.jboss.logging/root-logger/level
> ?
>

Yes.

>> It's also easy to create a simple java object to represent the element's
>> address for use in the "detyped" Java API. For a draft see:
>>
>> https://github.com/bstansberry/jboss-as/commit/36763f82e6d82631370adcc154193d02ddf4829c#diff-5
>> https://github.com/bstansberry/jboss-as/commit/36763f82e6d82631370adcc154193d02ddf4829c#diff-3
>
> I'm not sure it makes sense to make both the attribute name and value
> part of the address.  The attribute name that is used to identify a
> given child of another element should be defined on that element,
> otherwise I think it's going to be fairly error-prone.
>

Yeah, the name isn't needed. It makes a more informative toString(), but 
that's not all that important.

> Also as an aside, I'd make the address class(es) final & immutable just
> to avoid problems down the road.
>

Thanks. I stole ElementAddress from a similar class in JBossCache and 
didn't notice the fields weren't final. And +1 on making the classes final.
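
To make that concrete, here is a rough sketch of what a final, immutable 
address could look like once the attribute name is dropped. It is only an 
illustration: SketchAddress and its methods are hypothetical names, not 
the ElementAddress class in the commit linked above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; SketchAddress is NOT the ElementAddress from the linked commit.
final class SketchAddress {

    // Path segments, e.g. ["domain", "profile", "foo"]. Never modified after construction.
    private final List<String> segments;

    private SketchAddress(List<String> segments) {
        this.segments = Collections.unmodifiableList(new ArrayList<String>(segments));
    }

    // Address of the root element of a model.
    static SketchAddress root(String elementName) {
        return new SketchAddress(Collections.singletonList(elementName));
    }

    // Child that is the only element of its type under its parent.
    SketchAddress child(String elementName) {
        List<String> copy = new ArrayList<String>(segments);
        copy.add(elementName);
        return new SketchAddress(copy);
    }

    // Child identified by the value of its unique attribute; the attribute's
    // name is deliberately not part of the address.
    SketchAddress child(String elementName, String id) {
        return child(elementName).child(id);
    }

    // Render the address as a URL path.
    String toPath() {
        StringBuilder sb = new StringBuilder();
        for (String segment : segments) {
            sb.append('/').append(segment);
        }
        return sb.toString();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SketchAddress && segments.equals(((SketchAddress) o).segments);
    }

    @Override
    public int hashCode() {
        return segments.hashCode();
    }

    @Override
    public String toString() {
        return toPath();
    }
}
```

With this, root("domain").child("profile", "foo").child("subsystem", 
"org.jboss.logging").child("root-logger").child("level") renders the path 
portion of the example URL above, and every call returns a new instance 
rather than modifying the parent address.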

>> We should strive to have the vast majority of elements in the
>> domain/host/server models be addressable. This makes it easy to have
>> highly targeted updates; e.g. a REST PUT can specify the values for a
>> few attributes in a single element rather than needing to describe a
>> large portion of the model. So far we have largely met this goal.
>
> Agreed 100%.  I'd even go so far as to forbid model elements which are
> not addressable in some way - such a thing does not make any sense.
>

I'd thought I'd seen some in the logging subsystem config, but looking 
again, I think those were actually maxOccurs=1 elements.

I'd like to forbid non-addressable elements as well.

>> Updates via differencing:
>>
>> A model update consists of a client sending a detyped representation of
>> a part of the model. The update will include all of that portion of the
>> model; both modified and unmodified properties. The server compares that
>> detyped representation to the current state of the model and generates
>> the AbstractModelUpdate objects needed to bring the model up to the
>> desired state.
>
> I don't know if I agree with this approach.  This means that to do an
> update you have to do a read-modify-write, so there's a race condition.

In most cases there's inherently a race condition. The user has some 
idea of the state of the system which may be outdated, and then tries to 
perform an update. In a REST system this is handled with ETag headers 
and If-Match headers.
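
As a minimal sketch of the check behind those headers (assuming a simple 
version counter as the ETag; VersionedElement is a made-up class, not 
anything in our codebase):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the check a server would make for an If-Match header.
final class VersionedElement {

    private final Map<String, String> attributes = new HashMap<String, String>();
    private long version; // bumped on every successful update

    // Value the server would send back in the ETag response header.
    synchronized String etag() {
        return "\"" + version + "\"";
    }

    // Snapshot of the current attribute values.
    synchronized Map<String, String> read() {
        return new HashMap<String, String>(attributes);
    }

    // Apply the changes only if the caller saw the current version.
    // Returning false corresponds to answering 412 Precondition Failed.
    synchronized boolean updateIfMatch(String ifMatch, Map<String, String> changes) {
        if (!etag().equals(ifMatch)) {
            return false;
        }
        attributes.putAll(changes);
        version++;
        return true;
    }
}
```

A client that read the element, got an ETag, and then lost a race with 
another writer would have its update rejected and would need to re-read 
before trying again.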

>    In addition, there are many update types which require simultaneous
> updating of fields which means that there is a larger class of invalid
> inputs which would need specialized error messages such as "if you
> update field blah, you also have to update field bzzt" and so forth.
>

How is that avoidable with a detyped API? On the client side you have no 
update class to enforce preconditions.

> I'll grant that many of our updates are adds and removes.  But sometimes
> the address would be unknown until the add is complete (deployments for
> example); I think this general scenario (system-generated addresses) is
> perfectly valid.
>

That's a typical REST scenario; you add a new child to a parent resource 
and you get back the URL of that child, which tells you its id. That 
doesn't seem like a hard thing.
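
Roughly like this, as a sketch only (ChildCollection and the 
"/domain/deployments" path in the note below are invented for 
illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a parent resource that assigns ids to new children.
final class ChildCollection {

    private final String parentPath;
    private final Map<String, String> children = new LinkedHashMap<String, String>();
    private int nextId = 1;

    ChildCollection(String parentPath) {
        this.parentPath = parentPath;
    }

    // Roughly what handling a POST to the parent would do: store the child
    // under a system-generated id and return the path the client should use
    // from then on (the value of the Location header in a 201 response).
    synchronized String add(String content) {
        String id = String.valueOf(nextId++);
        children.put(id, content);
        return parentPath + "/" + id;
    }

    // Look a child up again by the path returned from add(); null if unknown.
    synchronized String get(String path) {
        String prefix = parentPath + "/";
        if (!path.startsWith(prefix)) {
            return null;
        }
        return children.get(path.substring(prefix.length()));
    }
}
```

So new ChildCollection("/domain/deployments").add(...) hands back a path 
ending in the generated id, and the client never has to know the address 
before the add completes.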

> And don't forget that some of our updates will be related to a model
> element but won't actually *update* that model element.  For example,
> "interrupt thread #5 from the pool defined by
> domain/subsystem/threads/pools/fred".  Or it might even be something
> like "interrupt thread #1234 via domain/subsystems/threads"; it might
> not relate to that specific an element.  Perhaps the updates themselves
> should be the combination of an update identifier with the address of
> any thing(s) being modified in the payload of the update itself.
>

Good; the topic of this thread was "configuration API" but bringing in 
discussion of non-configuration management operations is a good thing. 
Management operations always seemed more RPC-ish and less RESTful.

In general, my concern about the approach you're outlining is it feels 
RPCish and not RESTful. I'm actually more comfortable with RPC, but we 
have a requirement to provide a REST interface and I'm concerned about 
straying so far from REST principles that the REST interface becomes an 
unmaintainable hack.

> Another prime example would be deployment.  A deployment plan contains
> model information (i.e. the deployment name and content) as well as
> non-model information (i.e. the plan itself).  It wouldn't be good to
> try to cram this into a pseudo-update of some sort of virtual fields or
> something.
>

I see a deployment plan (or any updates that want deployment-plan-like 
behavior) as being a resource that gets created, manipulated and then 
can be referred back to (e.g. to obtain results).

> I think that while we *can* support generic CRUD-y/diff-based update
> operations, at the very least we should support special-purpose updates
> as well, maybe even in preference to simple CRUD updates.  We do know
> from experience that diff-based updates can get pretty complex.
>
>> See the DetypedModelElement class for a draft detyped representation of
>> an addressable model element:
>>
>> https://github.com/bstansberry/jboss-as/commit/36763f82e6d82631370adcc154193d02ddf4829c#diff-2
>>
>> At this point I used the CompositeValue class from AS 5/6 jboss-meta
>> library for representing the element's properties. That's certainly open
>> for discussion.
>>
>> "Id-Only" Elements
>>
>> The detyped representation of a part or all of the model can include
>> "id-only" elements. These represent addressable elements but only
>> include the information necessary to determine the element address, plus
>> a flag indicating the element is "id-only". These serve as placeholders
>> within a representation of a larger portion of the model. The purpose of
>> an id-only element is to allow the client to avoid dealing with
>> uninteresting portions of the model state, while still being aware of
>> their existence. They also allow the server to distinguish an update
>> that removes an element from one that doesn't modify it at all, without
>> forcing the client to send the entire state of the unmodified element.
>>
>> How useful this "id-only" concept will be, I'm not sure. The main use
>> cases I can see are, for reads, being able to read a client-specified
>> number of levels of the model without having to consume the whole thing
>> (imagine a huge domain). For writes the main use case would be an update
>> that touches a few parts of the model (e.g. add an extension, the
>> subsystem config for it, a new thread pool for it, a new socket binding
>> for it) without having to specify all the unrelated stuff in between.
>
> We'll see: if an element is fully addressed from the outset, it
> shouldn't be necessary to provide a "skeleton" because the structure
> wouldn't be needed as it is not ambiguous what an element is or where it
> goes.
>
> In other words I might ask for all deployments in server group X, and
> I'd get back a list of fully addressed model elements.  Or I might ask
> for a specific configuration of a connector in a profile and get back
> just that connector, but fully addressed - it wouldn't be rooted in the
> domain.
>

Or you might ask for all the subsystems in a profile, but not want the 
full details; just their "ids". Granted, that could be handled as a 
separate query, a sort of "getChildIds()" instead of "getChildren()". 
For reads I saw an id-only element as a placeholder for children, not 
for uninteresting parents. As you say, there is no need for 
uninteresting parents; that's what the address is for.
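
Here is a small sketch of the two read styles, assuming a very simplified 
detyped node (SketchNode is hypothetical and much cruder than the 
DetypedModelElement draft):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical detyped node; not the DetypedModelElement from the linked commit.
final class SketchNode {

    final String id;
    final boolean idOnly;
    final Map<String, String> properties = new LinkedHashMap<String, String>();
    final List<SketchNode> children = new ArrayList<SketchNode>();

    SketchNode(String id, boolean idOnly) {
        this.id = id;
        this.idOnly = idOnly;
    }

    // Copy this node with full detail for `depth` levels of children below it;
    // anything deeper is replaced by an id-only placeholder.
    SketchNode read(int depth) {
        if (depth < 0) {
            return new SketchNode(id, true);
        }
        SketchNode copy = new SketchNode(id, false);
        copy.properties.putAll(properties);
        for (SketchNode child : children) {
            copy.children.add(child.read(depth - 1));
        }
        return copy;
    }

    // The "getChildIds()" style of query: ids only, no placeholders needed.
    List<String> childIds() {
        List<String> ids = new ArrayList<String>();
        for (SketchNode child : children) {
            ids.add(child.id);
        }
        return ids;
    }
}
```

Calling read(0) on a profile returns the profile's own properties plus 
id-only placeholders for each subsystem, while childIds() gives the same 
information without any placeholder concept at all.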

For writes, the complex update case I describe above can be handled by 
making the client provide multiple targeted updates (add extension, add 
thread pool, add socket binding, add subsystem) wrapped in an atomic 
update plan.
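
A rough sketch of the all-or-nothing part, assuming the model is reduced 
to a map from address to value (PlanSketch and its Update interface are 
hypothetical, and a real plan would also have to undo runtime effects, 
not just model changes):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the model is reduced to a map from address to value.
final class PlanSketch {

    // One targeted update. Throws if it cannot be applied.
    interface Update {
        void apply(Map<String, String> model);
    }

    // An update that adds a new element and fails if the address is taken.
    static Update add(final String address, final String value) {
        return new Update() {
            public void apply(Map<String, String> model) {
                if (model.containsKey(address)) {
                    throw new IllegalStateException("already exists: " + address);
                }
                model.put(address, value);
            }
        };
    }

    private Map<String, String> model = new HashMap<String, String>();

    // Apply every update to a working copy and publish the copy only if all
    // of them succeed, so a failed plan leaves the model untouched.
    synchronized boolean execute(List<Update> plan) {
        Map<String, String> working = new HashMap<String, String>(model);
        try {
            for (Update update : plan) {
                update.apply(working);
            }
        } catch (RuntimeException e) {
            return false;
        }
        model = working;
        return true;
    }

    // Copy of the current model state.
    synchronized Map<String, String> snapshot() {
        return new HashMap<String, String>(model);
    }
}
```

If the third of four adds fails, none of the four is visible afterwards, 
which is the behavior I'd expect from such a plan.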

Anyway, like I said I'm not sure the concept is useful; if not I'm happy 
to chuck it as it adds complication.

I'm sleepy so I'll respond to the rest of this tomorrow (except to say 
that I like your point about getting rid of the typed element class). 
Thanks for the input.

>> Issues:
>>
>> Meta-information about Configuration Elements:
>>
>> Where should this be stored? Basic things like the primitive types of
>> various properties, plus more complex things like whether changing a
>> property requires a restart to take effect. The "detyped" Java version
>> can include a fair amount of meta-information, but this isn't readily
>> available via REST. Much of this information also can be (and is) stored
>> in the domain model schemas themselves, but this forces clients to
>> access and interpret the schemas to obtain the information. Putting
>> things like whether a property requires restart to take effect in the
>> various domain model schemas would also create a tighter coupling
>> between the schemas; i.e. they should all use the same mechanism to
>> indicate this, which would imply some common included schema that
>> defines that mechanism.
>
> For updates at least, there is a logical "place" for it: whatever
> registers the ability to handle a specific kind of update could also be
> required to specify the exact description for the operation (which could
> possibly even be used for validation purposes, though past experience
> demonstrates that it is unlikely that we'd be able to provide a truly
> comprehensive description language for validation).  In my earlier vein,
> non-CRUD updates could easily contain info about whether they require a
> restart.  CRUD is trickier.  I'm not sure we can say, "changing this
> field requires a restart", especially if a field cannot be independently
> changed without changes to an adjacent related field.  And we can't
> really say "changing _any_ field on this element requires a restart",
> because that's just lame if only one field really does require a restart
> when changed.
>
> For model elements, I think there's two parts to this.
>
> The first part is a sort of implicit contract for common fields.  For
> example, if we have a common field called "name", we should always treat
> that field similarly regardless of where it appears as a matter of
> convention.  In this way, if there is no description available for a
> given model element and all you have is its data, then you can still do
> something reasonably intelligent with the field contents.
>
> Secondly, we're really dealing with two "kinds" of model elements.  The
> first "kind" is a model element which is inherent in the core structure
> of the model, be it host, domain, or server.  The model description for
> these things should also be inherent in the bootstrap code somehow, i.e.
> not dynamic.
>
> The second "kind" is subsystem configuration.  It would be reasonable to
> require the model description to be registered at the time the subsystem
> is initialized.
>
>> Obtaining a DetypedModelElement
>>
>> This seems fairly straightforward. Any AbstractModelElement that is
>> addressable should be able to expose its state as a DetypedModelElement.
>> See AbstractAddressableModelElement for a rough implementation of that:
>
> I see your addressable model element and raise you... to the base class.
>    All elements should be addressable.
>
> Take the next logical step though: what reason is there to even have a
> typed model element class?  The only thing you'd ever use it for is
> getting a detyped representation of it.  So why have so much duplication
> of code and data spread all over the place?  You could represent each
> model in a single class of logic plus a detyped element data blob just
> as effectively, and save the headache of conversion when someone
> requests a hunk of it.
>

That sounds good, but I'd better sleep on it. :-)

>> https://github.com/bstansberry/jboss-as/commit/36763f82e6d82631370adcc154193d02ddf4829c#diff-1
>>
>> Determining differences and generating AbstractModelUpdate objects
>>
>> If the client passes back a DetypedModelElement to the domain
>> controller, we need logic to determine the difference between that
>> detyped representation and the current model and then generate the
>> needed update objects. A possibility is to have each
>> AbstractAddressableModelElement be responsible for doing that. A problem
>> I've seen with that is the class hierarchy for the update classes versus
>> the model element classes is not clean.
>> For example, a method like this in AbstractAddressableModelElement won't
>> work:
>>
>> protected abstract void addUpdatesToMatchView(DetypedModelElement
>> updatedView, List<AbstractModelElementUpdate<?>> updates);
>>
>> It won't work because not all model elements are updated via
>> AbstractModelElementUpdate; some are done via AbstractDomainModelUpdate,
>> some via AbstractHostModelUpdate etc.
>
> Make it all go away.  I think the detyped system should be the core of
> our model representation and update handling.  Let's keep the *concepts*
> we've established, but save ourselves the torture of trying to port the
> actual code of the Old Way on to the New Way like we've done multiple
> times in the past with sweeping update changes.
>
>> Marshalling to/from XML/JSON
>>
>> The REST API requires the ability to marshall to/from whatever XML and
>> JSON formats we use for exchanging data with clients. JSON of course is
>> going to be different from anything we have right now. We need to think
>> about the XML format. We could just provide and receive snippets that
>> follow the same schema we use in the domain/host/standalone.xml
>> documents themselves. However, there are problems with that. The
>> "id-only" notion described above is not part of those schemas. Any Atom
>> links we include in response documents (see Chapter 9 in Bill Burke's
>> RESTful Java book) will violate the schema as well. As will any metadata
>> we decide to include. We may be better off with a more generic
>> representation that more directly maps to something like
>> DetypedModelElement.
>
> Again by using a single class of logic plus a detyped blob for each
> model, we make the [un]marshalling problem a lot simpler.  Also let me
> make sure that my stance on replacing chunks of the model is clear: I'm
> against.  Too much hidden complexity.
>

-- 
Brian Stansberry
Principal Software Engineer
JBoss by Red Hat