[jboss-as7-dev] Detyped and REST configuration API

Mon Nov 29 23:18:49 EST 2010

Inline.

On 11/29/2010 05:15 PM, Brian Stansberry wrote:
> Apologies in advance for the long message. I want to kick off a
> discussion of the detyped and REST API for domain/host/server
> configuration.

Exciting!

> Much of the following will end up in a wiki, but I wanted
> some initial discussion first.
>
> Goals:
>
> A "detyped" Java version of the configuration API. Detyped in the sense
> that the entire model can be represented using a few general usage
> classes on the client classpath,
> a la the jboss-meta project used in AS 5/6.

I think it would be a good idea to add in, as a goal, "A way to fetch a 
description of the available model and update operations" (maybe by 
address).  In other words, I should be able to not only query what 
subsystems are available and what their properties are, but what the 
layout *means* and what updates are available and why.  This might allow 
us to generate some or all user interfaces dynamically.

(I see you talk about this down below so I'll postpone further 
discussion to that point).

> A REST version of the API. Even less strongly typed, since in the end
> clients will fundamentally be consuming the model as text (xml or json).
>
> Concepts:
>
> "Addressable" Model Elements
>
> An "addressable" model element is one that can be uniquely identified
> within its overall model.
>
> It either:
> ** is the only element of its type under its parent element (i.e.
> maxOccurs=1 in its schema)
> ** or has a single attribute whose value is unique among all elements of
> its type under its parent element

In other words, what we usually stuff into "name"?

I'd add one more option though: lists.  I should be able to address 
list-like things by absolute index starting from 0 without it needing to 
be an explicit attribute of the model element in question.

> and it is either
> ** the child of an addressable model element
> ** or is the root element of the model
>
> An addressable model element can easily be treated as a resource in a
> REST interface. A particular element can be represented in a URL path
> either via its element name or via a simple combination of its element
> name and the single attribute value.

In other words, something like:

http://localhost:8080/management/domain/profile/foo/subsystem/org.jboss.logging/root-logger/level 
?

> It's also easy to create a simple java object to represent the element's
> address for use in the "detyped" Java API. For a draft see:
>
> https://github.com/bstansberry/jboss-as/commit/36763f82e6d82631370adcc154193d02ddf4829c#diff-5
> https://github.com/bstansberry/jboss-as/commit/36763f82e6d82631370adcc154193d02ddf4829c#diff-3

I'm not sure it makes sense to make both the attribute name and value 
part of the address.  The attribute name that is used to identify a 
given child of another element should be defined on that element, 
otherwise I think it's going to be fairly error-prone.

Also as an aside, I'd make the address class(es) final & immutable just 
to avoid problems down the road.
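
As an illustration only, a final, immutable address along these lines might 
look like the sketch below.  Every name in it (ModelAddress, Step, child, 
toPath) is hypothetical and is not taken from the linked commits.  Each step 
holds an element type plus an optional key (a "name"-style value or a list 
index), so the identifying attribute's name is never part of the address.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not the address class from the linked draft commit.
final class ModelAddress {

    /** One path step: an element type plus an optional key. */
    static final class Step {
        final String type;
        final String key; // null when the element is the only one of its type

        Step(String type, String key) {
            if (type == null) {
                throw new IllegalArgumentException("type is required");
            }
            this.type = type;
            this.key = key;
        }
    }

    /** The root of the model; all other addresses are derived from it. */
    static final ModelAddress ROOT = new ModelAddress(Collections.<Step>emptyList());

    private final List<Step> steps;

    private ModelAddress(List<Step> steps) {
        this.steps = steps;
    }

    /** Child that is the only element of its type under this address. */
    ModelAddress child(String type) {
        return child(type, null);
    }

    /** Child identified by a unique value (what we usually call "name"). */
    ModelAddress child(String type, String key) {
        List<Step> copy = new ArrayList<Step>(steps);
        copy.add(new Step(type, key));
        return new ModelAddress(Collections.unmodifiableList(copy));
    }

    /** Child of a list-like element, addressed by absolute index from 0. */
    ModelAddress child(String type, int index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be >= 0");
        }
        return child(type, Integer.toString(index));
    }

    /** Renders the address as a URL-style path (no escaping is attempted). */
    String toPath() {
        StringBuilder sb = new StringBuilder();
        for (Step s : steps) {
            sb.append('/').append(s.type);
            if (s.key != null) {
                sb.append('/').append(s.key);
            }
        }
        return sb.toString();
    }
}
```

Because child() always returns a new instance, an address can be shared 
freely between threads and used as a base for deriving other addresses.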

> We should strive to have the vast majority of elements in the
> domain/host/server models be addressable. This makes it easy to have
> highly targeted updates; e.g. a REST PUT can specify the values for a
> few attributes in a single element rather than needing to describe a
> large portion of the model. So far we have largely met this goal.

Agreed 100%.  I'd even go so far as to forbid model elements which are 
not addressable in some way - such a thing does not make any sense.

> Updates via differencing:
>
> A model update consists of a client sending a detyped representation of
> a part of the model. The update will include all of that portion of the
> model; both modified and unmodified properties. The server compares that
> detyped representation to the current state of the model and generates
> the AbstractModelUpdate objects needed to bring the model up to the
> desired state.

I don't know if I agree with this approach.  This means that to do an 
update you have to do a read-modify-write, so there's a race condition. 
In addition, there are many update types which require simultaneous 
updating of fields which means that there is a larger class of invalid 
inputs which would need specialized error messages such as "if you 
update field blah, you also have to update field bzzt" and so forth.
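
To make the race concrete, here is a small deterministic sketch of the lost 
update that whole-element writes allow.  It is an illustration under 
assumptions: the class, the two fields ("level", "handler") and the 
"server makes its state match the submitted view" behaviour are stand-ins 
for the differencing approach, not code from the draft.

```java
import java.util.HashMap;
import java.util.Map;

// Deterministic illustration only; class and field names are made up.
final class LostUpdateDemo {

    private Map<String, String> serverState = new HashMap<>();

    LostUpdateDemo() {
        serverState.put("level", "INFO");
        serverState.put("handler", "CONSOLE");
    }

    /** Client read: a full copy of the element's current properties. */
    Map<String, String> read() {
        return new HashMap<>(serverState);
    }

    /** Whole-element write: the server makes its state match the view. */
    void writeWhole(Map<String, String> view) {
        serverState = new HashMap<>(view);
    }

    /** Interleaves two clients that each change a different field. */
    static Map<String, String> run() {
        LostUpdateDemo server = new LostUpdateDemo();

        Map<String, String> clientA = server.read();
        Map<String, String> clientB = server.read(); // read before A writes

        clientA.put("level", "DEBUG");
        server.writeWhole(clientA);

        clientB.put("handler", "FILE");
        server.writeWhole(clientB); // B's stale view reverts "level" to INFO

        return server.read();
    }
}
```

Client A's change to "level" is gone after B's write, even though B never 
touched that field.  A targeted update naming only "handler" would not have 
this problem.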

I'll grant that many of our updates are adds and removes.  But sometimes 
the address would be unknown until the add is complete (deployments for 
example); I think this general scenario (system-generated addresses) is 
perfectly valid.

And don't forget that some of our updates will be related to a model 
element but won't actually *update* that model element.  For example, 
"interrupt thread #5 from the pool defined by 
domain/subsystem/threads/pools/fred".  Or it might even be something 
like "interrupt thread #1234 via domain/subsystems/threads"; it might 
not relate to that specific an element.  Perhaps the updates themselves 
should be the combination of an update identifier with the address of 
any thing(s) being modified in the payload of the update itself.
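
A rough sketch of that shape, assuming hypothetical names throughout 
(UpdateRequest, "interrupt-thread", "thread-index" are invented for 
illustration and are not an existing AS 7 API): the update is an operation 
identifier, plus the address it relates to, plus an operation-specific 
payload.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names throughout; nothing here is an existing AS 7 API.
final class UpdateRequest {

    final String operation;           // which update to run
    final String address;             // the element the update relates to
    final Map<String, Object> params; // operation-specific payload

    UpdateRequest(String operation, String address, Map<String, Object> params) {
        this.operation = operation;
        this.address = address;
        // Defensive copy so the request stays immutable after construction.
        this.params = Collections.unmodifiableMap(
                new LinkedHashMap<String, Object>(params));
    }

    /** An update that relates to a pool but does not change its model. */
    static UpdateRequest interruptThread(String poolAddress, int threadIndex) {
        Map<String, Object> p = new LinkedHashMap<String, Object>();
        p.put("thread-index", threadIndex);
        return new UpdateRequest("interrupt-thread", poolAddress, p);
    }
}
```

Nothing in the request describes the resulting model state, so an operation 
like this never has to be disguised as a field change.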

Another prime example would be deployment.  A deployment plan contains 
model information (i.e. the deployment name and content) as well as 
non-model information (i.e. the plan itself).  It wouldn't be good to 
try to cram this into a pseudo-update of some sort of virtual fields or 
something.

I think that while we *can* support generic CRUD-y/diff-based update 
operations, at the very least we should support special-purpose updates 
as well, maybe even in preference to simple CRUD updates.  We do know 
from experience that diff-based updates can get pretty complex.

> See the DetypedModelElement class for a draft detyped representation of
> an addressable model element:
>
> https://github.com/bstansberry/jboss-as/commit/36763f82e6d82631370adcc154193d02ddf4829c#diff-2
>
> At this point I used the CompositeValue class from AS 5/6 jboss-meta
> library for representing the element's properties. That's certainly open
> for discussion.
>
> "Id-Only" Elements
>
> The detyped representation of a part or all of the model can include
> "id-only" elements. These represent addressable elements but only
> include the information necessary to determine the element address, plus
> a flag indicating the element is "id-only". These serve as placeholders
> within a representation of a larger portion of the model. The purpose of
> an id-only element is to allow the client to avoid dealing with
> uninteresting portions of the model state, while still being aware of
> their existence. They also allow the server to distinguish an update
> that removes an element from one that doesn't modify it at all, without
> forcing the client to send the entire state of the unmodified element.
>
> How useful this "id-only" concept will be, I'm not sure. The main use
> cases I can see are, for reads, being able to read a client-specified
> number of levels of the model without having to consume the whole thing
> (imagine a huge domain). For writes the main use case would be an update
> that touches a few parts of the model (e.g. add an extension, the
> subsystem config for it, a new thread pool for it, a new socket binding
> for it) without having to specify all the unrelated stuff in between.

We'll see: if an element is fully addressed from the outset, it 
shouldn't be necessary to provide a "skeleton" because the structure 
wouldn't be needed as it is not ambiguous what an element is or where it 
goes.

In other words I might ask for all deployments in server group X, and 
I'd get back a list of fully addressed model elements.  Or I might ask 
for a specific configuration of a connector in a profile and get back 
just that connector, but fully addressed - it wouldn't be rooted in the 
domain.
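
One way to picture this, purely as a sketch with invented names 
(FlatModelStore, put, query) and plain string addresses: elements are stored 
flat, keyed by their full address, and a query returns every element at or 
below a given address without any surrounding skeleton.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; addresses are plain strings to keep it short.
final class FlatModelStore {

    // Full address -> detyped properties of the element at that address.
    private final TreeMap<String, Map<String, Object>> elements =
            new TreeMap<String, Map<String, Object>>();

    void put(String address, Map<String, Object> properties) {
        elements.put(address, Collections.unmodifiableMap(
                new LinkedHashMap<String, Object>(properties)));
    }

    /** Returns every element at or below the address, keyed by full address. */
    Map<String, Map<String, Object>> query(String address) {
        Map<String, Map<String, Object>> result =
                new LinkedHashMap<String, Map<String, Object>>();
        for (Map.Entry<String, Map<String, Object>> e : elements.entrySet()) {
            String candidate = e.getKey();
            // Match on whole path segments so "X" does not also match "XY".
            if (candidate.equals(address) || candidate.startsWith(address + "/")) {
                result.put(candidate, e.getValue());
            }
        }
        return result;
    }
}
```

Each result entry carries its own complete address, so the client knows 
where every element belongs without receiving the structure above it.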

> Issues:
>
> Meta-information about Configuration Elements:
>
> Where should this be stored? Basic things like the primitive types of
> various properties, plus more complex things like whether changing a
> property requires a restart to take effect. The "detyped" Java version
> can include a fair amount of meta-information, but this isn't readily
> available via REST. Much of this information also can be (and is) stored
> in the domain model schemas themselves, but this forces clients to
> access and interpret the schemas to obtain the information. Putting
> things like whether a property requires restart to take effect in the
> various domain model schemas would also create a tighter coupling
> between the schemas; i.e. they should all use the same mechanism to
> indicate this, which would imply some common included schema that
> defines that mechanism.

For updates at least, there is a logical "place" for it: whatever 
registers the ability to handle a specific kind of update could also be 
required to specify the exact description for the operation (which could 
possibly even be used for validation purposes, though past experience 
demonstrates that it is unlikely that we'd be able to provide a truly 
comprehensive description language for validation).  Continuing my earlier 
point, non-CRUD updates could easily carry info about whether they require 
a restart.  CRUD is trickier.  I'm not sure we can say, "changing this 
field requires a restart", especially if a field cannot be independently 
changed without changes to an adjacent related field.  And we can't 
really say "changing _any_ field on this element requires a restart", 
because that's just lame if only one field really does require a restart 
when changed.
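
For the non-CRUD case, a minimal sketch of "whoever registers the handler 
must also supply the description" could look like this.  All names here 
(OperationRegistry, Description, Handler, requiresRestart) are assumptions 
made for illustration; no such registry exists in the draft code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical registry; all names are invented for illustration.
final class OperationRegistry {

    /** What a client can learn about an operation before invoking it. */
    static final class Description {
        final String summary;
        final boolean requiresRestart;

        Description(String summary, boolean requiresRestart) {
            this.summary = summary;
            this.requiresRestart = requiresRestart;
        }
    }

    interface Handler {
        void execute(Map<String, Object> params);
    }

    private final Map<String, Description> descriptions =
            new LinkedHashMap<String, Description>();
    private final Map<String, Handler> handlers =
            new LinkedHashMap<String, Handler>();

    /** A handler cannot be registered without its description. */
    void register(String operation, Description description, Handler handler) {
        if (description == null || handler == null) {
            throw new IllegalArgumentException("description and handler are required");
        }
        if (handlers.containsKey(operation)) {
            throw new IllegalStateException("already registered: " + operation);
        }
        descriptions.put(operation, description);
        handlers.put(operation, handler);
    }

    /** Returns the description, or null if the operation is unknown. */
    Description describe(String operation) {
        return descriptions.get(operation);
    }

    void execute(String operation, Map<String, Object> params) {
        Handler handler = handlers.get(operation);
        if (handler == null) {
            throw new IllegalArgumentException("unknown operation: " + operation);
        }
        handler.execute(params);
    }
}
```

The restart flag then belongs to the operation as a whole, which sidesteps 
the per-field question that makes CRUD updates awkward.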

For model elements, I think there are two parts to this.

The first part is a sort of implicit contract for common fields.  For 
example, if we have a common field called "name", we should always treat 
that field similarly regardless of where it appears as a matter of 
convention.  In this way, if there is no description available for a 
given model element and all you have is its data, then you can still do 
something reasonably intelligent with the field contents.

Secondly, we're really dealing with two "kinds" of model elements.  The 
first "kind" is a model element which is inherent in the core structure 
of the model, be it host, domain, or server.  The model description for 
these things should also be inherent in the bootstrap code somehow, i.e. 
not dynamic.

The second "kind" is subsystem configuration.  It would be reasonable to 
require the model description to be registered at the time the subsystem 
is initialized.

> Obtaining a DetypedModelElement
>
> This seems fairly straightforward. Any AbstractModelElement that is
> addressable should be able to expose its state as a DetypedModelElement.
> See AbstractAddressableModelElement for a rough implementation of that:

I see your addressable model element and raise you... to the base class. 
All elements should be addressable.

Take the next logical step though: what reason is there to even have a 
typed model element class?  The only thing you'd ever use it for is 
getting a detyped representation of it.  So why have so much duplication 
of code and data spread all over the place?  You could represent each 
model in a single class of logic plus a detyped element data blob just 
as effectively, and save the headache of conversion when someone 
requests a hunk of it.

> https://github.com/bstansberry/jboss-as/commit/36763f82e6d82631370adcc154193d02ddf4829c#diff-1
>
> Determining differences and generating AbstractModelUpdate objects
>
> If the client passes back a DetypedModelElement to the domain
> controller, we need logic to determine the difference between that
> detyped representation and the current model and then generate the
> needed update objects. A possibility is to have each
> AbstractAddressableModelElement be responsible for doing that. A problem
> I've seen with that is the class hierarchy for the update classes versus
> the model element classes is not clean.
> For example, a method like this in AbstractAddressableModelElement won't
> work:
>
> protected abstract void addUpdatesToMatchView(DetypedModelElement
> updatedView, List<AbstractModelElementUpdate<?>>  updates);
>
> It won't work because not all model elements are updated via
> AbstractModelElementUpdate; some are done via AbstractDomainModelUpdate,
> some via AbstractHostModelUpdate etc.

Make it all go away.  I think the detyped system should be the core of 
our model representation and update handling.  Let's keep the *concepts* 
we've established, but save ourselves the torture of trying to port the 
actual code of the Old Way on to the New Way like we've done multiple 
times in the past with sweeping update changes.

> Marshalling to/from XML/JSON
>
> The REST API requires the ability to marshall to/from whatever XML and
> JSON formats we use for exchanging data with clients. JSON of course is
> going to be different from anything we have right now. We need to think
> about the XML format. We could just provide and receive snippets that
> follow the same schema we use in the domain/host/standalone.xml
> documents themselves. However, there are problems with that. The
> "id-only" notion described above is not part of those schemas. Any Atom
> links we include in response documents (see Chapter 9 in Bill Burke's
> RESTful Java book) will violate the schema as well. As will any metadata
> we decide to include. We may be better off with a more generic
> representation that more directly maps to something like
> DetypedModelElement.

Again by using a single class of logic plus a detyped blob for each 
model, we make the [un]marshalling problem a lot simpler.  Also let me 
make sure that my stance on replacing chunks of the model is clear: I'm 
against it.  Too much hidden complexity.

-- 
- DML