[
https://jira.jboss.org/jira/browse/DNA-223?page=com.atlassian.jira.plugin...
]
Johnny Verhaeg commented on DNA-223:
------------------------------------
The XML Sequencer is indeed written with some incorrect assumptions in mind, in particular
by assuming that an XML document is always supported by a corresponding XML Schema. Some
of its behavior, while not directly supported by specification, is defensible and I'll
argue should remain with a few tweaks, and some is not. It's going to take quite a
bit to explain everything we should or could be doing and why, but here goes.
First let's just look at elements in XML documents that are not backed (via either the
"xsi:schemaLocation" attribute or some external means) by an XML Schema.
According to the XML Namespaces spec, all non-prefixed elements belong to the default
namespace, whether that namespace is specified (via xmlns='...') or not. This is
where the sequencer in its current state is definitely behaving incorrectly. In the
example above, "sourceA" and "sourceB" should both be part of the
default namespace, and should not inherit the "dna" namespace of their parent
("dna:sources"). We should definitely fix this.
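To make the element case concrete, here is a minimal Python sketch (not the sequencer's
code) that feeds a trimmed copy of Randall's example to the standard library's
namespace-aware ElementTree parser and lists the namespace it reports for each element.
The "split_name" helper is a hypothetical name introduced only for this illustration.
The document declares no default namespace, so the unprefixed elements should come back
with no namespace at all.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example document from the issue description.
DOC = """\
<dna:system xmlns:dna="http://www.jboss.org/dna"
            xmlns:jcr="http://www.jcp.org/jcr/1.0">
  <dna:sources>
    <sourceA name="Cars" retryLimit="3" />
    <sourceB name="Aircraft" />
  </dna:sources>
</dna:system>
"""


def split_name(qualified):
    """Split ElementTree's '{uri}local' form into (uri or None, local)."""
    if qualified.startswith("{"):
        uri, local = qualified[1:].split("}", 1)
        return uri, local
    return None, qualified


root = ET.fromstring(DOC)

# Walk the tree in document order and record each element's namespace.
element_names = [split_name(el.tag) for el in root.iter()]

for uri, local in element_names:
    print(local, "->", uri)
```

The parser reports "system" and "sources" in the DNA namespace, and "sourceA" and
"sourceB" with no namespace, which is the result the sequencer should match.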
The attributes are a bit of a different story. According to the spec, as Randall
indicates above, non-prefixed attributes don't "automatically" get assigned
any particular namespace - it's left up to interpretation "determined by the
element", which pretty much means it depends upon the system that defines and/or
consumes the element. All XML entities will eventually get associated with what I'll
call an "effective" namespace. Within the scope of considering XML documents
alone, I will argue the "intuitive" expectation would be that the
attributes' namespaces are inherited from their containing element. But the problem
here is if a document creator wants an attribute to explicitly belong to a different
namespace that happens to also be the default namespace. Looking at Randall's example
again, let's assume for the moment that the "sourceA" and
"sourceB" elements are both part of the DNA namespace (and prefixed
accordingly). There would be no way to force "retryLimit" (as strange as the
use-case may seem) to be part of the default namespace. The user would be forced to
declare a prefix that refers to the namespace for "retryLimit" and prefix the
attribute accordingly.
Now enter the world of XML Schemas. Again, let's first look at elements. The XSD
spec introduces a scenario, which unfortunately is fairly common, that basically violates
the behavior defined by the XML Namespaces spec. If all of the following are true:
  * An element is defined locally within a schema as part of a complex type
    definition (i.e., it does not reference a globally defined element, meaning an
    element definition whose parent element is a <schema> element)
  * The schema defines a "targetNamespace"
  * The "effective" form of the element is "unqualified"
  * The target namespace is *not* declared as the default namespace in the XML
    document (i.e., it's declared with a prefix)
then when that element appears within an XML document without a prefix, its namespace is
still the target namespace. The current implementation of the sequencer incorrectly assumes
this scenario always applies, which, as I stated above, should be corrected. However, the
scenario should still be supported when the given conditions hold. To do so (in a post-0.2
release), we'd have to enhance the sequencer to recognize when a schema is in effect,
load and parse that schema, determine whether these conditions all exist, and if so, apply
the target namespace to the element(s) in question. The good news is whether we provide
this enhancement or not, at least we don't have any ambiguous scenarios.
Finally, let's look at attributes backed by a schema. The same rules apply here as
with elements, which in this case means we actually have some situations where we know
what the phrase in the XML Namespaces spec, "determined by the element", means.
That's the good news. The bad news is that whereas we might choose a default behavior when
we don't have a schema, such as always making non-prefixed attributes part of the
default namespace, we can't necessarily do that when a schema is present and one or
more of the conditions listed above *doesn't* apply - we now have
an ambiguous scenario with no definitive answer.
Now let's talk a little about user expectations, using the XML Schema of Schema (SoS)
as one of our examples. The SoS is not self-defining, meaning it isn't backed by
itself as its own schema (it is defined partially in terms of a DTD). If we look, for
instance, at the SoS definition for the "schema" element, we see its definition
specifies an attribute called "targetNamespace". So, a typical XML Schema
document (which is just a particular type of XML document) might contain
the following element as its root:
<xs:schema
    targetNamespace="http://www.jboss.org/dna"
    xmlns:dna="http://www.jboss.org/dna"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
What would you expect the namespace of the "targetNamespace" attribute to be?
Randall and I both agree that it seems to make sense it would be part of the XML Schema
namespace (referred to by the "xs" prefix in this case). In other words, we
would expect the attribute to inherit its "effective" namespace from its
containing element. In fact, in all documents this is what I presume will be the
"interpretation...as determined by the element...".
So what do we do? Keeping in mind the expectation I just discussed, I suggest that we
provide configurable options for the sequencer to drive its effective namespace
resolution behavior for non-prefixed attributes, with these available options:
  1. Always use the default namespace
  2. Always inherit the namespace from the containing element
  3. A "mixed-mode" option that inherits the namespace from the containing
     element only when no default namespace is specified, and uses the default
     namespace when one is specified
where option 3, the "mixed-mode" option, is the default behavior. This last
option allows for the expectations we have for the commonly encountered scenarios
*and* for the "retryLimit" attribute in Randall's example
to be defined as belonging explicitly to a default namespace, while still adhering to both
the XML Schema and XML Namespaces specs. These options would apply regardless of whether
or not we've enhanced the sequencer to be "schema-aware" (as suggested above
as a post-0.2 feature enhancement).
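As a rough illustration of the three options, here is a minimal Python sketch of the
resolution rule. It is not the sequencer's code: the names "AttributeNamespaceMode" and
"resolve_attribute_namespace" are hypothetical, a namespace is represented as a URI
string, and None stands for "no namespace" (or "no default namespace declared").

```python
from enum import Enum
from typing import Optional


class AttributeNamespaceMode(Enum):
    """Hypothetical names for the three proposed options."""
    DEFAULT = "default"   # option 1: always use the default namespace
    INHERIT = "inherit"   # option 2: always inherit from the containing element
    MIXED = "mixed"       # option 3: the proposed default behavior


def resolve_attribute_namespace(
    mode: AttributeNamespaceMode,
    element_ns: Optional[str],
    default_ns: Optional[str],
) -> Optional[str]:
    """Return the effective namespace of a non-prefixed attribute.

    element_ns is the namespace of the containing element; default_ns is the
    default namespace in scope, or None when none is declared.
    """
    if mode is AttributeNamespaceMode.DEFAULT:
        return default_ns
    if mode is AttributeNamespaceMode.INHERIT:
        return element_ns
    if mode is AttributeNamespaceMode.MIXED:
        # Inherit only when the document declares no default namespace.
        return default_ns if default_ns is not None else element_ns
    raise ValueError(f"unknown mode: {mode!r}")


XS = "http://www.w3.org/2001/XMLSchema"
DNA = "http://www.jboss.org/dna"
EXAMPLE_DEFAULT = "urn:example:default"  # made-up default namespace URI

# "targetNamespace" on <xs:schema>, no default namespace declared.
schema_case = resolve_attribute_namespace(AttributeNamespaceMode.MIXED, XS, None)

# "retryLimit" on a dna-prefixed element, with a default namespace declared.
retry_case = resolve_attribute_namespace(
    AttributeNamespaceMode.MIXED, DNA, EXAMPLE_DEFAULT
)

print(schema_case)
print(retry_case)
```

Under these assumptions, mixed mode places "targetNamespace" in the XML Schema namespace
and "retryLimit" in the declared default namespace, matching the two expectations
described above.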
As a parting note, the XML Schema spec's references to the XML Namespaces spec are
all to an old version - Randall's references are all to the latest version of the XML
Namespaces spec.
XML sequencer does not properly handle namespaces of unqualified
attributes
---------------------------------------------------------------------------
Key: DNA-223
URL:
https://jira.jboss.org/jira/browse/DNA-223
Project: DNA
Issue Type: Bug
Components: Sequencers
Affects Versions: 0.2
Reporter: Randall Hauch
Assignee: Randall Hauch
Fix For: 0.2
I've been looking at the XML sequencer behavior, and I was a little surprised to find
out that it treats unqualified attributes (i.e., those without a namespace prefix) as
inheriting the namespace of the parent element. For example, this XML document:
<dna:system
    xmlns:dna="http://www.jboss.org/dna"
    xmlns:jcr="http://www.jcp.org/jcr/1.0">
  <!-- Define the sources from which content is made available -->
  <dna:sources>
    <sourceA name="Cars"
        dna:classname="org.jboss.dna.connector.inmemory.InMemoryRepositorySource"
        retryLimit="3" />
    <sourceB name="Aircraft"
        dna:classname="org.jboss.dna.connector.inmemory.InMemoryRepositorySource" />
  </dna:sources>
</dna:system>
is imported so that the "retryLimit" attribute is turned into a
"dna:retryLimit" property. That property, however, should use the default
namespace.
I'm not sure whether this was intended, but I don't believe this behavior is
correct. Yes, unqualified child elements do inherit the namespace of their parent, but
attributes do not. According to the spec (http://www.w3.org/TR/xml-names/#defaulting),
emphasis mine:
"The scope of a default namespace declaration extends from the beginning of the
start-tag in which it appears to the end of the corresponding end-tag, excluding the scope
of any inner default namespace declarations. In the case of an empty tag, the scope is the
tag itself.
"A default namespace declaration applies to all unprefixed element names within
its scope. Default namespace declarations do not apply directly to attribute names; the
interpretation of unprefixed attributes is determined by the element on which they
appear.
"If there is a default namespace declaration in scope, the expanded name
corresponding to an unprefixed element name has the URI of the default namespace as its
namespace name. If there is no default namespace declaration in scope, the namespace name
has no value. The namespace name for an unprefixed attribute name always has no value. In
all cases, the local name is local part (which is of course the same as the unprefixed
name itself)."
Unfortunately, none of the examples in the spec seem to show the behavior for attributes.
But here are a few links that seem to agree with my interpretation:
http://www.twoscomplement.com/2008/03/16/xml-attribute-namespaces/
http://annevankesteren.nl/2005/03/null-namespace
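The quoted rule for attributes can also be checked against a namespace-aware parser.
The following is a minimal Python sketch using the standard library's ElementTree, not
the sequencer itself. The sample document is a made-up variation of the one above that
declares the DNA namespace as the default namespace, with the class name shortened for
readability.

```python
import xml.etree.ElementTree as ET

# Variation of the example: the DNA namespace is also the default namespace.
DOC = """\
<sources xmlns="http://www.jboss.org/dna"
         xmlns:dna="http://www.jboss.org/dna">
  <sourceA name="Cars" dna:classname="InMemoryRepositorySource" retryLimit="3" />
</sources>
"""

root = ET.fromstring(DOC)
source_a = root[0]

# ElementTree writes namespaced names as '{uri}local' and leaves names that
# have no namespace as plain local names.
element_name = source_a.tag
attribute_names = sorted(source_a.attrib)

print(element_name)
print(attribute_names)
```

The unprefixed element picks up the default namespace, while the unprefixed "name" and
"retryLimit" attributes are reported with no namespace; only the prefixed
"dna:classname" attribute carries one.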