[teiid-dev] XML as a source
Steven Hawkins
shawkins at redhat.com
Thu Sep 17 15:49:10 EDT 2009
Option 1 : don't leave the xml world
i.e. extract(xml, '/po', '/item/quantity/text() > 10', '/number/text()' as Number integer)
Here the second argument specifies the nodes to consider, the third argument is the node test, and the remainder perform the typed scalar extractions. That's probably not very satisfactory for ad hoc querying, but it's very clear how it operates and what it should return.
Option 2 : emulate the xml relational logic of adding synthetic primary keys
i.e. select PO.Number from (extract(xml, '/po', 'position()' as key, '/number/text()' as Number integer)) AS PO,
(extract(xml, '/po/item', '../position()' as ParentKey, '/quantity/text()' as Number integer)) AS Item where PO.key = Item.ParentKey and Item.Quantity > 10
I'm omitting the node test parameter shown in option 1. The downside here is that you will perform multiple extractions and then join the results rather than option 1, which would perform everything in one step. This form does however make it a bit simpler to create procedures and the corresponding proc relational queries that map to something that looks relational. For example there would be a PO stored procedure that takes a doc as a parameter and returns the PO rowset, the procedure logic would be nothing more than extract(xmlin, '/po', 'position()' as key, '/number/text()' as Number integer);. There would also be one for Item. Then I could write a query like:
select PO.Number from PO, Item where PO.xmlin = xml and Item.xmlin = xml and PO.key = Item.ParentKey and Item.Quantity > 10
Option 3 : natively understand hierarchical xml structures similar to oql/hsql and our document model queries, in which the join between parent and child is implicit.
i.e. Select PO.Number from xml where PO.Item.Quantity > 10
however our xml document model query syntax in this case would project a document that only contains the Number element, not just the scalar number value in a rowset. We would also have to know the document model / schema used in the from clause - not just that it's typed as XML.
----- Original Message -----
From: "John Doyle" <jdoyle at redhat.com>
To: "Ramesh Reddy" <rareddy at redhat.com>
Cc: "teiid-dev" <teiid-dev at lists.jboss.org>
Sent: Thursday, September 17, 2009 1:49:43 PM GMT -06:00 US/Canada Central
Subject: Re: [teiid-dev] XML as a source
I'm not really arguing against XPath as the syntax to express the mapping, it's a good solution. I'm describing a higher level problem that I don't see the solution for.
Lets say we have a virtual view we have created for a purchase order document and we have defined three tables with the proposed new functions: PO, Address, and Item. POs have both BillTo and ShipTo Addresses as well as a list of Items. Now we query this view with the following:
Select PO.Number from PO, Item where Item.Quantity > 10;
If our virtual tables defined by our functions are just returning row sets, how does this join get resolved? How does the engine know? In the XML-Relational implementation we create keyed relations in the model and then insert keys into the document as it is parsed so that we engine can resolve the join across the result sets we produce.
~jd
----- "Ramesh Reddy" <rareddy at redhat.com> wrote:
> I see that in the current source model's table and column property
> "NameInSource" already contains a xpath expression to get to column
> information from a given XML document. How is proposed function based
> model is different, than just making a what is source model now to a
> virtual model in future. However, this forces the XML --> Table
> mappings
> will be done at the engine level, thus pushes relational issues to
> engine. Am I interpreting the scenario correctly?
>
> On Thu, 2009-09-17 at 12:05 -0400, John Doyle wrote:
> > I wanted to pick up the discussion started in TEIID_817 around
> improvements/alternatives to the current implementation of the XML
> Connectors (https://jira.jboss.org/jira/browse/TEIID-817).
> Specifically I want to get a better understanding of the idea Steve
> presented.
> > ___
> > ...
> > The alternative we've talked about before is to treat the source
> access and table extraction as primary and separate functions of
> Teiid. Executing a web service could have a procedure like xml
> getDoc(xml param) - which is essentially the approach of the XML
> Source logic. On top of that there can be helper SQL/XML like
> functions to map a rowset to xml, such as xmlforest, and the result
> xml to "tables" - something like extract(xml param, xpath as col1,
> xpath as col2....). At the very least adding a rowset or table type.
> > __
> >
> >
> > How do you envision exposing the functions, specifically functions
> that turn XML into a rowset or table, to users? Would they be
> available in Transformations that define virtual tables? This would
> retain the central value of the XML-Relational connectors, and remove
> much of the complexity of how it does it.
> >
> > One of the challenges in this is that we need to be able to extract
> multiple virtual tables from a single XML document, and retain
> (create) the relationships between them. For instance, if we extract
> PO, Address and Item tables from a Purchase Order xml document, we
> need to be able to map the item to the purchase order that it came
> from. We do this currently by adding id's to the xml document and
> exposing them in the model we generate.
> >
> > One thing I'll note is that passing around XML has it's limits.
> Some of our customers are handling very large XML.
> >
> > ~jd
>
> _______________________________________________
> teiid-dev mailing list
> teiid-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/teiid-dev
_______________________________________________
teiid-dev mailing list
teiid-dev at lists.jboss.org
https://lists.jboss.org/mailman/listinfo/teiid-dev
More information about the teiid-dev
mailing list