Re: [teiid-dev] XML as a source

Thursday, 17 September 2009

Option 1 : don't leave the xml world

i.e. extract(xml, '/po', '/item/quantity/text() > 10',
'/number/text()' as Number integer)

Here the second argument specifies the nodes to consider, the third argument is the node
test, and the remainder perform the typed scalar extractions.  That's probably not
very satisfactory for ad hoc querying, but it's very clear how it operates and what it
should return. 

Option 2 : emulate the xml relational logic of adding synthetic primary keys

i.e. select PO.Number from (extract(xml, '/po', 'position()' as key,
'/number/text()' as Number integer)) AS PO,
(extract(xml, '/po/item', '../position()' as ParentKey,
'/quantity/text()' as Number integer)) AS Item where PO.key = Item.ParentKey and
Item.Quantity > 10

I'm omitting the node test parameter shown in option 1.  The downside here is that you
will perform multiple extractions and then join the results rather than option 1, which
would perform everything in one step.  This form does however make it a bit simpler to
create procedures and the corresponding proc relational queries that map to something that
looks relational.  For example there would be a PO stored procedure that takes a doc as a
parameter and returns the PO rowset, the procedure logic would be nothing more than
extract(xmlin, '/po', 'position()' as key, '/number/text()' as
Number integer);.  There would also be one for Item.  Then I could write a query like:

select PO.Number from PO, Item where PO.xmlin = xml and Item.xmlin = xml and PO.key =
Item.ParentKey and Item.Quantity > 10

Option 3 : natively understand hierarchical xml structures similar to oql/hsql and our
document model queries, in which the join between parent and child is implicit.

i.e. Select PO.Number from xml where PO.Item.Quantity > 10

however our xml document model query syntax in this case would project a document that
only contains the Number element, not just the scalar number value in a rowset.  We would
also have to know the document model / schema used in the from clause - not just that
it's typed as XML.

----- Original Message -----
From: "John Doyle" <jdoyle(a)redhat.com&gt;
To: "Ramesh Reddy" <rareddy(a)redhat.com&gt;
Cc: "teiid-dev" <teiid-dev(a)lists.jboss.org&gt;
Sent: Thursday, September 17, 2009 1:49:43 PM GMT -06:00 US/Canada Central
Subject: Re: [teiid-dev] XML as a source

I'm not really arguing against XPath as the syntax to express the mapping, it's a
good solution.  I'm describing a higher level problem that I don't see the
solution for.  

Lets say we have a virtual view we have created for a purchase order document and we have
defined three tables with the proposed new functions: PO, Address, and Item.  POs have
both BillTo and ShipTo Addresses as well as a list of Items.  Now we query this view with
the following:

Select PO.Number from PO, Item where Item.Quantity > 10;

If our virtual tables defined by our functions are just returning row sets, how does this
join get resolved?  How does the engine know?  In the XML-Relational implementation we
create keyed relations in the model and then insert keys into the document as it is parsed
so that we engine can resolve the join across the result sets we produce.

~jd

----- "Ramesh Reddy" <rareddy(a)redhat.com&gt; wrote:

...
 I see that in the current source model's table and column
property
 "NameInSource" already contains a xpath expression to get to column
 information from a given XML document. How is proposed function based
 model is different, than just making a what is source model now to a
 virtual model in future. However, this forces the XML --> Table
 mappings
 will be done at the engine level, thus pushes relational issues to
 engine. Am I interpreting the scenario correctly?

 On Thu, 2009-09-17 at 12:05 -0400, John Doyle wrote:
 > I wanted to pick up the discussion started in TEIID_817 around
 improvements/alternatives to the current implementation of the XML
 Connectors (https://jira.jboss.org/jira/browse/TEIID-817). 
 Specifically I want to get a better understanding of the idea Steve
 presented.
 > ___
 > ...
 > The alternative we've talked about before is to treat the source
 access and table extraction as primary and separate functions of
 Teiid. Executing a web service could have a procedure like xml
 getDoc(xml param) - which is essentially the approach of the XML
 Source logic. On top of that there can be helper SQL/XML like
 functions to map a rowset to xml, such as xmlforest, and the result
 xml to "tables" - something like extract(xml param, xpath as col1,
 xpath as col2....). At the very least adding a rowset or table type.
 > __
 > 
 > 
 > How do you envision exposing the functions, specifically functions
 that turn XML into a rowset or table, to users?  Would they be
 available in Transformations that define virtual tables?  This would
 retain the central value of the XML-Relational connectors, and remove
 much of the complexity of how it does it.
 > 
 > One of the challenges in this is that we need to be able to extract
 multiple virtual tables from a single XML document, and retain
 (create) the relationships between them.  For instance, if we extract
 PO, Address and Item tables from a Purchase Order xml document, we
 need to be able to map the item to the purchase order that it came
 from.  We do this currently by adding id's to the xml document and
 exposing them in the model we generate.
 > 
 > One thing I'll note is that passing around XML has it's limits. 
 Some of our customers are handling very large XML.
 > 
 > ~jd

 _______________________________________________
 teiid-dev mailing list
 teiid-dev(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/teiid-dev
_______________________________________________
teiid-dev mailing list
teiid-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/teiid-dev

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009