[teiid-issues] [JBoss JIRA] (TEIID-1819) Reading multi entity data from a single data file

Thursday, 10 November 2011

    [
https://issues.jboss.org/browse/TEIID-1819?page=com.atlassian.jira.plugin...
] 

Steven Hawkins commented on TEIID-1819:
---------------------------------------

Our internal format for compiling metadata from Teiid Designer is exactly the format
you're describing: a single-character record type followed by a delimiter and the
delimited record, with the possibility of continuation records.  So we're very
familiar with this type of file.

We'll add a SELECTOR argument to TEXTTABLE.  The base implementation would be a string
that matches the line prefix.  Given that the value would be a literal, a compiled regex
pattern would not cost much in performance, e.g. '^x' vs. a prefix
comparison of 'x'.  However, if that complexity is not needed, we can stick with
just a prefix pattern or go with something in the middle, such as a LIKE pattern.  Is
your preference to just match based upon a prefix string?  Are you also requesting
the ability to support continuation records?
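
To make the two options concrete, here is a minimal Python sketch (not Teiid code) that
filters lines by a literal prefix and by a precompiled anchored regex.  The function names
and the sample rows are illustrative assumptions; the rows are taken from the example in
the issue description.

```python
import re

# Sample rows borrowed from the example in the issue description.
lines = [
    "A,10-dec-2011,12345,3322,3000,222",
    "B,1,123,Sprockets Black,30,50,1500",
    "B,2,333,Sprockets Blue,300,5,1500",
]


def select_by_prefix(rows, prefix):
    """Keep rows whose text starts with the literal prefix."""
    return [r for r in rows if r.startswith(prefix)]


def select_by_regex(rows, pattern):
    """Keep rows matching a regex that is compiled once, up front."""
    compiled = re.compile(pattern)
    return [r for r in rows if compiled.match(r)]


orders = select_by_prefix(lines, "A,")
order_lines = select_by_regex(lines, r"^B,")
```

For a literal selector both approaches pick the same rows; the regex form only pays off
if selectors ever need more than a fixed prefix.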

TEXTTABLE has support for quoted values and the use of an escape character.  The date/time
conversions, if they're not the same as what is allowed by the built-in type conversion,
would be specified in a select expression using a parseTime, parseDate, or parseTimestamp
function.  And just as you say, there's no need to overdo the TEXTTABLE function, as
dealing with field length or referring to other rows could also be done after the primary
extraction.
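
As a rough analogy in Python (not Teiid syntax), the sketch below splits a row with
quoted-value handling first and converts the date in a separate step afterwards.  The
quoted customer name is a made-up value used only to exercise quoting, and the
"%d-%b-%Y" pattern is an assumption based on the "10-dec-2011" dates in the issue's
example; month-name parsing assumes an English locale.

```python
import csv
import io
from datetime import datetime

# Hypothetical row: the quoted field contains the delimiter.
raw = 'A,10-dec-2011,12345,"Smith, John",3000,222\n'

# Step 1: primary extraction, honoring quoted values.
reader = csv.reader(io.StringIO(raw), delimiter=",", quotechar='"')
row = next(reader)

# Step 2: date conversion applied after extraction, with an explicit pattern.
order_date = datetime.strptime(row[1], "%d-%b-%Y").date()
```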

...
 Reading multi entity data from a single data file
 -------------------------------------------------

                 Key: TEIID-1819
                 URL: https://issues.jboss.org/browse/TEIID-1819
             Project: Teiid
          Issue Type: Feature Request
          Components: Query Engine
    Affects Versions: 7.6
         Environment: Any
            Reporter: Peter Larsen
            Assignee: Steven Hawkins

 A common problem for data files is the concept of multiple data sets enclosed in the same
file. An example is a data file of accounts receivable orders. You'll export at least
two logical entities: Orders and OrderLines. Each of the two entities has a very different
data set; they relate (OrderLines belong to a particular Order) and there is a dynamic
number of OrderLines per Order.
 A common way to differentiate is to put a special "record type" selector as the
first field in each record, e.g. A and B. Based on this selector, the load program will
apply different templates to map the columns, and it will also know that the OrderLines
are associated with the Order above it and create that relation column ID in the output.
 Example:
 ;selector=A,orderdate,ordernumber,customernumber,ordertotal,ordertax
 ;selector=B,lineno,itemno,description,quantity,priceach,pricetotal
 A,10-dec-2011,12345,3322,3000,222
 B,1,123,Sprockets Black,30,50,1500
 B,2,333,Sprockets Blue,300,5,1500
 A,11-dec-2011,12346,3311,.....
 etc.  
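
 The load step described above could be sketched in Python roughly as follows. This is
an illustration under assumptions, not an existing loader: the function name
parse_multi_entity is hypothetical, the ";selector=" header lines are assumed to define
the column templates, selector A is assumed to be the parent entity, and only the three
complete rows from the example are used.

```python
import csv
import io

data = """\
;selector=A,orderdate,ordernumber,customernumber,ordertotal,ordertax
;selector=B,lineno,itemno,description,quantity,priceach,pricetotal
A,10-dec-2011,12345,3322,3000,222
B,1,123,Sprockets Black,30,50,1500
B,2,333,Sprockets Blue,300,5,1500
"""


def parse_multi_entity(text):
    """Split a mixed file into orders and order lines, linking lines to orders."""
    templates = {}      # selector -> list of column names
    orders = []
    order_lines = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue
        first = fields[0]
        if first.startswith(";selector="):
            # Header line: remember the column template for this selector.
            templates[first.split("=", 1)[1]] = fields[1:]
            continue
        record = dict(zip(templates[first], fields[1:]))
        if first == "A":
            orders.append(record)
        else:
            # An order line belongs to the most recent order above it.
            record["ordernumber"] = orders[-1]["ordernumber"]
            order_lines.append(record)
    return orders, order_lines


orders, order_lines = parse_multi_entity(data)
```

 Each order line carries the parent's ordernumber as the relation column, so the two
result sets can be joined afterwards.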
