Work on TEIID-5965 stopped by Steven Hawkins.
---------------------------------------------
Allow variables to be used as TextTable delimiters, row delimiters,
quote, header, skip rows, and escape characters
-------------------------------------------------------------------------------------------------------------------
Key: TEIID-5965
URL: https://issues.redhat.com/browse/TEIID-5965
Project: Teiid
Issue Type: Enhancement
Components: Query Engine
Affects Versions: 13.1
Reporter: Dmitrii Pogorelov
Assignee: Steven Hawkins
Priority: Major
Fix For: 15.0
Original Estimate: 5 hours
Remaining Estimate: 5 hours
In the example below, the delimiter is TAB. Many web APIs allow the delimiter character to
be customized, so to write a generic parser it is sometimes much easier to pass the
delimiter as a variable than to build a long nested IF / ELSE IF / ELSE structure:
{code:sql}
Select * From TextTable (
'c1 c2
1 2'
Columns
c1 integer,
c2 integer
Delimiter E'\t'
Header 1
)x;
{code}
Imagine that, depending on the setup on the API side (which is out of our control), the
content can be delivered with either a tab or a semicolon as the delimiter, e.g.
{code}
c1;c2
1;2
{code}
or
{code}
c1 c2
1 2
{code}
Let's save this response into a variable and look at the code we currently have to write
to handle both setups:
{code:sql}
Begin
...
If (delimiter = 'tab')
Begin
Select * From TextTable (
apiResponse
Columns
c1 integer,
c2 integer
Delimiter E'\t'
Header 1
)x;
End
Else If (delimiter = 'semicolon')
Begin
Select * From TextTable (
apiResponse
Columns
c1 integer,
c2 integer
Delimiter ';'
Header 1
)x;
End
End
{code}
The if-else block grows with every additional case, especially if we also want to
customize the quote, escape, delimiter, and row delimiter.
It would therefore be great if these values could be supplied as variables. We could then
write the following code, which is more readable and easier to customize:
{code:sql}
Begin
...
Declare string delimiter = E'\t';
Select * From TextTable (
apiResponse
Columns
c1 integer,
c2 integer
Delimiter delimiter
Header 1
)x;
End
{code}
With a bit of tweaking and some assumptions about how far we trust the input, we could even
read the first line and try to auto-detect the delimiter (e.g. by counting tabs, commas,
and semicolons in the first line).
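A rough sketch of such an auto-detection follows. It relies on the variable delimiter
syntax requested in this issue, so it is an illustration of the idea rather than code that
works in 13.1. The names firstLine, tabs, commas, and semicolons are hypothetical, and the
sketch assumes that apiResponse is already populated, that lines end with a line feed, and
that the most frequent candidate character in the header line is the delimiter:
{code:sql}
Begin
...
-- Take everything before the first line feed as the header line
-- (assumes the response contains at least one line feed).
Declare string firstLine = Substring(apiResponse, 1, Locate(Chr(10), apiResponse) - 1);
-- Count each candidate by comparing lengths before and after removing it.
Declare integer tabs = Length(firstLine) - Length(Replace(firstLine, E'\t', ''));
Declare integer commas = Length(firstLine) - Length(Replace(firstLine, ',', ''));
Declare integer semicolons = Length(firstLine) - Length(Replace(firstLine, ';', ''));
-- Pick the most frequent candidate; ties fall back to tab, then comma.
Declare string delimiter = Case
When tabs >= commas And tabs >= semicolons Then E'\t'
When commas >= semicolons Then ','
Else ';'
End;
Select * From TextTable (
apiResponse
Columns
c1 integer,
c2 integer
Delimiter delimiter
Header 1
)x;
End
{code}
The heuristic is deliberately naive: it ignores quoting, so a quoted header value that
contains one of the candidate characters could skew the counts.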
Could you please also make the HEADER and SKIP values customizable (rather than hardcoded
numbers)?