[
https://issues.redhat.com/browse/TEIIDDES-3227?page=com.atlassian.jira.pl...
]
Nayan Bija updated TEIIDDES-3227:
---------------------------------
Description:
Hi, we are fetching data from a CSV file (without a header) using the HXTT driver.
Records are fetched correctly with a single-character delimiter, but not with a
multi-character delimiter.
Our delimited CSV file has no header row and uses the characters "~|*"
as the delimiter.
We have written the schema file below for it.
Schema file
========
[target.csv]
ColNameHeader=False
Format=Delimited("~|*")
CharacterSet=ANSI
COL1=first_name varchar
COL2=last_name varchar
COL3=city varchar
Data
====
Sachin~|*Tendulkar~|*Mumbai
Saurav~|*Ganguly~|*Kolkata
When we run a SELECT on it, the "|*" characters appear in the result,
as shown below:
|\|*Sachin|\|*Tendulkar|\|*Mumbai|
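For reference, the parse that the schema above appears to be aiming for can be sketched outside the driver. This is only a minimal Python illustration of splitting on the whole "~|*" string, not HXTT or Teiid behaviour; the function name split_records and the constants are hypothetical, and the column names are taken from COL1..COL3 in the schema file.

```python
# Hypothetical illustration: split each record on the full multi-character
# delimiter, rather than on any single character within it.
DELIMITER = "~|*"  # from Format=Delimited("~|*") in the schema file
COLUMNS = ("first_name", "last_name", "city")  # from COL1..COL3


def split_records(text):
    """Return one dict per non-empty line, keyed by the schema column names."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(DELIMITER)  # str.split treats the delimiter as one unit
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


sample = "Sachin~|*Tendulkar~|*Mumbai\nSaurav~|*Ganguly~|*Kolkata\n"
for row in split_records(sample):
    print(row)
```

Each row should come back with three clean fields and no leftover "|*" characters, which is the result the SELECT was expected to give.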
Datasource in standalone-teiid.xml
=======================
<datasource jndi-name="java:/FileDS20200713211347079"
            pool-name="FileDS20200713211347079">
    <connection-url>jdbc:text:sftp://abc:efg@10.10.10.10:22/home/abc/target.csv?odbcSchemaFile=target.sch;fileExtension=csv;delayedClose=0;refreshInterval=0</connection-url>
    <driver>textfile</driver>
    <pool>
        <max-pool-size>20</max-pool-size>
    </pool>
</datasource>
Some blogs suggested changing the "*CharacterSet*" from *ANSI* to *UTF-8*
in the schema file (.sch file).
I tried UTF-8 as well as *65001*, but still get the same result.
Referenced link:
- [https://www.ibm.com/support/pages/special-characters-data-imported-through-odbc-import-using-microsoft-text-driver-are-not-displayed-correctly-admin-client]
We also tried adding ";charSet=UTF-8" to the datasource URL, but it still gives
the same result.
Could you please suggest a solution for this issue?
was:
Hi, we are fetching data from a CSV file (without a header) using the HXTT driver.
Records are fetched correctly with a single-character delimiter, but not with a
multi-character delimiter.
Our delimited CSV file has no header row and uses the characters "~|*"
as the delimiter.
We have written the schema file below for it.
Schema file
========
[target.csv]
ColNameHeader=False
Format=Delimited("~|*")
CharacterSet=ANSI
COL1=first_name varchar
COL2=last_name varchar
COL3=city varchar
Data
====
Sachin~|*Tendulkar~|*Mumbai
Saurav~|*Ganguly~|*Kolkata
When we run a SELECT on it, the "|*" characters appear in the result,
as shown below:
|\|*Sachin|\|*Tendulkar|\|*Mumbai|
Datasource in standalone-teiid.xml
=======================
<datasource jndi-name="java:/FileDS20200713211347079"
            pool-name="FileDS20200713211347079">
    <connection-url>jdbc:text:sftp://abc:efg@10.10.10.10:22/home/abc/target.csv?odbcSchemaFile=target.sch;fileExtension=csv;delayedClose=0;refreshInterval=0</connection-url>
    <driver>textfile</driver>
    <pool>
        <max-pool-size>20</max-pool-size>
    </pool>
</datasource>
We also tried adding ";charSet=UTF-8" to the above URL, but it still gives the same
result.
Could you please suggest a solution for this issue?
Issue with multiple character delimiter in CSV File.
----------------------------------------------------
Key: TEIIDDES-3227
URL: https://issues.redhat.com/browse/TEIIDDES-3227
Project: Teiid Designer
Issue Type: Feature Request
Reporter: Nayan Bija
Priority: Major
--
This message was sent by Atlassian Jira
(v7.13.8#713008)