[JBoss JIRA] (TEIID-3616) HBase translator - NPE if date value is 'null'
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-3616?page=com.atlassian.jira.plugin... ]
Steven Hawkins resolved TEIID-3616.
-----------------------------------
Fix Version/s: 8.12
Resolution: Done
Overrode the default behavior and added a doc note to not use the databasetimezone property with the translator.
> HBase translator - NPE if date value is 'null'
> ----------------------------------------------
>
> Key: TEIID-3616
> URL: https://issues.jboss.org/browse/TEIID-3616
> Project: Teiid
> Issue Type: Bug
> Affects Versions: 8.7.1.6_2
> Environment: Hbase: 1.1.1
> Phoenix: 4.5.0-HBase-1.1
> Reporter: Juraj Duráni
> Assignee: Steven Hawkins
> Fix For: 8.12
>
> Attachments: log
>
>
> If the source table in HBase contains a 'NULL' date value, Teiid throws an NPE. I did not encounter any NPE using org.apache.phoenix.jdbc.PhoenixDriver [1]. Other data types seem to be OK too.
> I have tried integer, char, varchar, float, double, tinyint, smallint, bigint, decimal, varbinary, boolean, time, date, timestamp.
> [1]
> Connection con = new org.apache.phoenix.jdbc.PhoenixDriver().connect("jdbc:phoenix:localhost", new Properties());
> ResultSet rs = con.createStatement().executeQuery("select datevalue from smalla");
> while (rs.next()) {
>     System.out.println(rs.getDate(1)); // a NULL date comes back as null, without an NPE
> }
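The resolution note points at the databasetimezone property, so one plausible place for the NPE is a time zone adjustment that receives a null date. The following is a minimal sketch of a null-safe adjustment, not the translator's actual code; the class name NullSafeDateShift and the method shift are hypothetical.

```java
import java.sql.Date;
import java.util.TimeZone;

// Hypothetical helper, not part of Teiid or Phoenix.
final class NullSafeDateShift {

    // Moves a date value from one time zone to another.
    // A SQL NULL (Java null) is passed through untouched instead of being dereferenced.
    static Date shift(Date value, TimeZone from, TimeZone to) {
        if (value == null) {
            return null; // nothing to convert
        }
        long millis = value.getTime();
        // Difference between the two zone offsets at that instant.
        long delta = from.getOffset(millis) - to.getOffset(millis);
        return new Date(millis + delta);
    }
}
```

The only point of the sketch is the early return: any conversion applied to values read from the source has to tolerate null before it calls methods on the value.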
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
10 years, 7 months
[JBoss JIRA] (TEIID-3621) HBase translator - UPDATE statement requires primary key
by Kylin Soong (JIRA)
[ https://issues.jboss.org/browse/TEIID-3621?page=com.atlassian.jira.plugin... ]
Kylin Soong commented on TEIID-3621:
------------------------------------
> Yes, the HBase translator needs to know the PK of the Phoenix table so that it can properly perform inserts and updates; the Phoenix driver requires a PK to be included in an UPSERT.
This is what we did previously: if no primary key is defined in the VDB to map to the Phoenix table's primary key, an AssertionError is thrown, as in [1].
> No, the HBase translator cannot assume that the PK in the VDB is defined on the very same column as in Phoenix. We should be more defensive and "ask" the user to explicitly define the PK of the source table.
Phoenix is a SQL layer on top of HBase.
If we create a table that maps to an existing HBase table ([2] is an example), we learn the PK while doing the mapping.
If we create a new table, it is easy to know which column is the PK.
[1] https://github.com/teiid/teiid/blob/master/connectors/translator-hbase/sr...
[2] https://github.com/teiid/teiid-quickstarts/tree/master/hbase-as-a-datasource
> HBase translator - UPDATE statement requires primary key
> --------------------------------------------------------
>
> Key: TEIID-3621
> URL: https://issues.jboss.org/browse/TEIID-3621
> Project: Teiid
> Issue Type: Bug
> Affects Versions: 8.7.1.6_2
> Environment: Hbase: 1.1.1
> Phoenix: 4.5.0-HBase-1.1
> Reporter: Juraj Duráni
> Assignee: Kylin Soong
>
> The HBase translator requires a table to have a primary key defined. Is the PK really needed? If the table has no PK defined, then all columns are the PK. E.g. the query *UPDATE hbase.SmallA SET StringNum = '555' WHERE hbase.SmallA.StringNum IS NULL* is translated as *UPSERT INTO smalla (stringnum, intkey) SELECT '555', smalla.intkey FROM smalla WHERE smalla.stringnum IS NULL* (intkey is the PK). Teiid can simply add all columns (except those defined in 'SET').
> Yes, I know that HBase requires the PK to be defined, but what happens if a user decides to change the PK in the VDB [1]? It could be a problem whether or not the PK is defined in the VDB.
> I suggest adding an HBase-translator-specific execution property that defines the PK in the source table, and removing the AssertionError [2].
> [1]
> *HBase table:* create table mytable(id integer primary key, nickname varchar(1))
> *Teiid table:* create table mytable(id integer, username varchar(1) primary key)
> Both id and username are valid PKs (artificial and natural, respectively).
> [2]
> https://github.com/teiid/teiid/blob/master/connectors/translator-hbase/sr...
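The UPDATE-to-UPSERT rewrite described in the issue can be sketched as a small string builder. This is an illustration under stated assumptions, not the translator's implementation: the class UpdateToUpsert and its rewrite method are hypothetical, values are passed in as already-rendered SQL literals, and no identifier quoting or escaping is attempted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the rewrite; names are illustrative only.
final class UpdateToUpsert {

    // assignments: column -> SQL expression from the SET clause (insertion order is kept by the caller's map)
    // pkColumns:   primary key columns of the source table, which Phoenix requires in an UPSERT
    // where:       the original WHERE condition, or null if there is none
    static String rewrite(String table, Map<String, String> assignments,
                          List<String> pkColumns, String where) {
        List<String> targets = new ArrayList<>(assignments.keySet());
        List<String> values = new ArrayList<>(assignments.values());
        for (String pk : pkColumns) {
            if (!assignments.containsKey(pk)) {
                // Carry the existing key value over so the UPSERT hits the same row.
                targets.add(pk);
                values.add(table + "." + pk);
            }
        }
        String sql = "UPSERT INTO " + table + " (" + String.join(", ", targets) + ")"
                + " SELECT " + String.join(", ", values)
                + " FROM " + table;
        return where == null ? sql : sql + " WHERE " + where;
    }
}
```

Called with table smalla, the assignment stringnum = '555', the PK column intkey and the condition smalla.stringnum IS NULL, the sketch produces the UPSERT statement quoted in the description. It also shows why the PK has to be known: without pkColumns there is nothing to tie each written row back to an existing one.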
[JBoss JIRA] (TEIID-3621) HBase translator - UPDATE statement requires primary key
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-3621?page=com.atlassian.jira.plugin... ]
Steven Hawkins commented on TEIID-3621:
---------------------------------------
> But the translator cannot assume that the PK defined in the VDB is the same as the one in Phoenix (or the ROW column in HBase, respectively)
It can if that is made explicit in the documentation of the translator, and it will be the case when the metadata is imported automatically. There are lots of scenarios across all of the translators where users can craft source metadata that isn't supported by the translator (invalid datatypes, even missing columns, etc.). In this case, if you want a different primary key, then a view should be introduced, or you could just mark the username column with a unique constraint.
The caveat with removing the restriction is that we would need to test whether specifying all columns in the upsert adds overhead.
[JBoss JIRA] (TEIID-3621) HBase translator - UPDATE statement requires primary key
by Juraj Duráni (JIRA)
[ https://issues.jboss.org/browse/TEIID-3621?page=com.atlassian.jira.plugin... ]
Juraj Duráni commented on TEIID-3621:
-------------------------------------
The primary key in Phoenix corresponds to the 'ROW' in HBase. In my opinion, both of them are unique [1]. It makes sense to have the PK defined in a VDB so that the translator can map the rows properly. But the translator cannot assume that the PK defined in the VDB is the same as the one in Phoenix (or the ROW column in HBase, respectively) [2].
So my little conclusion:
- Yes, the HBase translator needs to know the PK of the Phoenix table so that it can properly perform inserts and updates; the Phoenix driver requires a PK to be included in an UPSERT.
- No, the HBase translator cannot assume that the PK in the VDB is defined on the very same column as in Phoenix. We should be more defensive and "ask" the user to explicitly define the PK of the source table.
[1]
*hbase shell:*
{code}
hbase(main):002:0> create 'test2', 'cf'
0 row(s) in 1.2970 seconds
=> Hbase::Table - test2
hbase(main):003:0> put 'test2', '1', 'cf:1', 'a'
0 row(s) in 0.0600 seconds
hbase(main):004:0> scan 'test2'
ROW COLUMN+CELL
1 column=cf:1, timestamp=1439453357007, value=a
1 row(s) in 0.0180 seconds
hbase(main):005:0> put 'test2', '1', 'cf:2', 'b'
0 row(s) in 0.0060 seconds
hbase(main):006:0> scan 'test2'
ROW COLUMN+CELL
1 column=cf:1, timestamp=1439453357007, value=a
1 column=cf:2, timestamp=1439453371736, value=b
1 row(s) in 0.0200 seconds
hbase(main):007:0> put 'test2', '1', 'cf:2', 'c'
0 row(s) in 0.0040 seconds
hbase(main):008:0> scan 'test2'
ROW COLUMN+CELL
1 column=cf:1, timestamp=1439453357007, value=a
1 column=cf:2, timestamp=1439453393714, value=c
1 row(s) in 0.0230 seconds
{code}
[2]
*Phoenix:*
{code:sql}create table users(id integer primary key, username varchar(10)){code}
*VDB:*
{code:sql}create foreign table users(id integer, username string primary key){code}
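The shell session in [1] can be mirrored with a toy in-memory model to make the point about row uniqueness explicit. This is only a sketch: RowStoreSketch is a hypothetical class, it is not the HBase client API, and it ignores the per-cell timestamps and versions that HBase keeps.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of "row key -> (column -> latest value)"; not the HBase API.
final class RowStoreSketch {

    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    // Writing to an existing row key and column replaces the visible value,
    // writing to a new column of an existing row key extends that row.
    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Returns the latest value, or null if the row or column does not exist.
    String get(String rowKey, String column) {
        Map<String, String> cells = rows.get(rowKey);
        return cells == null ? null : cells.get(column);
    }

    int rowCount() {
        return rows.size();
    }
}
```

Replaying the three put calls from the session (cf:1 = a, cf:2 = b, cf:2 = c, all on row '1') leaves a single row with cf:1 = a and cf:2 = c, which is what the final scan shows. The row key behaves like a unique key, so a VDB that declares its PK on a different column than Phoenix describes a different notion of identity than the source actually enforces.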
[JBoss JIRA] (TEIID-3621) HBase translator - UPDATE statement requires primary key
by Kylin Soong (JIRA)
[ https://issues.jboss.org/browse/TEIID-3621?page=com.atlassian.jira.plugin... ]
Kylin Soong commented on TEIID-3621:
------------------------------------
Every table in Phoenix must have a primary key. The PK is indexed and is used to map to the HBase table's row id, but it is not tied to a unique constraint, which is a little different from a PK in an RDBMS. Also, each UPSERT involves a PK column.
Do we need to define a PK that maps to the Phoenix table's PK? I removed the PK validation logic; we probably need to add a note to the documentation that the HBase translator is limited in INSERT/UPDATE.
[JBoss JIRA] (TEIID-3616) HBase translator - NPE if date value is 'null'
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-3616?page=com.atlassian.jira.plugin... ]
Work on TEIID-3616 started by Steven Hawkins.
---------------------------------------------
[JBoss JIRA] (TEIID-3621) HBase translator - UPDATE statement requires primary key
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-3621?page=com.atlassian.jira.plugin... ]
Steven Hawkins commented on TEIID-3621:
---------------------------------------
Kylin, what was the need for the pk restriction on update?
> I suggest adding an HBase-translator-specific execution property that defines the PK in the source table, and removing the AssertionError [2].
I don't follow this suggestion. Let's first determine why the restriction is there.
[JBoss JIRA] (TEIID-3621) HBase translator - UPDATE statement requires primary key
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-3621?page=com.atlassian.jira.plugin... ]
Steven Hawkins reassigned TEIID-3621:
-------------------------------------
Assignee: Kylin Soong (was: Steven Hawkins)
[JBoss JIRA] (TEIID-3622) HBase translator - INSERT could rewrite the data
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-3622?page=com.atlassian.jira.plugin... ]
Steven Hawkins commented on TEIID-3622:
---------------------------------------
Teiid does support a limited merge statement, which is similar to an upsert. We would have to update the engine/translators to support pushdown to fully address this issue.
Otherwise we should either just document the upsert behavior (I believe mongodb is similar) or not support updates at all (delete would be fine, but we currently can't declare support for delete alone).
> HBase translator - INSERT could rewrite the data
> ------------------------------------------------
>
> Key: TEIID-3622
> URL: https://issues.jboss.org/browse/TEIID-3622
> Project: Teiid
> Issue Type: Bug
> Affects Versions: 8.7.1.6_2
> Environment: Hbase: 1.1.1
> Phoenix: 4.5.0-HBase-1.1
> Reporter: Juraj Duráni
> Assignee: Steven Hawkins
>
> The HBase translator translates INSERT as UPSERT, which acts as an "alias" for both the INSERT and UPDATE statements. This means that if a user issues the same INSERT statement twice, no exception is thrown [1]. I expect that [2] could overwrite the data.
> *Additional note*: I was not able to verify my assumption because of https://issues.jboss.org/browse/TEIID-3619
> [1]
> INSERT INTO smalla (intkey) VALUES (55) is translated as UPSERT INTO smalla (intkey) VALUES (55)
> http://phoenix.apache.org/language/index.html#upsert_values
> [2]
> INSERT INTO smalla (intkey, name) VALUES (1, 'name1')
> INSERT INTO smalla (intkey, name) VALUES (1, 'name2')
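The difference between the behavior the reporter expects from [2] and a conventional INSERT can be sketched with an in-memory map keyed by intkey. This models the assumption stated in the description, which the reporter could not verify against Phoenix; the class UpsertVsInsert and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model keyed by the primary key; not Teiid or Phoenix code.
final class UpsertVsInsert {

    private final Map<Integer, String> namesByKey = new HashMap<>();

    // UPSERT semantics: insert if the key is new, otherwise silently replace.
    void upsert(int intkey, String name) {
        namesByKey.put(intkey, name);
    }

    // Conventional INSERT semantics: a duplicate key is an error.
    void strictInsert(int intkey, String name) {
        if (namesByKey.containsKey(intkey)) {
            throw new IllegalStateException("duplicate key: " + intkey);
        }
        namesByKey.put(intkey, name);
    }

    String nameOf(int intkey) {
        return namesByKey.get(intkey);
    }
}
```

Running the two statements from [2] through upsert leaves name2 stored under key 1 and raises nothing, while strictInsert would reject the second statement. A user who writes INSERT is likely to expect the second behavior, which is why the upsert behavior needs to be documented.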