[teiid-issues] [JBoss JIRA] (TEIID-3621) HBase translator - UPDATE statement requires primary key

Juraj Duráni (JIRA) issues at jboss.org
Thu Aug 13 05:01:02 EDT 2015


    [ https://issues.jboss.org/browse/TEIID-3621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097922#comment-13097922 ] 

Juraj Duráni commented on TEIID-3621:
-------------------------------------

The primary key in Phoenix corresponds to the row key ('ROW') in HBase, and as far as I can tell both are unique [1]. It makes sense to have a PK defined in the VDB so that the translator can map rows properly. However, the translator cannot assume that the PK defined in the VDB is the same as the one in Phoenix (that is, the ROW column in HBase) [2].
So my conclusion:
- Yes, the HBase translator needs to know the PK of the Phoenix table so that it can perform inserts and updates properly, because the Phoenix driver requires the PK to be included in an UPSERT.
- No, the HBase translator cannot assume that the PK in the VDB is defined on the very same column as in Phoenix. We should be more defensive and "ask" the user to define explicitly what the PK is in the source table.
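
The two points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the translator's actual code: the function name `build_upsert` and the `source_pk` argument are hypothetical stand-ins for whatever explicit setting the translator would expose. Refusing to update a source PK column is likewise just one possible design choice.

```python
def build_upsert(table, set_values, source_pk, where):
    """Rewrite UPDATE <table> SET ... WHERE ... as a Phoenix-style UPSERT SELECT.

    table      -- source table name
    set_values -- dict of column name -> SQL literal (already rendered as text)
    source_pk  -- PK column names as declared in the Phoenix table
                  (hypothetical explicit setting, NOT taken from the VDB)
    where      -- WHERE condition as SQL text
    """
    if not source_pk:
        raise ValueError("the source primary key must be declared explicitly")
    # Assumption of this sketch: changing a source PK column is rejected,
    # because an UPSERT with a new key value would write a new row.
    clash = [c for c in source_pk if c in set_values]
    if clash:
        raise ValueError("cannot update source PK column(s): " + ", ".join(clash))
    columns = list(set_values) + list(source_pk)
    # SET columns get their new literals; PK columns are selected unchanged.
    select_items = list(set_values.values()) + [f"{table}.{c}" for c in source_pk]
    return (
        f"UPSERT INTO {table} ({', '.join(columns)}) "
        f"SELECT {', '.join(select_items)} FROM {table} WHERE {where}"
    )


sql = build_upsert("smalla", {"stringnum": "'555'"}, ["intkey"],
                   "smalla.stringnum IS NULL")
print(sql)
```

With `intkey` declared as the source PK, the printed statement has the same shape as the UPSERT quoted in the issue description.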

[1]
*hbase shell:*
{code}
hbase(main):002:0> create 'test2', 'cf'
0 row(s) in 1.2970 seconds
=> Hbase::Table - test2
hbase(main):003:0> put 'test2', '1', 'cf:1', 'a'
0 row(s) in 0.0600 seconds
hbase(main):004:0> scan 'test2'
ROW                                                   COLUMN+CELL                                                                                                                                                  
 1                                                    column=cf:1, timestamp=1439453357007, value=a                                                                                                                
1 row(s) in 0.0180 seconds
hbase(main):005:0> put 'test2', '1', 'cf:2', 'b'
0 row(s) in 0.0060 seconds
hbase(main):006:0> scan 'test2'
ROW                                                   COLUMN+CELL                                                                                                                                                  
 1                                                    column=cf:1, timestamp=1439453357007, value=a                                                                                                                
 1                                                    column=cf:2, timestamp=1439453371736, value=b                                                                                                                
1 row(s) in 0.0200 seconds
hbase(main):007:0> put 'test2', '1', 'cf:2', 'c'
0 row(s) in 0.0040 seconds
hbase(main):008:0> scan 'test2'
ROW                                                   COLUMN+CELL                                                                                                                                                  
 1                                                    column=cf:1, timestamp=1439453357007, value=a                                                                                                                
 1                                                    column=cf:2, timestamp=1439453393714, value=c                                                                                                                
1 row(s) in 0.0230 seconds
{code}
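
The shell session above can be mirrored with a toy in-memory model. This is a rough sketch, not HBase's API: a table is assumed to be a mapping from row key to a mapping of column to value, and cell versions and timestamps are ignored. The helper name `put` is chosen only to echo the shell command.

```python
from collections import defaultdict


def put(table, row, column, value):
    """Toy 'put': the row key addresses the row, the column addresses the cell."""
    table[row][column] = value


table = defaultdict(dict)
put(table, "1", "cf:1", "a")
put(table, "1", "cf:2", "b")   # same row key: adds a cell, no new row
put(table, "1", "cf:2", "c")   # same row key and column: value is replaced

print(len(table), dict(table["1"]))
```

In this model, as in the scan output, three puts with the same row key still leave exactly one row.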

[2]
*Phoenix:*
{code:sql}create table users(id integer primary key, username varchar(10)){code}
*VDB:*
{code:sql}create foreign table users(id integer, username string primary key){code}
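
Why this mismatch matters can be sketched with a toy model of UPSERT semantics. The assumption here is that Phoenix addresses a row by its source PK (`id`) regardless of what the VDB declares; the helper `phoenix_upsert` and the sample data are hypothetical. The VDB treats `username` as the key, so it would accept an UPDATE that changes `id`.

```python
def phoenix_upsert(rows, pk, record):
    """Toy UPSERT: the row is addressed by the value of the source PK column."""
    rows.setdefault(record[pk], {}).update(record)


rows = {}
phoenix_upsert(rows, "id", {"id": 1, "username": "alice"})

# UPDATE users SET id = 2 WHERE username = 'alice'
# rewritten as an UPSERT that carries 'username' along as if it were the key.
matches = [dict(r) for r in rows.values() if r["username"] == "alice"]
for m in matches:
    phoenix_upsert(rows, "id", {"id": 2, "username": m["username"]})

print(sorted(rows))
```

In this model the "update" leaves the original row in place and writes a second one, which is the kind of surprise an explicitly declared source PK is meant to prevent.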

> HBase translator - UPDATE statement requires primary key
> --------------------------------------------------------
>
>                 Key: TEIID-3621
>                 URL: https://issues.jboss.org/browse/TEIID-3621
>             Project: Teiid
>          Issue Type: Bug
>    Affects Versions: 8.7.1.6_2
>         Environment: Hbase: 1.1.1
> Phoenix: 4.5.0-HBase-1.1
>            Reporter: Juraj Duráni
>            Assignee: Kylin Soong
>
> The HBase translator requires the table to have a primary key defined. Is the PK really needed? If the table has no PK defined, then all columns together act as the PK. E.g. the query *UPDATE hbase.SmallA SET StringNum = '555' WHERE hbase.SmallA.StringNum IS NULL* is translated as *UPSERT INTO smalla (stringnum, intkey) SELECT '555', smalla.intkey FROM smalla WHERE smalla.stringnum IS NULL* (intkey is the PK). Teiid could simply add all columns (except those assigned in 'SET').
> Yes, I know that HBase requires the PK to be defined, but what happens if a user decides to change the PK in the VDB [1]? That could be a problem whether or not the PK is defined in the VDB.
> I suggest adding an HBase-translator-specific execution property that defines the PK in the source table, and removing the AssertionError [2].
> [1]
> *HBase table:* create table mytable(id integer primary key, nickname varchar(1))
> *Teiid table:* create table mytable(id integer, username varchar(1) primary key)
> Both id and username are valid PKs (artificial/natural).
> [2]
> https://github.com/teiid/teiid/blob/master/connectors/translator-hbase/src/main/java/org/teiid/translator/hbase/HBaseSQLConversionVisitor.java#L72


