[teiid-issues] [JBoss JIRA] (TEIID-4928) Couchbase - NAMEINSOURCE required for all the columns and tables

Kylin Soong (JIRA) issues at jboss.org
Thu May 25 01:21:00 EDT 2017


    [ https://issues.jboss.org/browse/TEIID-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411682#comment-13411682 ] 

Kylin Soong commented on TEIID-4928:
------------------------------------

Hi [~jdurani]
Thanks for your contribution to the Couchbase connector. I would like to add some comments to address your questions; I hope they help.

> Why need NAMEINSOURCE for each of columns and tables?

First, you need to know the logical hierarchy of data in a Couchbase cluster, which looks like this:
{code}
Namespaces
      └── Keyspaces
             └── Documents
{code}
The documents under a keyspace are schemaless, so one keyspace can contain documents with different structures. These documents may also contain multiple levels of nested JSON, nested arrays, or arrays of differently typed elements.

A path expression retrieves the value of an attribute. For example, the path below retrieves the value of p_asia, which is under the nested document geo:
{code}
namespace:keyspace/`travel-sample`.geo.`p_asia`
{code}
The path below retrieves a value from a multi-dimensional nested array:
{code}
namespace:keyspace/`travel-sample`.a[0][0][1][1]
{code}
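
To make the two path expressions above concrete, here is a minimal Python sketch that resolves the part of a path that lies inside one document. It is only an illustration: the function name resolve_path and the sample document are invented, and it does not handle the namespace:keyspace prefix or reflect how the translator evaluates paths.

```python
import re

# One path segment: a backtick-quoted or bare name, then zero or more [i] indexes.
SEGMENT = re.compile(r"(?:`([^`]+)`|([^.\[`]+))((?:\[\d+\])*)")

def resolve_path(document, path):
    """Follow a dotted path with optional array indexes through nested dicts and lists."""
    value = document
    for quoted, bare, indexes in SEGMENT.findall(path):
        value = value[quoted or bare]
        for index in re.findall(r"\[(\d+)\]", indexes):
            value = value[int(index)]
    return value

# Hypothetical document, invented for illustration.
doc = {"geo": {"p_asia": 42}, "a": [[[0, [10, 11]]]]}
print(resolve_path(doc, "geo.`p_asia`"))   # 42
print(resolve_path(doc, "a[0][0][1][1]"))  # 11
```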

Couchbase is unlike other document databases: it does not supply any metadata, so Teiid creates the metadata automatically when it loads metadata. Each column represents an attribute in a JSON document, and NAMEINSOURCE is necessary to map a column in a Teiid table to that attribute in the document.

The table name maps to a keyspace name in Couchbase. For the same reason, the lack of native Couchbase metadata, the TypeNameList property is used to define table names. For example, if you define
{code}
<property name="importer.typeNameList" value="`default`:`type`"/>
{code}
then Teiid loads all documents under the keyspace default; if a document has an attribute `type`, the value of that attribute is treated as the table name. So a NAMEINSOURCE is also necessary for each table: we use it to identify the real keyspace.
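
As a rough illustration of why the table needs a NAMEINSOURCE here, the Python sketch below derives table names from a type attribute and records the keyspace each table really lives in. The function derive_tables and the sample documents are hypothetical, and the fallback to the keyspace name for documents without the attribute is an assumption, not something confirmed by the translator code.

```python
def derive_tables(keyspace, documents, type_attr):
    """Return {table_name: name_in_source} for the documents of one keyspace."""
    tables = {}
    for doc in documents:
        # Assumption: a document without the type attribute falls back to the keyspace name.
        table = str(doc.get(type_attr, keyspace))
        # The table name no longer says which keyspace to query, so keep it separately.
        tables[table] = f"`{keyspace}`"
    return tables

# Hypothetical documents in the keyspace `default`.
docs = [{"type": "Customer", "id": 1}, {"type": "Order", "id": 7}, {"id": 9}]
print(derive_tables("default", docs, "type"))
```

Without the stored name in source, a query against Customer or Order would have no way to know that both live in the keyspace default.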

Another reason a table needs a NAMEINSOURCE is array tables: any array that exists in a document is mapped to a separate table. The path expression example above is a 4-dimensional array, and the generated tables look like [1]. In this case the NAMEINSOURCE is also a path expression, and it is necessary.
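
The Python sketch below shows the general idea of turning a nested array into rows that carry their index positions, and of rebuilding the path expression for each row. It is a simplified illustration under my own assumptions (the names flatten_array and path_for are invented); the real layout of the generated tables is the one shown in [1].

```python
def flatten_array(value, prefix=()):
    """Yield (index_tuple, leaf) for every non-list element of a nested list."""
    for i, item in enumerate(value):
        if isinstance(item, list):
            yield from flatten_array(item, prefix + (i,))
        else:
            yield prefix + (i,), item

def path_for(attribute, indexes):
    """Build a path expression such as a[0][0][1][1] from an index tuple."""
    return attribute + "".join(f"[{i}]" for i in indexes)

# Hypothetical array attribute "a" with four dimensions.
a = [[[0, [10, 11]]]]
for indexes, leaf in flatten_array(a):
    print(path_for("a", indexes), leaf)
```

Each row needs the whole index path to find its value again, which is why a plain table name is not enough for an array table.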

> How to avoid pass null

Based on the design so far, do not try to design the source model yourself; use the tables/procedures generated by CouchbaseMetadataProcessor. If you do want to design the source model yourself, you should strictly follow the logic in the Generating Schema section of [2].

Feel free to update the document. If you have a good understanding of how to design the source model, you can add a "Design Schema" section to [2].

> Does sampleSize make sense?

Because CouchbaseMetadataProcessor fetches documents from the different keyspaces, sampleSize is used to control the maximum number of documents fetched per keyspace.
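
A minimal Python sketch of sample-based schema inference may help show the trade-off. The function infer_columns and the sample documents are hypothetical and are not the CouchbaseMetadataProcessor logic; the sketch only shows that attributes appearing outside the sample are not discovered.

```python
from itertools import islice

def infer_columns(documents, sample_size):
    """Collect attribute names and observed value types from at most sample_size documents."""
    columns = {}
    for doc in islice(documents, sample_size):
        for key, value in doc.items():
            columns.setdefault(key, set()).add(type(value).__name__)
    return columns

# Hypothetical keyspace content; the third document is outside a sample of 2.
docs = [{"id": 1, "name": "a"}, {"id": "x"}, {"extra": True}]
print(sorted(infer_columns(docs, 2)))  # ['id', 'name']
```

A larger sample finds more attributes but costs more at metadata load time, so the value is a trade-off rather than a correctness guarantee.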

Actually, some other Couchbase JDBC providers (such as Simba and Talend) use the same logic: they use a sample size to avoid selecting everything. There is no good way to resolve the schema of a schemaless, zero-metadata document database.

Kylin

[1] https://github.com/teiid/teiid/blob/master/connectors/couchbase/translator-couchbase/src/test/resources/nestedArray.expected
[2] https://teiid.gitbooks.io/documents/content/reference/couchbase_translator.html


> Couchbase - NAMEINSOURCE required for all the columns and tables
> ----------------------------------------------------------------
>
>                 Key: TEIID-4928
>                 URL: https://issues.jboss.org/browse/TEIID-4928
>             Project: Teiid
>          Issue Type: Bug
>          Components: Misc. Connectors
>    Affects Versions: 9.3
>            Reporter: Juraj Duráni
>            Assignee: Kylin Soong
>
> Option *NAMEINSOURCE* is de facto required for all the columns and tables. If it is not present then:
> # column name in source query is not enclosed in back quotes - e.g. *`$cb_t1`.ShortValue* instead of *`$cb_t1`.`ShortValue`*
> # name of the table is not added to the source query - e.g. *SELECT ... FROM null `$cb_t1` LET ... WHERE ...* instead of *SELECT ... FROM `smalla` `$cb_t1` LET ... WHERE ...*
> This should work OOB without need to add NAMEINSOURCE option. Teiid should automatically translate column name from e.g. *MyColumn* to *`MyColumn`* if option is not set. Same with name of the table.
> In case of table I think this is more serious as it does not even try name of the table but supplies *null* to the query



--
This message was sent by Atlassian JIRA
(v7.2.3#72005)


