[
https://issues.jboss.org/browse/TEIID-4928?page=com.atlassian.jira.plugin...
]
Kylin Soong commented on TEIID-4928:
------------------------------------
Hi [~jdurani]
Thanks for your contribution to the Couchbase connector. I would like to add some comments
to answer some of your questions; I hope they help.
Why is NAMEINSOURCE needed for each column and table?
First, you need to know the logical hierarchy of the data structure in a Couchbase cluster.
It looks like this:
{code}
Namespaces
└── Keyspaces
    └── Documents
{code}
The documents under a keyspace are schemaless: they can have different structures, and
they may also contain multiple levels of nested JSON, nested arrays, or arrays of
differently-typed elements.
A path expression retrieves the value of an attribute. For example, the path below
retrieves the value of p_asia, which is under the nested document geo:
{code}
namespace:keyspace/`travel-sample`.geo.`p_asia`
{code}
The path below retrieves a value from a multi-dimensional nested array:
{code}
namespace:keyspace/`travel-sample`.a[0][0][1][1]
{code}
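To make the two path expressions concrete, here is a small Python sketch that walks a
nested document in the same way. The `resolve` helper and the sample document are
hypothetical and are not part of Teiid or Couchbase. The sketch only handles the part of
the path after the keyspace, with back-quoted names, dotted names, and `[n]` indexes.

```python
import re

# Hypothetical helper: resolves a simplified path such as
# "geo.`p_asia`" or "a[0][0][1][1]" against a JSON-like dict.
# One token is a back-quoted name, a bare name, or an [n] index.
_TOKEN = re.compile(r"`([^`]+)`|([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")

def resolve(document, path):
    value = document
    for quoted, bare, index in _TOKEN.findall(path):
        if index:
            value = value[int(index)]    # step into an array element
        else:
            value = value[quoted or bare]  # step into a nested attribute
    return value

# Made-up document with a nested object and a 4-dimension array.
doc = {
    "geo": {"p_asia": 42},
    "a": [[[0, [1, 2]]]],
}

print(resolve(doc, "geo.`p_asia`"))
print(resolve(doc, "a[0][0][1][1]"))
```

Each step of the path is exactly what a column or array table has to remember, which is
why the path ends up stored in NAMEINSOURCE.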
Unlike other document databases, Couchbase does not supply any metadata, so Teiid creates
the metadata automatically when it loads. Each column represents an attribute in a JSON
document, and for a column in a Teiid table to map to that attribute in the document, the
NAMEINSOURCE is necessary.
A table name maps to a keyspace name in Couchbase. For the same reason, the lack of native
Couchbase metadata, a TypeNameList property is used to define table names. For example, if
you define
{code}
<property name="importer.typeNameList"
value="`default`:`type`"/>
{code}
then Teiid loads all documents under the keyspace default, and if a document has an
attribute `type`, the value of that attribute is treated as the table name. So a
NAMEINSOURCE is also necessary for each table; we need it to refer to the real keyspace.
Another reason a table needs a NAMEINSOURCE is array tables. Any array in a document is
mapped to a separate table. The array in the path expression example above has 4
dimensions, and the generated tables look like [1]. In this case the NAMEINSOURCE is also
a path expression, and it is necessary.
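As a rough illustration of the typeNameList idea, the Python sketch below picks a table
name for a document from a `keyspace`:`attribute` pair. The function names and the sample
documents are hypothetical, and this is not the CouchbaseMetadataProcessor code. Falling
back to the keyspace name when the attribute is missing is an assumption of the sketch.

```python
import re

# Matches one `keyspace`:`attribute` pair from a typeNameList value.
_PAIR = re.compile(r"`([^`]+)`:`([^`]+)`")

def parse_type_name_list(value):
    """Return {keyspace: type_attribute} for each pair in the value."""
    return dict(_PAIR.findall(value))

def table_name(keyspace, document, type_attrs):
    """Pick the table name for one document (hypothetical logic)."""
    attr = type_attrs.get(keyspace)
    if attr is not None and attr in document:
        return str(document[attr])
    # Assumption: documents without the type attribute use the keyspace name.
    return keyspace

type_attrs = parse_type_name_list("`default`:`type`")

print(table_name("default", {"type": "Customer", "id": 1}, type_attrs))
print(table_name("default", {"id": 2}, type_attrs))
```

Because the table name can come from a document value rather than from the keyspace, the
table still needs something that points back at the real keyspace.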
How to avoid passing null
Based on the design so far, do not try to design the source model yourself; use the
tables/procedures generated by CouchbaseMetadataProcessor. If you do want to design the
source model yourself, you should strictly follow the logic in the Generating Schema
section of [2]. Feel free to update the document: if you have a good understanding of
designing a source model, you can add a "Design Schema" section to [2].
Does sampleSize make sense?
Because CouchbaseMetadataProcessor fetches all documents in the different keyspaces, a
sampleSize is used to control the maximum number of documents fetched per keyspace.
Other Couchbase JDBC providers (like Simba, Talend) use the same logic: a sampleSize to
avoid selecting everything. There is no good way to derive a schema from a schemaless,
zero-metadata document database.
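The trade-off behind sampleSize can be sketched in Python as below. `infer_columns` and
the sample documents are hypothetical, and the real importer does more than this (nested
documents, arrays, type mapping); the sketch only shows that attributes outside the sample
are not seen.

```python
from itertools import islice

def infer_columns(documents, sample_size):
    """Collect top-level attribute names and Python type names
    from at most sample_size documents."""
    columns = {}
    for doc in islice(documents, sample_size):
        for name, value in doc.items():
            # Keep the type of the first value seen for each attribute.
            columns.setdefault(name, type(value).__name__)
    return columns

docs = [
    {"id": 1, "name": "a"},
    {"id": 2, "price": 9.5},
    {"id": 3, "rare_attr": True},
]

print(infer_columns(docs, 2))  # rare_attr is outside the sample
print(infer_columns(docs, 3))
```

A larger sample finds more attributes but costs more at metadata load time, so the value
is a tuning choice rather than something with a single right answer.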
Kylin
[1]
https://github.com/teiid/teiid/blob/master/connectors/couchbase/translato...
[2]
https://teiid.gitbooks.io/documents/content/reference/couchbase_translato...
Couchbase - NAMEINSOURCE required for all the columns and tables
----------------------------------------------------------------
Key: TEIID-4928
URL:
https://issues.jboss.org/browse/TEIID-4928
Project: Teiid
Issue Type: Bug
Components: Misc. Connectors
Affects Versions: 9.3
Reporter: Juraj Duráni
Assignee: Kylin Soong
Option *NAMEINSOURCE* is de facto required for all the columns and tables. If it is not
present then:
# column name in source query is not enclosed in back quotes - e.g. *`$cb_t1`.ShortValue*
instead of *`$cb_t1`.`ShortValue`*
# name of the table is not added to the source query - e.g. *SELECT ... FROM null
`$cb_t1` LET ... WHERE ...* instead of *SELECT ... FROM `smalla` `$cb_t1` LET ... WHERE
...*
This should work OOB without need to add NAMEINSOURCE option. Teiid should automatically
translate column name from e.g. *MyColumn* to *`MyColumn`* if option is not set. Same with
name of the table.
In the case of the table I think this is more serious, as it does not even try the name of
the table but supplies *null* to the query.
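The fallback requested here could look roughly like the Python sketch below. `source_name`
is a hypothetical helper that only illustrates the expected behaviour; it is not how the
Teiid translator is implemented.

```python
def source_name(name, name_in_source=None):
    """Return NAMEINSOURCE when it is set, else the back-quoted name."""
    if name_in_source:
        return name_in_source
    # Embedded back quotes in the name are not handled in this sketch.
    return "`" + name + "`"

print(source_name("ShortValue"))
print(source_name("smalla"))
```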