[JBoss JIRA] (TEIID-2309) Add support for "conformed" tables in Teiid
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-2309?page=com.atlassian.jira.plugin... ]
Steven Hawkins reassigned TEIID-2309:
-------------------------------------
Assignee: Steven Hawkins
> Add support for "conformed" tables in Teiid
> -------------------------------------------
>
> Key: TEIID-2309
> URL: https://issues.jboss.org/browse/TEIID-2309
> Project: Teiid
> Issue Type: Feature Request
> Components: Query Engine
> Reporter: Debbie Steigner
> Assignee: Steven Hawkins
> Fix For: 8.6
>
>
> Teiid would support tables from different data sources being marked as "conformed", meaning they are the same table (perhaps under a different name). When optimising a query, it would take the conformity into account and choose the appropriate copy of the table (presumably one in the same database as other tables in the query, if available). I would not regard it as a problem if Teiid *required* the dimensions to be strictly the same as opposed to permitting subsets, though as with so many areas, it would be up to the user to ensure this was really true: I would not expect the engine to do anything to verify that the tables really were conformed.
> Use case:
> In Data Warehousing, it is relatively common to have multiple copies of the same dimensions spread over multiple Data Warehouses or Marts, or in the same Data Warehouse when associated with different Fact Tables. If these copies are either identical or strict subsets of an idealised dimension (and, by extension, share *exactly* the same naming and structure), then they may be said to be "conformed". It is expected that the dimension includes at least the values required to support the facts in the database in which it occurs or the Fact Table to which it is paired.
> Example:
>
> Source S1:
>
>   BIGBIGBIG (millions of rows)
>     bigkey
>     ccy
>     other_stuff
>
>   CURRENCY (100s of rows), let's call it S1_CCY if we need to distinguish
>     ccy
>     ccy_name
>
>
> Source S2:
>
>   BIGGER (millions of rows)
>     biggerkey
>     bigkey
>     ccy
>     more_stuff
>
>   CURRENCY (100s of rows), similarly S2_CCY
>     ccy
>     ccy_name
>
>
> When executing:
>
> SELECT B.*
>   FROM BIGBIGBIG B,
>        CURRENCY CCY
>  WHERE B.ccy = CCY.ccy
>    AND CCY.ccy_name LIKE '%DOLLAR%'
>
> Then it is clearly advantageous to use the copy of CURRENCY in S1 and re-write the query using S1_CCY. In this situation, federation is eliminated completely.
>
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 3 months
[JBoss JIRA] (TEIID-2309) Add support for "conformed" tables in Teiid
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-2309?page=com.atlassian.jira.plugin... ]
Work on TEIID-2309 started by Steven Hawkins.
11 years, 3 months
[JBoss JIRA] (TEIID-2671) Hive Metadata load incorrectly resolves data types
by Ramesh Reddy (JIRA)
[ https://issues.jboss.org/browse/TEIID-2671?page=com.atlassian.jira.plugin... ]
Ramesh Reddy resolved TEIID-2671.
---------------------------------
Resolution: Done
1) The Hive 0.11 driver can return type names with spaces appended, so they must be trimmed before checking for a defined type in order to detect the correct runtime type.
2) With the 0.71 driver, against which the translator was originally developed, it looks like the cal parameter of the resultset.getTimestamp(index, cal) method is ignored; in 0.11 this method is marked as not supported.
3) A correction for the above was required to retrieve the timestamp value correctly.
4) Since we are recommending 0.11, updated the module.xml with the correct version numbers in the documents section.
5) Hive also released the Hive2 driver, whose driver class is org.apache.hive.jdbc.HiveDriver; provided a ds example for this driver in the docs section.
6) Tested with both drivers against the local and gss database.
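The trimming described in item 1 can be illustrated with a minimal sketch. This is not the actual Teiid translator code: the class name HiveTypeNameSketch, the contents of the lookup table, and the runtime type names it returns are all assumptions made for illustration. Only the idea of trimming the driver-reported type name before the lookup, and falling back to string otherwise, comes from the notes above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not the Teiid Hive translator.
final class HiveTypeNameSketch {

    // Assumed, illustrative subset of Hive type names and runtime type names.
    private static final Map<String, String> RUNTIME_TYPES = new HashMap<>();
    static {
        RUNTIME_TYPES.put("int", "integer");
        RUNTIME_TYPES.put("bigint", "long");
        RUNTIME_TYPES.put("double", "double");
        RUNTIME_TYPES.put("timestamp", "timestamp");
        RUNTIME_TYPES.put("string", "string");
    }

    // Resolves a driver-reported type name, tolerating padding spaces.
    static String getRuntimeType(String typeName) {
        if (typeName == null) {
            return "string"; // default when nothing is reported
        }
        // Without trim(), "int   " would miss the map and fall back to string.
        String key = typeName.trim().toLowerCase(Locale.ROOT);
        return RUNTIME_TYPES.getOrDefault(key, "string");
    }

    public static void main(String[] args) {
        System.out.println(getRuntimeType("int   "));     // prints "integer"
        System.out.println(getRuntimeType("timestamp ")); // prints "timestamp"
    }
}
```

Normalising the key once at the lookup boundary keeps the rest of the metadata load independent of how a particular driver version pads its output.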
> Hive Metadata load incorrectly resolves data types
> --------------------------------------------------
>
> Key: TEIID-2671
> URL: https://issues.jboss.org/browse/TEIID-2671
> Project: Teiid
> Issue Type: Bug
> Affects Versions: 8.4
> Reporter: Filip Nguyen
> Assignee: Ramesh Reddy
> Priority: Blocker
> Fix For: 8.4.1, 8.6
>
>
> The metadata load uses the DESCRIBE keyword, which tries to retrieve the type name in String runtimeType = getRuntimeType(type);
> However, with Hive 0.11 the type name is returned with padding spaces. Hence all the data types are resolved to the default java.lang.String for the Hive Translator.
> This relates to
11 years, 3 months
[JBoss JIRA] (TEIID-2309) Add support for "conformed" tables in Teiid
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-2309?page=com.atlassian.jira.plugin... ]
Steven Hawkins commented on TEIID-2309:
---------------------------------------
I'm proceeding to rough in the engine logic so that we can circle back to the metadata later. There are a few specific approaches:
1. For a conformed table, create a pseudo-model that has capabilities representative of all sources involved (similar to the multi-source approach)
- this requires the least replanning, but may not produce the expected plan when there are limited-capability conformed sources (this should be mitigated by rerunning earlier rules after rule plan joins and rule plan unions), and will likely not consider the full planning space (which can generally be a combinatorial problem).
2. Add a later optimization check in rule plan joins / plan unions / raise access to check for conformed tables. The drawback here is that the affected subplan may be switching sources, so the capabilities may need to be rechecked. Here again we will not consider the full planning space.
3. Create a full planning space of plan combinations based upon the tables involved. This is quite expensive for our current optimizer architecture.
All of the above can be simplified if we make an assumption (as with built-in multi-source) that the sources have the same capabilities.
Another way to think of this is through the current multi-source feature:
* Create a multi-source model for each similar set of conformed tables - this may not be ideal if the conformed tables appear unevenly on the sources, and as mentioned above it assumes the same capabilities for each source.
* Expose the multi-source column throughout whenever a conformed table is used.
* Specify the target source as part of the user query via criteria - which is not ideal, but does force the selection of a single source (the moral equivalent of a hint)
So we could even approach the problem this way by adding an option to the multi-source logic to choose a single source in circumstances where a union would otherwise have been performed. This is more intrusive into modeling, but is the most conceptually aligned with existing features.
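As a rough illustration of the source selection discussed above, the following sketch picks a single source that holds every table of a join, under the simplifying assumption that all sources have the same capabilities. It is not Teiid optimizer code: ConformedSourceSketch, commonSource, and exampleCatalog are hypothetical names, and the table-to-source map simply mirrors the S1/S2 example from the issue description.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; assumes every source has identical capabilities.
final class ConformedSourceSketch {

    // Returns a source that holds every table in the join, if one exists.
    static Optional<String> commonSource(List<String> joinTables,
                                         Map<String, Set<String>> sourcesByTable) {
        Set<String> candidates = null;
        for (String table : joinTables) {
            Set<String> sources =
                sourcesByTable.getOrDefault(table, Collections.<String>emptySet());
            if (candidates == null) {
                candidates = new LinkedHashSet<>(sources);
            } else {
                candidates.retainAll(sources); // keep only sources shared so far
            }
        }
        if (candidates == null || candidates.isEmpty()) {
            return Optional.empty(); // no single source: the join stays federated
        }
        return Optional.of(candidates.iterator().next());
    }

    // Catalog from the issue example: CURRENCY is conformed across S1 and S2.
    static Map<String, Set<String>> exampleCatalog() {
        Map<String, Set<String>> catalog = new HashMap<>();
        catalog.put("BIGBIGBIG", new LinkedHashSet<>(Arrays.asList("S1")));
        catalog.put("BIGGER", new LinkedHashSet<>(Arrays.asList("S2")));
        catalog.put("CURRENCY", new LinkedHashSet<>(Arrays.asList("S1", "S2")));
        return catalog;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> catalog = exampleCatalog();
        // The join can be pushed to S1 as a whole.
        System.out.println(commonSource(Arrays.asList("BIGBIGBIG", "CURRENCY"), catalog));
        // No common source exists, so the result is empty.
        System.out.println(commonSource(Arrays.asList("BIGBIGBIG", "BIGGER"), catalog));
    }
}
```

A real implementation would still have to recheck capabilities when a subplan switches sources, which is the drawback noted for approach 2; the intersection here only covers the case where that check can be skipped.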
11 years, 3 months
[JBoss JIRA] (TEIID-2673) CNF exception when getting data through infinispan-cache translator
by Van Halbert (JIRA)
Van Halbert created TEIID-2673:
----------------------------------
Summary: CNF exception when getting data through infinispan-cache translator
Key: TEIID-2673
URL: https://issues.jboss.org/browse/TEIID-2673
Project: Teiid
Issue Type: Bug
Components: Misc. Connectors
Affects Versions: 8.4.1
Reporter: Van Halbert
Assignee: Van Halbert
When getting data through the infinispan-cache translator using remote Hot Rod access, DV throws "...ClassNotFoundException: org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory".
Workaround:
Add "<module name="org.infinispan.client.hotrod"/>" as a dependency into module "org.jboss.teiid".
11 years, 3 months