[
https://issues.jboss.org/browse/TEIID-3454?page=com.atlassian.jira.plugin...
]
Steven Hawkins commented on TEIID-3454:
---------------------------------------
There is a feature to perform a dependent join using a temp table. On the JDBC translator
the property enableDependentJoins needs to be set -
https://docs.jboss.org/author/display/TEIID/JDBC+Translator
But we don't currently have a Hibernate dialect associated with Netezza. The closest
would be PostgreSQL. There also isn't yet any key creation on the temporary table.
The Hive translator would take more work, as there isn't any temp table support there yet.
Dependent Join optimizations for Netezza and Hive
-------------------------------------------------
Key: TEIID-3454
URL: https://issues.jboss.org/browse/TEIID-3454
Project: Teiid
Issue Type: Feature Request
Components: Query Engine
Affects Versions: 8.10
Reporter: John Muller
Assignee: Steven Hawkins
Priority: Minor
Currently, dependent joins create one or more IN clauses. Many MPP / NoSQL systems can
perform drastically better if the key values are instead loaded into temp tables that
match the key distribution. Two examples I know of would be Netezza and Hive.
In Netezza, if the incoming dependent join (a small dimension; here "Customer",
using Northwind data model concepts) has a key that will be joined to a big fact table
that is distributed or organized on that key (DISTRIBUTE ON / ORGANIZE ON), then creating
a temp table that matches this distribution will result in roughly 100x better query
performance. Sometimes, if the dimension is small enough, this doesn't make a big
difference, as Netezza will perform a broadcast join, but it's never a bad idea to
create the temp table.
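
To make the contrast concrete, here is a rough Python sketch that only builds SQL strings for the two approaches: the IN-clause pushdown and a temp table distributed on the join key. It is not Teiid's implementation. The table names (orders, dep_join_keys), the column customer_id, and the integer-only keys are assumptions made up for illustration, and the DISTRIBUTE ON clause is Netezza-specific.

```python
from typing import List, Sequence


def in_clause_query(fact_table: str, key: str, values: Sequence[int]) -> str:
    """Build the IN-clause form: key values are inlined into the pushed-down query."""
    in_list = ", ".join(str(int(v)) for v in values)
    return f"SELECT f.* FROM {fact_table} f WHERE f.{key} IN ({in_list})"


def temp_table_statements(
    fact_table: str,
    key: str,
    values: Sequence[int],
    temp_table: str = "dep_join_keys",  # hypothetical temp table name
    key_type: str = "INTEGER",          # assumes integer keys for simplicity
) -> List[str]:
    """Build the temp-table form: create, load, then join on the key."""
    statements = [
        # Netezza syntax: distribute the temp table on the same key as the fact table.
        f"CREATE TEMP TABLE {temp_table} ({key} {key_type}) DISTRIBUTE ON ({key})"
    ]
    # One INSERT per key value; a real implementation would likely batch these.
    statements += [f"INSERT INTO {temp_table} VALUES ({int(v)})" for v in values]
    statements.append(
        f"SELECT f.* FROM {fact_table} f JOIN {temp_table} t ON f.{key} = t.{key}"
    )
    return statements


if __name__ == "__main__":
    print(in_clause_query("orders", "customer_id", [1, 2, 3]))
    for stmt in temp_table_statements("orders", "customer_id", [1, 2, 3]):
        print(stmt)
```

Because the temp table is distributed on the same key as the fact table, the final join can be co-located on each node instead of redistributing data, which is where the gain described above would come from.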
Similarly, Hive DDL has both partitions and buckets (pre-sorted).