[teiid-issues] [JBoss JIRA] (TEIID-3454) Dependent Join optimizations for Netezza and Hive

Friday, 24 April 2015

    [ https://issues.jboss.org/browse/TEIID-3454?page=com.atlassian.jira.plugin... ]

Steven Hawkins commented on TEIID-3454:
---------------------------------------

There is a feature to perform a dependent join using a temp table.  To use it, the
enableDependentJoins property needs to be set on the JDBC translator -
https://docs.jboss.org/author/display/TEIID/JDBC+Translator

But we don't currently have a Hibernate dialect associated with Netezza.  The closest
would be PostgreSQL.  There also isn't yet any key creation on the temporary table.

The Hive translator would take more work as there isn't any temp table support there yet.

...
 Dependent Join optimizations for Netezza and Hive
 -------------------------------------------------

                 Key: TEIID-3454
                 URL: https://issues.jboss.org/browse/TEIID-3454
             Project: Teiid
          Issue Type: Feature Request
          Components: Query Engine
    Affects Versions: 8.10
            Reporter: John Muller
            Assignee: Steven Hawkins
            Priority: Minor

 Currently, dependent joins create one or more IN clauses.  Many MPP / NoSQL systems can
perform drastically better when the join keys are instead loaded into temp tables that match
key distributions.  Two examples I know of are Netezza and Hive.
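
 To make the contrast concrete, here is a minimal sketch of the two strategies. It uses
Python's built-in sqlite3 module purely as a stand-in database; it is not Teiid's
implementation, and the table and column names (orders, dep_keys, customer_id) are
hypothetical. The first function pushes the keys down as an IN clause, the second loads
them into a temp table and joins against it.

```python
import sqlite3


def in_clause_join(conn, customer_ids):
    """Push the independent-side keys to the source as a single IN clause."""
    keys = sorted(set(customer_ids))
    placeholders = ",".join("?" for _ in keys)
    sql = (
        "SELECT customer_id, SUM(amount) FROM orders "
        f"WHERE customer_id IN ({placeholders}) "
        "GROUP BY customer_id ORDER BY customer_id"
    )
    return conn.execute(sql, keys).fetchall()


def temp_table_join(conn, customer_ids):
    """Load the keys into a temp table and let the source do a regular join."""
    keys = sorted(set(customer_ids))  # de-duplicate so the join does not multiply rows
    conn.execute("CREATE TEMP TABLE dep_keys (customer_id INTEGER)")
    try:
        conn.executemany("INSERT INTO dep_keys VALUES (?)", [(k,) for k in keys])
        sql = (
            "SELECT o.customer_id, SUM(o.amount) FROM orders o "
            "JOIN dep_keys k ON o.customer_id = k.customer_id "
            "GROUP BY o.customer_id ORDER BY o.customer_id"
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.execute("DROP TABLE dep_keys")


def demo():
    """Build a tiny hypothetical fact table and run both strategies on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders "
        "(order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [(1, 10.0), (1, 5.0), (2, 7.5), (3, 20.0)],
    )
    wanted = [1, 3]
    return in_clause_join(conn, wanted), temp_table_join(conn, wanted)
```

 Both functions return the same rows; the difference is only in how the keys reach the
source. On an MPP system the temp table form gives the optimizer a real table to
co-locate with the fact table, which an IN list cannot do.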
 In Netezza, if the incoming dependent join side (a small dimension; here "Customer",
using Northwind data model concepts) has a key that will be joined to a big fact table
that is distributed (DISTRIBUTE ON) or organized (ORGANIZE ON) on that key, then creating
a temp table that matches this distribution will result in ~100x better query performance.
Sometimes, if the dimension is small enough, this doesn't make a big difference because
Netezza will perform a broadcast join, but it's never a bad idea to create the temp table.
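
 As a rough illustration of what "a temp table that matches this distribution" could look
like, the sketch below builds the DDL string for a Netezza temp table whose distribution
key is the join key. The helper name netezza_temp_table_ddl and the table and column names
are hypothetical, this is not how Teiid generates SQL, and the DISTRIBUTE ON syntax should
be checked against the Netezza documentation for the version in use.

```python
import re

# Conservative identifier check so the sketch never interpolates arbitrary text into DDL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def netezza_temp_table_ddl(table, columns, distribute_on):
    """Return CREATE TEMP TABLE DDL distributed on the given key columns.

    columns is a list of (name, sql_type) pairs; distribute_on is a list of
    column names that should match the fact table's distribution key.
    """
    names = [name for name, _ in columns]
    for ident in [table, *names, *distribute_on]:
        if not _IDENT.match(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    missing = [key for key in distribute_on if key not in names]
    if missing:
        raise ValueError(f"distribution keys not in column list: {missing}")
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    keys = ", ".join(distribute_on)
    return f"CREATE TEMP TABLE {table} ({cols}) DISTRIBUTE ON ({keys})"
```

 For the "Customer" example, calling it with a single customer_id column and the same
column as the distribution key yields a statement that could be run before loading the
dimension keys and joining to the fact table.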
 Similarly, Hive DDL has both partitions and buckets (pre-sorted) that a temp table could match.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
