[JBoss JIRA] (TEIID-3743) Multiple Count Distinct Columns Fails for Impala

Wednesday, 25 May 2016

    [
https://issues.jboss.org/browse/TEIID-3743?page=com.atlassian.jira.plugin...
] 

Scott Wallace commented on TEIID-3743:
--------------------------------------

Another bit of information - a query just asking for multiple count distincts works
without any group by. As soon as group by is added, the query no longer works.

Eg:
This works:
{noformat}SELECT count(distinct orderid)
	,count(DISTINCT hash_tid) AS transactions
	,sum(sales)
FROM ImpressionClickTransactionProcessDateDetail
WHERE process_date_key >= '2016-05-22'
	AND process_date_key < '2016-05-23'
	AND advertiser_key = 2149{noformat}

This does not work:
{noformat}
SELECT process_date_key 
    ,count(distinct orderid)
	,count(DISTINCT hash_tid) AS transactions
	,sum(sales)
FROM ImpressionClickTransactionProcessDateDetail
WHERE process_date_key >= '2016-05-22'
	AND process_date_key < '2016-05-23'
	AND advertiser_key = 2149
	group by process_date_key
{noformat}

...
 Multiple Count Distinct Columns Fails for Impala
 ------------------------------------------------

                 Key: TEIID-3743
                 URL: https://issues.jboss.org/browse/TEIID-3743
             Project: Teiid
          Issue Type: Bug
          Components: Misc. Connectors
    Affects Versions: 8.11.4
            Reporter: Scott Wallace
            Assignee: Steven Hawkins
             Fix For: 8.12

 Teiid Impala translator incorrectly allows multiple count distinct columns, which fails
to execute since it is not supported by Impala.
 Per Cloudera documentation:
 {quote}By default, Impala only allows a single COUNT(DISTINCT columns) expression in each
query.
 To produce the same result as multiple COUNT(DISTINCT) expressions, you can use the
following technique for queries involving a single table:
 select v1.c1 result1, v2.c1 result2 from (select count(distinct col1) as c1 from t1) v1
cross join (select count(distinct col2) as c1 from t1) v2;{quote}
 The Teiid-Impala translator should ideally rewrite the query as necessary so that the
query does not fail executing multiple count distincts against Impala. 

--
This message was sent by Atlassian JIRA
(v6.4.11#64026)

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009