[teiid-issues] [JBoss JIRA] (TEIID-3743) Multiple Count Distinct Columns Fails for Impala

Scott Wallace (JIRA) issues at jboss.org
Wed May 25 11:38:00 EDT 2016


    [ https://issues.jboss.org/browse/TEIID-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242734#comment-13242734 ] 

Scott Wallace commented on TEIID-3743:
--------------------------------------

Another bit of information - a query just asking for multiple count distincts works without any group by. As soon as group by is added, the query no longer works.

Eg:
This works:
{noformat}SELECT count(distinct orderid)
	,count(DISTINCT hash_tid) AS transactions
	,sum(sales)
FROM ImpressionClickTransactionProcessDateDetail
WHERE process_date_key >= '2016-05-22'
	AND process_date_key < '2016-05-23'
	AND advertiser_key = 2149{noformat}

This does not work:
{noformat}
SELECT process_date_key 
    ,count(distinct orderid)
	,count(DISTINCT hash_tid) AS transactions
	,sum(sales)
FROM ImpressionClickTransactionProcessDateDetail
WHERE process_date_key >= '2016-05-22'
	AND process_date_key < '2016-05-23'
	AND advertiser_key = 2149
	group by process_date_key
{noformat}

> Multiple Count Distinct Columns Fails for Impala
> ------------------------------------------------
>
>                 Key: TEIID-3743
>                 URL: https://issues.jboss.org/browse/TEIID-3743
>             Project: Teiid
>          Issue Type: Bug
>          Components: Misc. Connectors
>    Affects Versions: 8.11.4
>            Reporter: Scott Wallace
>            Assignee: Steven Hawkins
>             Fix For: 8.12
>
>
> Teiid Impala translator incorrectly allows multiple count distinct columns, which fails to execute since it is not supported by Impala.
> Per Cloudera documentation:
> {quote}By default, Impala only allows a single COUNT(DISTINCT columns) expression in each query.
> To produce the same result as multiple COUNT(DISTINCT) expressions, you can use the following technique for queries involving a single table:
> select v1.c1 result1, v2.c1 result2 from (select count(distinct col1) as c1 from t1) v1 cross join (select count(distinct col2) as c1 from t1) v2;{quote}
> The Teiid-Impala translator should ideally rewrite the query as necessary so that the query does not fail executing multiple count distincts against Impala.



--
This message was sent by Atlassian JIRA
(v6.4.11#64026)


More information about the teiid-issues mailing list