[
https://issues.jboss.org/browse/TEIID-3743?page=com.atlassian.jira.plugin...
]
Scott Wallace commented on TEIID-3743:
--------------------------------------
Another bit of information - a query just asking for multiple count distincts works
without any group by. As soon as group by is added, the query no longer works.
Eg:
This works:
{noformat}SELECT count(distinct orderid)
,count(DISTINCT hash_tid) AS transactions
,sum(sales)
FROM ImpressionClickTransactionProcessDateDetail
WHERE process_date_key >= '2016-05-22'
AND process_date_key < '2016-05-23'
AND advertiser_key = 2149{noformat}
This does not work:
{noformat}
SELECT process_date_key
,count(distinct orderid)
,count(DISTINCT hash_tid) AS transactions
,sum(sales)
FROM ImpressionClickTransactionProcessDateDetail
WHERE process_date_key >= '2016-05-22'
AND process_date_key < '2016-05-23'
AND advertiser_key = 2149
group by process_date_key
{noformat}
Multiple Count Distinct Columns Fails for Impala
------------------------------------------------
Key: TEIID-3743
URL:
https://issues.jboss.org/browse/TEIID-3743
Project: Teiid
Issue Type: Bug
Components: Misc. Connectors
Affects Versions: 8.11.4
Reporter: Scott Wallace
Assignee: Steven Hawkins
Fix For: 8.12
Teiid Impala translator incorrectly allows multiple count distinct columns, which fails
to execute since it is not supported by Impala.
Per Cloudera documentation:
{quote}By default, Impala only allows a single COUNT(DISTINCT columns) expression in each
query.
To produce the same result as multiple COUNT(DISTINCT) expressions, you can use the
following technique for queries involving a single table:
select v1.c1 result1, v2.c1 result2 from (select count(distinct col1) as c1 from t1) v1
cross join (select count(distinct col2) as c1 from t1) v2;{quote}
The Teiid-Impala translator should ideally rewrite the query as necessary so that the
query does not fail executing multiple count distincts against Impala.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)