[teiid-issues] [JBoss JIRA] (TEIID-3748) Impala translator - SELECT and HAVING statements are translating differently for Case statements

Thu Oct 29 19:08:00 EDT 2015

    [ https://issues.jboss.org/browse/TEIID-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123590#comment-13123590 ] 

Steven Hawkins commented on TEIID-3748:
---------------------------------------

This is not reproducible for me.  There is no logic anywhere in the impala or general jdbc translator that will take the Teiid sql source query:

SELECT g_0.publisher_key, SUM(g_0.num_clicks), COUNT(DISTINCT CASE WHEN g_0.order_pub_comm_base >= 0 THEN g_0.orderid END) FROM fact_activity_advertiser_trans_date.fact_activity_advertiser_trans_date AS g_0 WHERE (g_0.trans_date_key >= '2015-09-01') AND (g_0.trans_date_key <= '2015-09-10') AND (g_0.advertiser_key = 2417) GROUP BY g_0.publisher_key HAVING (SUM(g_0.num_clicks) > 100) AND (COUNT(DISTINCT CASE WHEN g_0.order_pub_comm_base >= 0 THEN g_0.orderid END) = 0)

and insert the else clauses.

It would certainly appear that this and TEIID-3794 are due to a custom translator.  Can you confirm if you are using the standard impala translator?

> Impala translator - SELECT and HAVING statements are translating differently for Case statements
> ------------------------------------------------------------------------------------------------
>
>                 Key: TEIID-3748
>                 URL: https://issues.jboss.org/browse/TEIID-3748
>             Project: Teiid
>          Issue Type: Bug
>          Components: JDBC Connector
>    Affects Versions: 8.11.4
>         Environment: Ubuntu Trusty
>            Reporter: Don Krapohl
>            Assignee: Steven Hawkins
>              Labels: Impala_Translator, Translators
>             Fix For: 8.12.2
>
>         Attachments: server.log
>
>
> Error from Impala-
> all DISTINCT aggregate functions need to have the same set of parameters as count(DISTINCT (CASE WHEN (secondcol >= 0) THEN 1 ELSE CAST(NULL AS STRING) END))
> deviating function: count(DISTINCT (CASE WHEN (secondcol >= 0) THEN 1 ELSE NULL END))
> Query:
> SELECT user_key, sum(firstcol),count(distinct case when secondcol >= 0 then 1 end) 
> FROM sometable 
> WHERE customer_key=6
> GROUP BY user_key 
> HAVING sum(firstcol)>100 
> 	AND count(distinct case when secondcol >= 0 then 1 end)=0
> 	
> Query explanation:
> For all users
> Add up values in the firstcol column (integer column)
> count distinct values in secondcol where secondcol value zero or more
> 	otherwise return null (output is string)
> Translated Teiid query:
> SELECT user_key, SUM(firstcol) as `EXPR_0`, COUNT(DISTINCT (CASE WHEN (secondcol >= 0) THEN '1' ELSE CAST(NULL AS STRING) END)) as `EXPR_1`
> FROM sometable 
> WHERE customer_key` = 6
> HAVING (EXPR_0 > 100) AND (COUNT(DISTINCT (CASE WHEN (secondcol >= 0) THEN '1' ELSE NULL END)) = 0))
> Note the difference between the select and having for EXPR_1:
> Select - THEN '1' ELSE CAST(NULL AS STRING) END
> Having - THEN '1' ELSE NULL END
> Impala doesn't accept that these are the same aggregate function.  Aliases aren't accepted in the HAVING.
> One further observation- if we swap the translation and write the statement in the select as 
> COUNT(DISTINCT (CASE WHEN (secondcol >= 0) THEN '1' *ELSE NULL END*))
> Teiid translates the SELECT to
> COUNT(DISTINCT (CASE WHEN (secondcol >= 0) THEN '1' *ELSE CAST(NULL AS STRING) END*))
> So it always makes these mismatched.

--
This message was sent by Atlassian JIRA
(v6.4.11#64026)