[JBoss JIRA] (TEIID-4227) Impala Translator - Support Multiple Count Distinct with Group By - teiid-issues

Thursday, 26 May 2016

    [
https://issues.jboss.org/browse/TEIID-4227?page=com.atlassian.jira.plugin...
] 

Steven Hawkins commented on TEIID-4227:
---------------------------------------

...
 Hi Steven Hawkins an approximate count distinct isn't going to
work for us as our reporting requires exact results. 
That's unfortunate as that would greatly simply things.  Do you have an idea of the
performance implications here?  That is there a threshold at which it's better for
Teiid to perform the aggregation?  For example would it be possible to test with a patched
or other translator modification that disables group by support?

...
 One solution could be to use NVL around the join column 
That does not work as you can conflict with actual values.

...
 Impala Translator - Support Multiple Count Distinct with Group By
 -----------------------------------------------------------------

                 Key: TEIID-4227
                 URL: https://issues.jboss.org/browse/TEIID-4227
             Project: Teiid
          Issue Type: Feature Request
          Components: Misc. Connectors
    Affects Versions: 8.13.3
            Reporter: Scott Wallace
            Assignee: Steven Hawkins

 This is somewhat related to TEIID-3743, which added support of multiple count distinct
metrics. We have a requirement to allow multiple count distinct metrics grouped by common
attributes.
 Currently, we get a failure as follows:
 {noformat}org.teiid.runtime.client.TeiidClientException: java.lang.RuntimeException:
Remote org.teiid.core.TeiidProcessingException: TEIID30504
vw_impression_click_transaction_process_date_detail: 500051 TEIID11008:TEIID11004 Error
executing statement(s): [Prepared Values: [] SQL: SELECT g_0.process_date_key,
SUM(g_0.num_clicks), SUM(g_0.num_impressions), COUNT(DISTINCT g_0.orderid), COUNT(DISTINCT
g_0.`hash_tid`), SUM(g_0.sales) FROM
detail.vw_impression_click_transaction_process_date_detail g_0 WHERE g_0.process_date_key
>= '2016-05-22' AND g_0.process_date_key < '2016-05-23' AND
g_0.advertiser_key = 12345 GROUP BY g_0.process_date_key]
 {noformat}
 To allow this request to succeed, we could have translator write that Impala query as:
 {noformat}
 select nvl(v1.c1,v2.c1), nvl(v1.c2,v1.c2), nvl(v1.c3,v1.c3), v1.c4, v2.c5,
nvl(v1.c6,v1.c6)  from (
 SELECT g_0.process_date_key as c1,
          SUM(g_0.num_clicks) as c2,
          SUM(g_0.num_impressions) as c3,
          COUNT(DISTINCT g_0.orderid) as c4,
          SUM(g_0.sales) as c6
 FROM detail.vw_impression_click_transaction_process_date_detail g_0
 WHERE g_0.process_date_key >= '2016-05-22'
         AND g_0.process_date_key < '2016-05-23'
         AND g_0.advertiser_key = 12345
 GROUP BY  g_0.process_date_key
   ) v1 inner join
   (
     SELECT g_0.process_date_key as c1,
          SUM(g_0.num_clicks) as c2,
          SUM(g_0.num_impressions) as c3,
          COUNT(DISTINCT g_0.`hash_tid`) as c5,
          SUM(g_0.sales) as c6
 FROM detail.vw_impression_click_transaction_process_date_detail g_0
 WHERE g_0.process_date_key >= '2016-05-22'
         AND g_0.process_date_key < '2016-05-23'
         AND g_0.advertiser_key = 12345
 GROUP BY  g_0.process_date_key
    ) v2 on v1.c1 = v2.c1;
 {noformat}
 FYI [~shawkins] 

--
This message was sent by Atlassian JIRA
(v6.4.11#64026)

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

[JBoss JIRA] (TEIID-4227) Impala Translator - Support Multiple Count Distinct with Group By