[teiid-issues] [JBoss JIRA] (TEIID-5354) Different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved

dalex dalex (JIRA) issues at jboss.org
Tue May 29 07:01:00 EDT 2018


dalex dalex created TEIID-5354:
----------------------------------

             Summary: Different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved
                 Key: TEIID-5354
                 URL: https://issues.jboss.org/browse/TEIID-5354
             Project: Teiid
          Issue Type: Bug
          Components: Query Engine
    Affects Versions: 10.2
         Environment: teiid-10.2.0 on WildFly Full 11.0.0.Final (WildFly Core 3.0.8.Final)
            Reporter: dalex dalex
            Assignee: Steven Hawkins
            Priority: Blocker
         Attachments: address_pg.sql, stateprovince_mysql.sql

Teiid returns different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved.
There are two problems: a) incorrect check in MergeJoinStrategy.compareToPrevious method which generates TEIID31202 exception and b) a bug in algorithm of join itself which leads to incorrect results.
To reproduce the bug, please, run the following queries and compare theSum and theCount column values:
{code:sql}
select  r.city,
		sum(1) as theSum
		,count(*) as theCount
from "dsp.address_pg" r
	left join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid" 
group by r.city
order by r.city ;;
{code}

{code:sql}
select  r.city,
		sum(1) as theSum
		,count(*) as theCount
from "dsp.address_pg" r
	inner join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid" 
group by r.city
order by r.city ;;
{code}

but the following queries return correct results:
{code:sql}
select  r.city,
		sum(1) as theSum
		,count(*) as theCount
from "dsp.address_pg" r
	right join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid" 
group by r.city
order by r.city ;;
{code}

the second one is also correct (LEFT JOIN) but uses window function:
{code:sql}
select  distinct city,
		sum(1) OVER (PARTITION BY city) as theSum
		,count(*) OVER (PARTITION BY city) as theCount
from "dsp.address_pg" r
	left join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid" 
order by r.city ;;
{code}



--
This message was sent by Atlassian JIRA
(v7.5.0#75005)


More information about the teiid-issues mailing list