]
Work on TEIID-5354 started by Steven Hawkins.
---------------------------------------------
Different results for a query with GROUP BY and SUM(1) (summation of
the number 1) when there is a LEFT or INNER JOIN involved
------------------------------------------------------------------------------------------------------------------------------
Key: TEIID-5354
URL:
https://issues.jboss.org/browse/TEIID-5354
Project: Teiid
Issue Type: Bug
Components: Query Engine
Affects Versions: 10.2
Environment: teiid-10.2.0 on WildFly Full 11.0.0.Final (WildFly Core
3.0.8.Final)
Reporter: dalex dalex
Assignee: Steven Hawkins
Priority: Blocker
Attachments: address_pg.sql, stateprovince_mysql.sql
Teiid returns different results for a query with GROUP BY and SUM(1) (summation of the
number 1) when there is a LEFT or INNER JOIN involved.
There are two problems: a) incorrect check in MergeJoinStrategy.compareToPrevious method
which generates TEIID31202 exception and b) a bug in algorithm of join itself which leads
to incorrect results.
To reproduce the bug, please, run the following queries and compare theSum and theCount
column values:
{code:sql}
select r.city,
sum(1) as theSum
,count(*) as theCount
from "dsp.address_pg" r
left join "adventureworks.stateprovince" c on "r.stateprovinceid" =
"c.stateprovinceid"
group by r.city
order by r.city ;;
{code}
{code:sql}
select r.city,
sum(1) as theSum
,count(*) as theCount
from "dsp.address_pg" r
inner join "adventureworks.stateprovince" c on "r.stateprovinceid" =
"c.stateprovinceid"
group by r.city
order by r.city ;;
{code}
but the following queries return correct results:
{code:sql}
select r.city,
sum(1) as theSum
,count(*) as theCount
from "dsp.address_pg" r
right join "adventureworks.stateprovince" c on "r.stateprovinceid" =
"c.stateprovinceid"
group by r.city
order by r.city ;;
{code}
the second one is also correct (LEFT JOIN) but uses window function:
{code:sql}
select distinct city,
sum(1) OVER (PARTITION BY city) as theSum
,count(*) OVER (PARTITION BY city) as theCount
from "dsp.address_pg" r
left join "adventureworks.stateprovince" c on "r.stateprovinceid" =
"c.stateprovinceid"
order by r.city ;;
{code}