[JBoss JIRA] (TEIID-5354) Different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-5354?page=com.atlassian.jira.plugin... ]
Steven Hawkins commented on TEIID-5354:
---------------------------------------
Yes the partial aggregation logic is not considering the break down of an aggregate on a literal when there are other other decompositions. I'll have this addressed today.
> Different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: TEIID-5354
> URL: https://issues.jboss.org/browse/TEIID-5354
> Project: Teiid
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: 10.2
> Environment: teiid-10.2.0 on WildFly Full 11.0.0.Final (WildFly Core 3.0.8.Final)
> Reporter: dalex dalex
> Assignee: Steven Hawkins
> Priority: Blocker
> Attachments: address_pg.sql, stateprovince_mysql.sql
>
>
> Teiid returns different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved.
> There are two problems: a) incorrect check in MergeJoinStrategy.compareToPrevious method which generates TEIID31202 exception and b) a bug in algorithm of join itself which leads to incorrect results.
> To reproduce the bug, please, run the following queries and compare theSum and theCount column values:
> {code:sql}
> select r.city,
> sum(1) as theSum
> ,count(*) as theCount
> from "dsp.address_pg" r
> left join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> group by r.city
> order by r.city ;;
> {code}
> {code:sql}
> select r.city,
> sum(1) as theSum
> ,count(*) as theCount
> from "dsp.address_pg" r
> inner join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> group by r.city
> order by r.city ;;
> {code}
> but the following queries return correct results:
> {code:sql}
> select r.city,
> sum(1) as theSum
> ,count(*) as theCount
> from "dsp.address_pg" r
> right join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> group by r.city
> order by r.city ;;
> {code}
> the second one is also correct (LEFT JOIN) but uses window function:
> {code:sql}
> select distinct city,
> sum(1) OVER (PARTITION BY city) as theSum
> ,count(*) OVER (PARTITION BY city) as theCount
> from "dsp.address_pg" r
> left join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> order by r.city ;;
> {code}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
6 years, 7 months
[JBoss JIRA] (TEIID-5354) Different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-5354?page=com.atlassian.jira.plugin... ]
Steven Hawkins commented on TEIID-5354:
---------------------------------------
> 4. Comment out the line in MergeJoinStrategy.compareToPrevious method
This was corrected by TEIID-5339
I should be able to access the rest shortly.
> Different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: TEIID-5354
> URL: https://issues.jboss.org/browse/TEIID-5354
> Project: Teiid
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: 10.2
> Environment: teiid-10.2.0 on WildFly Full 11.0.0.Final (WildFly Core 3.0.8.Final)
> Reporter: dalex dalex
> Assignee: Steven Hawkins
> Priority: Blocker
> Attachments: address_pg.sql, stateprovince_mysql.sql
>
>
> Teiid returns different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved.
> There are two problems: a) incorrect check in MergeJoinStrategy.compareToPrevious method which generates TEIID31202 exception and b) a bug in algorithm of join itself which leads to incorrect results.
> To reproduce the bug, please, run the following queries and compare theSum and theCount column values:
> {code:sql}
> select r.city,
> sum(1) as theSum
> ,count(*) as theCount
> from "dsp.address_pg" r
> left join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> group by r.city
> order by r.city ;;
> {code}
> {code:sql}
> select r.city,
> sum(1) as theSum
> ,count(*) as theCount
> from "dsp.address_pg" r
> inner join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> group by r.city
> order by r.city ;;
> {code}
> but the following queries return correct results:
> {code:sql}
> select r.city,
> sum(1) as theSum
> ,count(*) as theCount
> from "dsp.address_pg" r
> right join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> group by r.city
> order by r.city ;;
> {code}
> the second one is also correct (LEFT JOIN) but uses window function:
> {code:sql}
> select distinct city,
> sum(1) OVER (PARTITION BY city) as theSum
> ,count(*) OVER (PARTITION BY city) as theCount
> from "dsp.address_pg" r
> left join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> order by r.city ;;
> {code}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
6 years, 7 months
[JBoss JIRA] (TEIID-5354) Different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-5354?page=com.atlassian.jira.plugin... ]
Work on TEIID-5354 started by Steven Hawkins.
---------------------------------------------
> Different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: TEIID-5354
> URL: https://issues.jboss.org/browse/TEIID-5354
> Project: Teiid
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: 10.2
> Environment: teiid-10.2.0 on WildFly Full 11.0.0.Final (WildFly Core 3.0.8.Final)
> Reporter: dalex dalex
> Assignee: Steven Hawkins
> Priority: Blocker
> Attachments: address_pg.sql, stateprovince_mysql.sql
>
>
> Teiid returns different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved.
> There are two problems: a) incorrect check in MergeJoinStrategy.compareToPrevious method which generates TEIID31202 exception and b) a bug in algorithm of join itself which leads to incorrect results.
> To reproduce the bug, please, run the following queries and compare theSum and theCount column values:
> {code:sql}
> select r.city,
> sum(1) as theSum
> ,count(*) as theCount
> from "dsp.address_pg" r
> left join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> group by r.city
> order by r.city ;;
> {code}
> {code:sql}
> select r.city,
> sum(1) as theSum
> ,count(*) as theCount
> from "dsp.address_pg" r
> inner join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> group by r.city
> order by r.city ;;
> {code}
> but the following queries return correct results:
> {code:sql}
> select r.city,
> sum(1) as theSum
> ,count(*) as theCount
> from "dsp.address_pg" r
> right join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> group by r.city
> order by r.city ;;
> {code}
> the second one is also correct (LEFT JOIN) but uses window function:
> {code:sql}
> select distinct city,
> sum(1) OVER (PARTITION BY city) as theSum
> ,count(*) OVER (PARTITION BY city) as theCount
> from "dsp.address_pg" r
> left join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
> order by r.city ;;
> {code}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
6 years, 7 months
[JBoss JIRA] (TEIID-5354) Different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved
by dalex dalex (JIRA)
dalex dalex created TEIID-5354:
----------------------------------
Summary: Different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved
Key: TEIID-5354
URL: https://issues.jboss.org/browse/TEIID-5354
Project: Teiid
Issue Type: Bug
Components: Query Engine
Affects Versions: 10.2
Environment: teiid-10.2.0 on WildFly Full 11.0.0.Final (WildFly Core 3.0.8.Final)
Reporter: dalex dalex
Assignee: Steven Hawkins
Priority: Blocker
Attachments: address_pg.sql, stateprovince_mysql.sql
Teiid returns different results for a query with GROUP BY and SUM(1) (summation of the number 1) when there is a LEFT or INNER JOIN involved.
There are two problems: a) incorrect check in MergeJoinStrategy.compareToPrevious method which generates TEIID31202 exception and b) a bug in algorithm of join itself which leads to incorrect results.
To reproduce the bug, please, run the following queries and compare theSum and theCount column values:
{code:sql}
select r.city,
sum(1) as theSum
,count(*) as theCount
from "dsp.address_pg" r
left join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
group by r.city
order by r.city ;;
{code}
{code:sql}
select r.city,
sum(1) as theSum
,count(*) as theCount
from "dsp.address_pg" r
inner join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
group by r.city
order by r.city ;;
{code}
but the following queries return correct results:
{code:sql}
select r.city,
sum(1) as theSum
,count(*) as theCount
from "dsp.address_pg" r
right join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
group by r.city
order by r.city ;;
{code}
the second one is also correct (LEFT JOIN) but uses window function:
{code:sql}
select distinct city,
sum(1) OVER (PARTITION BY city) as theSum
,count(*) OVER (PARTITION BY city) as theCount
from "dsp.address_pg" r
left join "adventureworks.stateprovince" c on "r.stateprovinceid" = "c.stateprovinceid"
order by r.city ;;
{code}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
6 years, 7 months
[JBoss JIRA] (TEIID-5353) Prepared Statement params are not pre-evaluated
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-5353?page=com.atlassian.jira.plugin... ]
Steven Hawkins resolved TEIID-5353.
-----------------------------------
Fix Version/s: 11.0
Resolution: Done
It takes a fairly specific set of circumstances for this as more often than not the predicate ends up associated with an access node that will be pre-evaluated.
> Prepared Statement params are not pre-evaluated
> -----------------------------------------------
>
> Key: TEIID-5353
> URL: https://issues.jboss.org/browse/TEIID-5353
> Project: Teiid
> Issue Type: Bug
> Affects Versions: 8.12.12.6_4
> Reporter: Debbie Steigner
> Assignee: Steven Hawkins
> Fix For: 11.0
>
>
> prepared statements, [1] with both criteria as parameters for the prepared statement, the predicates are not evaluated until after batches are pulled. But for [2] if the SEARCH='FALSE' is in the query and not a param, it is pre-evaluated and we only run the one side of the union.
> [1]
> sql = "select * from " +
> "Inventory_Detail" +
> " WHERE SEARCH = (?) and ITEM_ID in (?) OFFSET 0 ROWS FETCH NEXT 500 ROWS ONLY";
> PreparedStatement preparedStatement = conn.prepareStatement(sql);
> preparedStatement.setString(1,"FALSE");
> preparedStatement.setString(2,"1005014161091");
> ResultSet rs = null;
> [2]
> sql = "select * from " +
> "Inventory_Detail" +
> " WHERE SEARCH = 'FALSE' and ITEM_ID in (?) OFFSET 0 ROWS FETCH NEXT 500 ROWS ONLY";
> PreparedStatement preparedStatement = conn.prepareStatement(sql);
> preparedStatement.setString(1,"1005014161091");
> ResultSet rs = null;
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
6 years, 7 months
[JBoss JIRA] (TEIID-5353) Prepared Statement params are not pre-evaluated
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-5353?page=com.atlassian.jira.plugin... ]
Steven Hawkins updated TEIID-5353:
----------------------------------
Description:
prepared statements, [1] with both criteria as parameters for the prepared statement, the predicates are not evaluated until after batches are pulled. But for [2] if the SEARCH='FALSE' is in the query and not a param, it is pre-evaluated and we only run the one side of the union.
[1]
sql = "select * from " +
"Inventory_Detail" +
" WHERE SEARCH = (?) and ITEM_ID in (?) OFFSET 0 ROWS FETCH NEXT 500 ROWS ONLY";
PreparedStatement preparedStatement = conn.prepareStatement(sql);
preparedStatement.setString(1,"FALSE");
preparedStatement.setString(2,"1005014161091");
ResultSet rs = null;
[2]
sql = "select * from " +
"Inventory_Detail" +
" WHERE SEARCH = 'FALSE' and ITEM_ID in (?) OFFSET 0 ROWS FETCH NEXT 500 ROWS ONLY";
PreparedStatement preparedStatement = conn.prepareStatement(sql);
preparedStatement.setString(1,"1005014161091");
ResultSet rs = null;
was:
prepared statements, [1] with both criteria as parameters for the prepared statement, it appears that we are ignoring the SEARCH criteria because we are running both sides of the union. But for [2] if the SEARCH='FALSE' is in the query and not a param, we process as expected and only run the one side of the union.
[1]
sql = "select * from " +
"Inventory_Detail" +
" WHERE SEARCH = (?) and ITEM_ID in (?) OFFSET 0 ROWS FETCH NEXT 500 ROWS ONLY";
PreparedStatement preparedStatement = conn.prepareStatement(sql);
preparedStatement.setString(1,"FALSE");
preparedStatement.setString(2,"1005014161091");
ResultSet rs = null;
[2]
sql = "select * from " +
"Inventory_Detail" +
" WHERE SEARCH = 'FALSE' and ITEM_ID in (?) OFFSET 0 ROWS FETCH NEXT 500 ROWS ONLY";
PreparedStatement preparedStatement = conn.prepareStatement(sql);
preparedStatement.setString(1,"1005014161091");
ResultSet rs = null;
> Prepared Statement params are not pre-evaluated
> -----------------------------------------------
>
> Key: TEIID-5353
> URL: https://issues.jboss.org/browse/TEIID-5353
> Project: Teiid
> Issue Type: Bug
> Affects Versions: 8.12.12.6_4
> Reporter: Debbie Steigner
> Assignee: Steven Hawkins
>
> prepared statements, [1] with both criteria as parameters for the prepared statement, the predicates are not evaluated until after batches are pulled. But for [2] if the SEARCH='FALSE' is in the query and not a param, it is pre-evaluated and we only run the one side of the union.
> [1]
> sql = "select * from " +
> "Inventory_Detail" +
> " WHERE SEARCH = (?) and ITEM_ID in (?) OFFSET 0 ROWS FETCH NEXT 500 ROWS ONLY";
> PreparedStatement preparedStatement = conn.prepareStatement(sql);
> preparedStatement.setString(1,"FALSE");
> preparedStatement.setString(2,"1005014161091");
> ResultSet rs = null;
> [2]
> sql = "select * from " +
> "Inventory_Detail" +
> " WHERE SEARCH = 'FALSE' and ITEM_ID in (?) OFFSET 0 ROWS FETCH NEXT 500 ROWS ONLY";
> PreparedStatement preparedStatement = conn.prepareStatement(sql);
> preparedStatement.setString(1,"1005014161091");
> ResultSet rs = null;
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
6 years, 7 months