[teiid-issues] [JBoss JIRA] (TEIID-4553) COUNT returns inconsistent results when column statistics are not gathered
Johnathon Lee (JIRA)
issues at jboss.org
Wed Dec 14 13:02:01 EST 2016
[ https://issues.jboss.org/browse/TEIID-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Johnathon Lee updated TEIID-4553:
---------------------------------
Fix Version/s: 8.12.9.6_3
> COUNT returns inconsistent results when column statistics are not gathered
> --------------------------------------------------------------------------
>
> Key: TEIID-4553
> URL: https://issues.jboss.org/browse/TEIID-4553
> Project: Teiid
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: 8.7
> Reporter: Salvatore R
> Assignee: Steven Hawkins
> Priority: Blocker
> Fix For: 9.0.6, 9.1.2, 9.2, 8.12.9.6_3
>
>
> I am experiencing a weird behavior in 9.1.0-Final. Depending if column statistics are gathered or not, a query with COUNT(1) generates different query plans and returns different results.
> For example, I defined two simple tables ds1.test_count_1 and ds2.test_count_2 having just a single column and one row with value 1.
> The following query:
> {code:sql}
> select count(1) from ds1.test_count_1 t1 join ds2.test_count_2 t2 on t1.a=t2.a group by t1.a
> {code}
> returns 4 with statistics and 2 without.
> This is the plan generated when statistics are gathered (the correct one):
> {code:sql}
> ProjectNode
> + Relational Node ID:1
> + Output Columns:expr1 (integer)
> + Statistics:
> 0: Node Output Rows: 1
> 1: Node Next Batch Process Time: 0
> 2: Node Cumulative Next Batch Process Time: 1
> 3: Node Cumulative Process Time: 3
> 4: Node Next Batch Calls: 2
> 5: Node Blocks: 1
> + Cost Estimates:Estimated Node Cardinality: 1.5
> + Child 0:
> GroupingNode
> + Relational Node ID:2
> + Output Columns:agg0 (integer)
> + Statistics:
> 0: Node Output Rows: 1
> 1: Node Next Batch Process Time: 0
> 2: Node Cumulative Next Batch Process Time: 1
> 3: Node Cumulative Process Time: 3
> 4: Node Next Batch Calls: 2
> 5: Node Blocks: 1
> + Cost Estimates:Estimated Node Cardinality: 1.5
> + Child 0:
> JoinNode
> + Relational Node ID:3
> + Output Columns:a (string)
> + Statistics:
> 0: Node Output Rows: 4
> 1: Node Next Batch Process Time: 0
> 2: Node Cumulative Next Batch Process Time: 1
> 3: Node Cumulative Process Time: 3
> 4: Node Next Batch Calls: 2
> 5: Node Blocks: 1
> + Cost Estimates:Estimated Node Cardinality: 2.8284268
> + Child 0:
> AccessNode
> + Relational Node ID:4
> + Output Columns:a (string)
> + Statistics:
> 0: Node Output Rows: 2
> 1: Node Next Batch Process Time: 0
> 2: Node Cumulative Next Batch Process Time: 0
> 3: Node Cumulative Process Time: 2
> 4: Node Next Batch Calls: 3
> 5: Node Blocks: 2
> + Cost Estimates:Estimated Node Cardinality: 2.0
> + Query:SELECT g_0.a FROM ds1.test_count_1 AS g_0
> + Model Name:ds1
> + Child 1:
> AccessNode
> + Relational Node ID:5
> + Output Columns:a (string)
> + Statistics:
> 0: Node Output Rows: 2
> 1: Node Next Batch Process Time: 1
> 2: Node Cumulative Next Batch Process Time: 1
> 3: Node Cumulative Process Time: 3
> 4: Node Next Batch Calls: 2
> 5: Node Blocks: 1
> + Cost Estimates:Estimated Node Cardinality: 2.0
> + Query:SELECT g_0.a FROM ds2.test_count_2 AS g_0
> + Model Name:ds2
> + Join Strategy:ENHANCED SORT JOIN (SORT/SORT)
> + Join Type:INNER JOIN
> + Join Criteria:t1.a=t2.a
> + Grouping Columns:t1.a
> + Grouping Mapping:
> 0: anon_grp0.gcol0=t1.a
> 1: anon_grp0.agg0=COUNT(1)
> + Sort Mode:false
> + Select Columns:anon_grp0.agg0 AS expr1
> + Data Bytes Sent:17
> + Planning Time:3
> {code}
> and this is the one without statistics:
> {code:sql}
> ProjectNode
> + Relational Node ID:2
> + Output Columns:expr1 (integer)
> + Statistics:
> 0: Node Output Rows: 1
> 1: Node Next Batch Process Time: 0
> 2: Node Cumulative Next Batch Process Time: 1
> 3: Node Cumulative Process Time: 3
> 4: Node Next Batch Calls: 2
> 5: Node Blocks: 1
> + Cost Estimates:Estimated Node Cardinality: -1.0
> + Child 0:
> GroupingNode
> + Relational Node ID:3
> + Output Columns:agg0 (integer)
> + Statistics:
> 0: Node Output Rows: 1
> 1: Node Next Batch Process Time: 0
> 2: Node Cumulative Next Batch Process Time: 1
> 3: Node Cumulative Process Time: 3
> 4: Node Next Batch Calls: 2
> 5: Node Blocks: 1
> + Cost Estimates:Estimated Node Cardinality: -1.0
> + Child 0:
> JoinNode
> + Relational Node ID:4
> + Output Columns:gcol0 (string)
> + Statistics:
> 0: Node Output Rows: 2
> 1: Node Next Batch Process Time: 0
> 2: Node Cumulative Next Batch Process Time: 1
> 3: Node Cumulative Process Time: 3
> 4: Node Next Batch Calls: 2
> 5: Node Blocks: 1
> + Cost Estimates:Estimated Node Cardinality: -1.0
> + Child 0:
> AccessNode
> + Relational Node ID:5
> + Output Columns:gcol0 (string)
> + Statistics:
> 0: Node Output Rows: 1
> 1: Node Next Batch Process Time: 1
> 2: Node Cumulative Next Batch Process Time: 1
> 3: Node Cumulative Process Time: 3
> 4: Node Next Batch Calls: 3
> 5: Node Blocks: 2
> + Cost Estimates:Estimated Node Cardinality: -1.0
> + Query:SELECT g_0.a FROM ds1.test_count_1 AS g_0 GROUP BY g_0.a
> + Model Name:ds1
> + Child 1:
> AccessNode
> + Relational Node ID:6
> + Output Columns:a (string)
> + Statistics:
> 0: Node Output Rows: 2
> 1: Node Next Batch Process Time: 0
> 2: Node Cumulative Next Batch Process Time: 0
> 3: Node Cumulative Process Time: 3
> 4: Node Next Batch Calls: 2
> 5: Node Blocks: 1
> + Cost Estimates:Estimated Node Cardinality: -1.0
> + Query:SELECT g_0.a FROM ds2.test_count_2 AS g_0
> + Model Name:ds2
> + Join Strategy:ENHANCED SORT JOIN (SORT/SORT)
> + Join Type:INNER JOIN
> + Join Criteria:anon_grp1.gcol0=t2.a
> + Grouping Columns:anon_grp1.gcol0
> + Grouping Mapping:
> 0: anon_grp0.gcol0=anon_grp1.gcol0
> 1: anon_grp0.agg0=COUNT(1)
> + Sort Mode:false
> + Select Columns:anon_grp0.agg0 AS expr1
> + Data Bytes Sent:17
> + Planning Time:4
> {code}
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
More information about the teiid-issues
mailing list