[teiid-issues] [JBoss JIRA] (TEIID-4553) COUNT returns inconsistent results when column statistics are not gathered

Steven Hawkins (JIRA) issues at jboss.org
Wed Dec 6 10:33:19 EST 2017


     [ https://issues.jboss.org/browse/TEIID-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Hawkins closed TEIID-4553.
---------------------------------


> COUNT returns inconsistent results when column statistics are not gathered
> --------------------------------------------------------------------------
>
>                 Key: TEIID-4553
>                 URL: https://issues.jboss.org/browse/TEIID-4553
>             Project: Teiid
>          Issue Type: Bug
>          Components: Query Engine
>    Affects Versions: 8.7
>            Reporter: Salvatore R
>            Assignee: Steven Hawkins
>            Priority: Blocker
>             Fix For: 9.0.6, 9.1.2, 9.2, 8.12.9.6_3
>
>
> I am experiencing a weird behavior in 9.1.0-Final. Depending if column statistics are gathered or not, a query with COUNT(1) generates different query plans and returns different results.
> For example, I defined two simple tables ds1.test_count_1 and ds2.test_count_2 having just a single column and one row with value 1.
> The following query:
> {code:sql}
> select count(1) from ds1.test_count_1 t1 join ds2.test_count_2 t2 on t1.a=t2.a group by t1.a
> {code}
> returns 4 with statistics and 2 without.
> This is the plan generated when statistics are gathered (the correct one):
> {code:sql}
> ProjectNode
>   + Relational Node ID:1
>   + Output Columns:expr1 (integer)
>   + Statistics:
>     0: Node Output Rows: 1
>     1: Node Next Batch Process Time: 0
>     2: Node Cumulative Next Batch Process Time: 1
>     3: Node Cumulative Process Time: 3
>     4: Node Next Batch Calls: 2
>     5: Node Blocks: 1
>   + Cost Estimates:Estimated Node Cardinality: 1.5
>   + Child 0:
>     GroupingNode
>       + Relational Node ID:2
>       + Output Columns:agg0 (integer)
>       + Statistics:
>         0: Node Output Rows: 1
>         1: Node Next Batch Process Time: 0
>         2: Node Cumulative Next Batch Process Time: 1
>         3: Node Cumulative Process Time: 3
>         4: Node Next Batch Calls: 2
>         5: Node Blocks: 1
>       + Cost Estimates:Estimated Node Cardinality: 1.5
>       + Child 0:
>         JoinNode
>           + Relational Node ID:3
>           + Output Columns:a (string)
>           + Statistics:
>             0: Node Output Rows: 4
>             1: Node Next Batch Process Time: 0
>             2: Node Cumulative Next Batch Process Time: 1
>             3: Node Cumulative Process Time: 3
>             4: Node Next Batch Calls: 2
>             5: Node Blocks: 1
>           + Cost Estimates:Estimated Node Cardinality: 2.8284268
>           + Child 0:
>             AccessNode
>               + Relational Node ID:4
>               + Output Columns:a (string)
>               + Statistics:
>                 0: Node Output Rows: 2
>                 1: Node Next Batch Process Time: 0
>                 2: Node Cumulative Next Batch Process Time: 0
>                 3: Node Cumulative Process Time: 2
>                 4: Node Next Batch Calls: 3
>                 5: Node Blocks: 2
>               + Cost Estimates:Estimated Node Cardinality: 2.0
>               + Query:SELECT g_0.a FROM ds1.test_count_1 AS g_0
>               + Model Name:ds1
>           + Child 1:
>             AccessNode
>               + Relational Node ID:5
>               + Output Columns:a (string)
>               + Statistics:
>                 0: Node Output Rows: 2
>                 1: Node Next Batch Process Time: 1
>                 2: Node Cumulative Next Batch Process Time: 1
>                 3: Node Cumulative Process Time: 3
>                 4: Node Next Batch Calls: 2
>                 5: Node Blocks: 1
>               + Cost Estimates:Estimated Node Cardinality: 2.0
>               + Query:SELECT g_0.a FROM ds2.test_count_2 AS g_0
>               + Model Name:ds2
>           + Join Strategy:ENHANCED SORT JOIN (SORT/SORT)
>           + Join Type:INNER JOIN
>           + Join Criteria:t1.a=t2.a
>       + Grouping Columns:t1.a
>       + Grouping Mapping:
>         0: anon_grp0.gcol0=t1.a
>         1: anon_grp0.agg0=COUNT(1)
>       + Sort Mode:false
>   + Select Columns:anon_grp0.agg0 AS expr1
>   + Data Bytes Sent:17
>   + Planning Time:3
> {code}
> and this is the one without statistics:
> {code:sql}
> ProjectNode
>   + Relational Node ID:2
>   + Output Columns:expr1 (integer)
>   + Statistics:
>     0: Node Output Rows: 1
>     1: Node Next Batch Process Time: 0
>     2: Node Cumulative Next Batch Process Time: 1
>     3: Node Cumulative Process Time: 3
>     4: Node Next Batch Calls: 2
>     5: Node Blocks: 1
>   + Cost Estimates:Estimated Node Cardinality: -1.0
>   + Child 0:
>     GroupingNode
>       + Relational Node ID:3
>       + Output Columns:agg0 (integer)
>       + Statistics:
>         0: Node Output Rows: 1
>         1: Node Next Batch Process Time: 0
>         2: Node Cumulative Next Batch Process Time: 1
>         3: Node Cumulative Process Time: 3
>         4: Node Next Batch Calls: 2
>         5: Node Blocks: 1
>       + Cost Estimates:Estimated Node Cardinality: -1.0
>       + Child 0:
>         JoinNode
>           + Relational Node ID:4
>           + Output Columns:gcol0 (string)
>           + Statistics:
>             0: Node Output Rows: 2
>             1: Node Next Batch Process Time: 0
>             2: Node Cumulative Next Batch Process Time: 1
>             3: Node Cumulative Process Time: 3
>             4: Node Next Batch Calls: 2
>             5: Node Blocks: 1
>           + Cost Estimates:Estimated Node Cardinality: -1.0
>           + Child 0:
>             AccessNode
>               + Relational Node ID:5
>               + Output Columns:gcol0 (string)
>               + Statistics:
>                 0: Node Output Rows: 1
>                 1: Node Next Batch Process Time: 1
>                 2: Node Cumulative Next Batch Process Time: 1
>                 3: Node Cumulative Process Time: 3
>                 4: Node Next Batch Calls: 3
>                 5: Node Blocks: 2
>               + Cost Estimates:Estimated Node Cardinality: -1.0
>               + Query:SELECT g_0.a FROM ds1.test_count_1 AS g_0 GROUP BY g_0.a
>               + Model Name:ds1
>           + Child 1:
>             AccessNode
>               + Relational Node ID:6
>               + Output Columns:a (string)
>               + Statistics:
>                 0: Node Output Rows: 2
>                 1: Node Next Batch Process Time: 0
>                 2: Node Cumulative Next Batch Process Time: 0
>                 3: Node Cumulative Process Time: 3
>                 4: Node Next Batch Calls: 2
>                 5: Node Blocks: 1
>               + Cost Estimates:Estimated Node Cardinality: -1.0
>               + Query:SELECT g_0.a FROM ds2.test_count_2 AS g_0
>               + Model Name:ds2
>           + Join Strategy:ENHANCED SORT JOIN (SORT/SORT)
>           + Join Type:INNER JOIN
>           + Join Criteria:anon_grp1.gcol0=t2.a
>       + Grouping Columns:anon_grp1.gcol0
>       + Grouping Mapping:
>         0: anon_grp0.gcol0=anon_grp1.gcol0
>         1: anon_grp0.agg0=COUNT(1)
>       + Sort Mode:false
>   + Select Columns:anon_grp0.agg0 AS expr1
>   + Data Bytes Sent:17
>   + Planning Time:4
> {code}



--
This message was sent by Atlassian JIRA
(v7.5.0#75005)


More information about the teiid-issues mailing list