]
Steven Hawkins logged work on TEIID-5959:
-----------------------------------------
Author: Steven Hawkins
Created on: 28/May/20 10:56 PM
Start Date: 28/May/20 10:56 PM
Worklog Time Spent: 4 hours
Issue Time Tracking
-------------------
Remaining Estimate: 0 minutes (was: 4 hours)
Time Spent: 4 hours
Worklog Id: (was: 12451173)
Teiid doesn't re-optimize select sub-nodes after adding new
criteria
--------------------------------------------------------------------
Key: TEIID-5959
URL:
https://issues.redhat.com/browse/TEIID-5959
Project: Teiid
Issue Type: Quality Risk
Components: Query Engine
Affects Versions: 13.1
Reporter: Dmitrii Pogorelov
Assignee: Steven Hawkins
Priority: Major
Fix For: 15.0
Original Estimate: 4 hours
Time Spent: 4 hours
Remaining Estimate: 0 minutes
It looks like Teiid doesn't re-optimize union nodes after adding new criteria to
their sub-select nodes. For example, for the query (the user has only role1):
{code:sql}
-- QUERY 1
select * from "test_tables_pg.test_a" where hasRole('report-role')
union all
select * from "test_tables_pg2.test_a" where hasRole('role1') ;;
{code}
Teiid will generate the following Query plan:
{code:xml}
AccessNode(0) output=[test_tables_pg2.test_a.a, test_tables_pg2.test_a.b] SELECT g_0.a AS
c_0, g_0.b AS c_1 FROM test_tables_pg2.test_a AS g_0 LIMIT 100
{code}
what is correct because the user has only role1 and Teiid optimizes union removing the
first sub-query as its condition in total is false.
But for this query:
{code:sql}
-- QUERY 2 (adds a where condition)
select * from (
select * from "test_tables_pg.test_a" where hasRole('report-role')
union all
select * from "test_tables_pg2.test_a" where hasRole('role1')) a
where a = 1 ;;
{code}
Teiid will propagate the "a = 1" condition in all union sub-queries and
generate the following query plan:
{code:xml}
LimitNode(0) output=[test_tables_pg.test_a.a, test_tables_pg.test_a.b] limit 100
UnionAllNode(1) output=[test_tables_pg.test_a.a, test_tables_pg.test_a.b]
AccessNode(2) output=[test_tables_pg.test_a.a, test_tables_pg.test_a.b] SELECT g_0.a
AS c_0, g_0.b AS c_1 FROM test_tables_pg.test_a AS g_0 WHERE (FALSE = TRUE) AND (g_0.a =
1) LIMIT 100
AccessNode(3) output=[test_tables_pg2.test_a.a, test_tables_pg2.test_a.b] SELECT
g_0.a AS c_0, g_0.b AS c_1 FROM test_tables_pg2.test_a AS g_0 WHERE g_0.a = 1 LIMIT 100
{code}
what is incorrect as the
{code:xml}
SELECT g_0.a AS c_0, g_0.b AS c_1 FROM test_tables_pg.test_a AS g_0 WHERE (FALSE = TRUE)
AND (g_0.a = 1) LIMIT 100
{code}
still has false condition and logically the query plan for the second query should be the
same as for the first one, though it doesn't have an influence on results.
I don't know whether it's a bug it not. That's why I created the issue with
the "Quality Risk" type.
Steps to reproduce:
1. In PostgreSQL create the test_a table using the following scripts:
{code:sql}
CREATE TABLE public.test_a
(
a integer,
b integer
);
INSERT INTO public.test_a(a, b) VALUES (1, 1);
INSERT INTO public.test_a(a, b) VALUES (1, 2);
INSERT INTO public.test_a(a, b) VALUES (2, 1);
INSERT INTO public.test_a(a, b) VALUES (2, 2);
INSERT INTO public.test_a(a, b) VALUES (3, 2);
INSERT INTO public.test_a(a, b) VALUES (3, 10);
{code}
2. Add PostgreSQL database configuration in standalone-teiid.xml:
{code:xml}
<datasource jndi-name="java:/test_tables_pg"
pool-name="test_tables_pg" enabled="true"
use-java-context="true">
<connection-url>jdbc:postgresql://localhost:5432/test?charSet=utf8</connection-url>
<driver-class>org.postgresql.Driver</driver-class>
<driver>org.postgresql</driver>
<pool>
<min-pool-size>2</min-pool-size>
<max-pool-size>70</max-pool-size>
<prefill>false</prefill>
<use-strict-min>false</use-strict-min>
<flush-strategy>FailingConnectionOnly</flush-strategy>
</pool>
<security>
<user-name>XXXXX</user-name>
<password>XXXXX</password>
</security>
<validation>
<check-valid-connection-sql>select
0</check-valid-connection-sql>
</validation>
<timeout>
<blocking-timeout-millis>120000</blocking-timeout-millis>
<idle-timeout-minutes>5</idle-timeout-minutes>
</timeout>
</datasource>
<datasource jndi-name="java:/test_tables_pg2"
pool-name="test_tables_pg2" enabled="true"
use-java-context="true">
<connection-url>jdbc:postgresql://localhost:5432/test?charSet=utf8</connection-url>
<driver-class>org.postgresql.Driver</driver-class>
<driver>org.postgresql</driver>
<pool>
<min-pool-size>2</min-pool-size>
<max-pool-size>70</max-pool-size>
<prefill>false</prefill>
<use-strict-min>false</use-strict-min>
<flush-strategy>FailingConnectionOnly</flush-strategy>
</pool>
<security>
<user-name>XXXXX</user-name>
<password>XXXXX</password>
</security>
<validation>
<check-valid-connection-sql>select
0</check-valid-connection-sql>
</validation>
<timeout>
<blocking-timeout-millis>120000</blocking-timeout-millis>
<idle-timeout-minutes>5</idle-timeout-minutes>
</timeout>
</datasource>
{code}
3. Add in test-vdb.xml java:/test_tables_pg configured in the previous step as a data
source:
{code:xml}
<model name="test_tables_pg">
<property name="importer.useFullSchemaName"
value="false"/>
<property name="importer.tableTypes"
value="TABLE,VIEW"/>
<property name="importer.importKeys" value="false"/>
<source name="test_tables_pg" translator-name="myPg"
connection-jndi-name="java:/test_tables_pg"/>
</model>
<model name="test_tables_pg2">
<property name="importer.useFullSchemaName"
value="false"/>
<property name="importer.tableTypes"
value="TABLE,VIEW"/>
<property name="importer.importKeys" value="false"/>
<source name="test_tables_pg2" translator-name="myPg"
connection-jndi-name="java:/test_tables_pg2"/>
</model>
{code}
4. add the "role1" configuration:
{code:xml}
<data-role name="role1" any-authenticated="true"
allow-create-temporary-tables="true">
<description>Allow read only</description>
<permission>
<resource-name>test_tables_pg2.test_a</resource-name>
<allow-read>true</allow-read>
<allow-execute>true</allow-execute>
</permission>
<permission>
<resource-name>test_tables_pg.test_a</resource-name>
<allow-read>true</allow-read>
<allow-execute>true</allow-execute>
</permission>
</data-role>
{code}
5. run the following queries and compare their query plans:
{code:sql}
-- QUERY 1
select * from "test_tables_pg.test_a" where hasRole('report-role')
union all
select * from "test_tables_pg2.test_a" where hasRole('role1') ;;
select hasRole('data','report-role') ;;
select hasRole('role1') ;;
-- QUERY 2 (adds a where condition)
select * from (
select * from "test_tables_pg.test_a" where hasRole('report-role')
union all
select * from "test_tables_pg2.test_a" where hasRole('role1')) a
where a = 1 ;;
{code}