[
https://issues.jboss.org/browse/TEIID-5783?page=com.atlassian.jira.plugin...
]
Dmitrii Pogorelov updated TEIID-5783:
-------------------------------------
Steps to Reproduce:
1. Add file resource adapter in standalone-teiid.xml:
{code:xml}
<resource-adapter id="file">
<module slot="main"
id="org.jboss.teiid.resource-adapter.file"/>
<connection-definitions>
<connection-definition
class-name="org.teiid.resource.adapter.file.FileManagedConnectionFactory"
jndi-name="java:/files" enabled="true"
pool-name="files">
<config-property name="AllowParentPaths">
true
</config-property>
<config-property name="ParentDirectory">
C:/testdata
</config-property>
</connection-definition>
</connection-definitions>
</resource-adapter>
{code}
2. Add MySQL database configuration in standalone-teiid.xml:
{code:xml}
<datasource jndi-name="java:/test_dwh_my"
pool-name="test_dwh_my" enabled="true"
use-java-context="true">
<connection-url>jdbc:mysql://localhost:3306/dwh?zeroDateTimeBehavior=convertToNull</connection-url>
<driver>mysql</driver>
<new-connection-sql>set SESSION sql_mode =
'ANSI'</new-connection-sql>
<pool>
<min-pool-size>2</min-pool-size>
<max-pool-size>70</max-pool-size>
</pool>
<security>
<user-name>root</user-name>
<password>XXXXXX</password>
</security>
<validation>
<valid-connection-checker
class-name="org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLValidConnectionChecker"/>
<exception-sorter
class-name="org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLExceptionSorter"/>
</validation>
<timeout>
<blocking-timeout-millis>120000</blocking-timeout-millis>
<idle-timeout-minutes>5</idle-timeout-minutes>
</timeout>
</datasource>
{code}
3. Add in test-vdb.xml java:/test_dwh_my configured in previous step as datasource:
{code:xml}
<model name="dwh_my">
<property name="importer.useFullSchemaName"
value="false"/>
<property name="importer.tableTypes"
value="TABLE,VIEW"/>
<property name="importer.importKeys" value="false"/>
<source name="test_dwh_my" translator-name="mylobs"
connection-jndi-name="java:/test_dwh_my"/>
</model>
<translator name="mylobs" type="mysql5">
<property name="CopyLobs" value="true" />
<property name="SupportsNativeQueries" value="true"/>
</translator>
{code}
4. Configure in the test-vdb.xml the following virtual view:
{code:xml}
<model visible = "true" type = "VIRTUAL" name =
"views">
<metadata type = "DDL"><![CDATA[
create view vLeft as
SELECT
"csv_table"."created_at",
"csv_table"."order_id",
"csv_table"."store_id"
FROM
(call files.getFiles('test1.csv')) f,
TEXTTABLE(to_chars(f.file,'UTF-8')
COLUMNS
"created_at" STRING ,
"order_id" STRING ,
"store_id" STRING
DELIMITER ';'
QUOTE '"'
HEADER 1
)
"csv_table"
]]>
</metadata>
</model>
{code}
5. Load the data from the other file (test2.csv - attached to this issue) into a MySQL
table or a MS SQL table (vRight table name in my case, all fields have varchar types).
6. Run the statement below (rebuy_check column contains no NULLs, that's correct):
{code:sql}
select
count( a."order_id") anzahl_orders,
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) rebuy_check
FROM views.vLeft a
left JOIN dwh_my.vRight b on b.order_id=a.order_id
group by
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) ;;
{code}
7. Add a DISTINCT to the COUNT (rebuy_check column now contains NULLs, that's
incorrect):
{code:sql}
select
count(distinct a."order_id") anzahl_orders,
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) rebuy_check
FROM views.vLeft a
left JOIN dwh_my.vRight b on b.order_id=a.order_id
group by
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) ;;
{code}
8. Remove the COUNT entirely (rebuy_check column now contains NULLs, that's
incorrect):
{code:sql}
select
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) rebuy_check
FROM views.vLeft a
left JOIN dwh_my.vRight b on b.order_id=a.order_id
group by
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) ;;
{code}
was:
1. Add file resource adapter in standalone-teiid.xml:
{code:xml}
<resource-adapter id="file">
<module slot="main"
id="org.jboss.teiid.resource-adapter.file"/>
<connection-definitions>
<connection-definition
class-name="org.teiid.resource.adapter.file.FileManagedConnectionFactory"
jndi-name="java:/files" enabled="true"
pool-name="files">
<config-property name="AllowParentPaths">
true
</config-property>
<config-property name="ParentDirectory">
C:/testdata
</config-property>
</connection-definition>
</connection-definitions>
</resource-adapter>
{code}
2. Add MySQL database configuration in standalone-teiid.xml:
{code:xml}
<datasource jndi-name="java:/test_dwh_my"
pool-name="test_dwh_my" enabled="true"
use-java-context="true">
<connection-url>jdbc:mysql://localhost:3306/dwh?zeroDateTimeBehavior=convertToNull</connection-url>
<driver>mysql</driver>
<new-connection-sql>set SESSION sql_mode =
'ANSI'</new-connection-sql>
<pool>
<min-pool-size>2</min-pool-size>
<max-pool-size>70</max-pool-size>
</pool>
<security>
<user-name>root</user-name>
<password>XXXXXX</password>
</security>
<validation>
<valid-connection-checker
class-name="org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLValidConnectionChecker"/>
<exception-sorter
class-name="org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLExceptionSorter"/>
</validation>
<timeout>
<blocking-timeout-millis>120000</blocking-timeout-millis>
<idle-timeout-minutes>5</idle-timeout-minutes>
</timeout>
</datasource>
<datasource jndi-name="java:/test" pool-name="test"
enabled="true" use-java-context="true">
<connection-url>jdbc:mysql://localhost:3306/test_tables?zeroDateTimeBehavior=convertToNull</connection-url>
<driver>mysql</driver>
<new-connection-sql>set SESSION sql_mode =
'ANSI'</new-connection-sql>
<pool>
<min-pool-size>2</min-pool-size>
<max-pool-size>70</max-pool-size>
</pool>
<security>
<user-name>root</user-name>
<password>XXXXXX</password>
</security>
<validation>
<valid-connection-checker
class-name="org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLValidConnectionChecker"/>
<exception-sorter
class-name="org.jboss.jca.adapters.jdbc.extensions.mysql.MySQLExceptionSorter"/>
</validation>
<timeout>
<blocking-timeout-millis>120000</blocking-timeout-millis>
<idle-timeout-minutes>5</idle-timeout-minutes>
</timeout>
</datasource>
{code}
3. Add in test-vdb.xml java:/test_dwh_my configured in previous step as datasource:
{code:xml}
<model name="dwh_my">
<property name="importer.useFullSchemaName"
value="false"/>
<property name="importer.tableTypes"
value="TABLE,VIEW"/>
<property name="importer.importKeys" value="false"/>
<source name="test_dwh_my" translator-name="mylobs"
connection-jndi-name="java:/test_dwh_my"/>
</model>
<translator name="mylobs" type="mysql5">
<property name="CopyLobs" value="true" />
<property name="SupportsNativeQueries" value="true"/>
</translator>
{code}
4. Configure in the test-vdb.xml the following virtual view:
{code:xml}
<model visible = "true" type = "VIRTUAL" name =
"views">
<metadata type = "DDL"><![CDATA[
create view vLeft as
SELECT
"csv_table"."created_at",
"csv_table"."order_id",
"csv_table"."store_id"
FROM
(call files.getFiles('test1.csv')) f,
TEXTTABLE(to_chars(f.file,'UTF-8')
COLUMNS
"created_at" STRING ,
"order_id" STRING ,
"store_id" STRING
DELIMITER ';'
QUOTE '"'
HEADER 1
)
"csv_table"
]]>
</metadata>
</model>
{code}
5. Load the data from the other file (test2.csv - attached to this issue) into a MySQL
table or a MS SQL table (vRight table name in my case, all fields have varchar types).
6. Run the statement below (rebuy_check column contains no NULLs, that's correct):
{code:sql}
select
count( a."order_id") anzahl_orders,
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) rebuy_check
FROM views.vLeft a
left JOIN dwh_my.vRight b on b.order_id=a.order_id
group by
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) ;;
{code}
7. Add a DISTINCT to the COUNT (rebuy_check column now contains NULLs, that's
incorrect):
{code:sql}
select
count(distinct a."order_id") anzahl_orders,
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) rebuy_check
FROM views.vLeft a
left JOIN dwh_my.vRight b on b.order_id=a.order_id
group by
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) ;;
{code}
8. Remove the COUNT entirely (rebuy_check column now contains NULLs, that's
incorrect):
{code:sql}
select
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) rebuy_check
FROM views.vLeft a
left JOIN dwh_my.vRight b on b.order_id=a.order_id
group by
(case when b."order_id" is not null then 'Direct Rebuy' else
'Standard' end) ;;
{code}
OUTER JOIN Yields Wrong Results When Used With GROUP BY, CASE and
COUNT DISTINCT
--------------------------------------------------------------------------------
Key: TEIID-5783
URL:
https://issues.jboss.org/browse/TEIID-5783
Project: Teiid
Issue Type: Bug
Components: Query Engine
Affects Versions: 12.0
Environment: teiid-12.0.0 on WildFly Full 14.0.1.Final (WildFly Core
6.0.2.Final)
Reporter: Dmitrii Pogorelov
Assignee: Steven Hawkins
Priority: Blocker
Attachments: test1.csv, test2.csv
Some queries that involve an OUTER JOIN between objects from different data sources, a
GROUP BY with a CASE statement and a COUNT DISTINCT yield inconsistent results on a SELECT
expression which is identical to the GROUP BY expression with CASE.
* Reproduced in combination of <File> LEFT JOIN <MySQL> and <File> LEFT
JOIN <MS SQL> and the same for the other OUTER JOIN variants.
* screenshot for the different queries are attached
* sample data for reproduction is attached
The same problem happens with LEFT OUTER JOIN, RIGHT OUTER JOIN and FULL OUTER JOIN.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)