[JBoss JIRA] (TEIID-4766) Improve Inner Join performance queries when using MySQL 5 Translator
by Pedro Inácio (JIRA)
[ https://issues.jboss.org/browse/TEIID-4766?page=com.atlassian.jira.plugin... ]
Pedro Inácio updated TEIID-4766:
--------------------------------
Description:
To make the performance problem easier to understand, the problem and a possible enhancement are described through a real example.
Two tables are defined in the vdb:
* vodafone_nl
* numbering_plan
with 1155 and 1,473,213 rows respectively.
Each of these tables is also externally materialized in MySQL, in the tables:
* vodafone_nl_cache
* numbering_plan_cache
The vodafone_nl table specification:
{code:sql}
CREATE VIEW vodafone_nl (
mcc varchar(5),
mnc varchar(5),
...
INDEX (mcc,mnc)
)
...
{code}
The numbering_plan table specification:
{code:sql}
CREATE TABLE numbering_plan (
mobile_country_code varchar(5),
mobile_network_code varchar(5),
...
INDEX (mobile_country_code,mobile_network_code)
)
...
{code}
The vodafone_nl_cache table specification:
{code:sql}
CREATE TABLE vodafone_nl_cache (
mcc varchar(5),
mnc varchar(5),
...
INDEX (mcc,mnc)
)
...
{code}
The numbering_plan_cache table specification:
{code:sql}
CREATE TABLE numbering_plan_cache (
mobile_country_code varchar(5),
mobile_network_code varchar(5),
...
INDEX (mobile_country_code,mobile_network_code)
)
...
{code}
When the following query is executed from a client:
{code:sql}
SELECT COUNT(*)
FROM
VodafoneNl.vodafone_nl AS vnl
LEFT JOIN NumberingPlan.numbering_plan AS np
ON (np.mobile_country_code = vnl.mcc) AND (np.mobile_network_code = vnl.mnc)
{code}
Teiid Server transforms it into the following query:
{code:sql}
SELECT COUNT(*) AS c_0
FROM `mnom`.`vodafone_nl_cache` AS g_0
LEFT OUTER JOIN (SELECT g_1.`mobile_country_code` AS c_0, g_1.`mobile_network_code` AS c_1
FROM `mnom`.`numbering_plan_cache` AS g_1) AS v_0
ON v_0.c_0 = g_0.`mcc` AND v_0.c_1 = g_0.`mnc`
LIMIT 200
{code}
This query takes 22 seconds on our system.
Running EXPLAIN on it in MySQL Workbench shows the following:
(please refer to TeiidQueryExplainPlan.png image)
There are two Full Index Scans, one returning 1155 rows and a second returning 1452482 rows, followed by a Non-Unique Key Lookup.
If the original query is run directly in MySQL against the cache tables, the system takes only 0.984 seconds to respond.
{code:sql}
SELECT COUNT(*)
FROM
vodafone_nl_cache AS vnl
LEFT JOIN numbering_plan_cache AS np
ON (np.mobile_country_code = vnl.mcc) AND (np.mobile_network_code = vnl.mnc)
{code}
Running EXPLAIN on this query in MySQL Workbench shows the following:
(please refer to MySqlQueryExplainPlan.png image)
There is one Full Index Scan, returning 1155 rows followed by a Non-Unique Key Lookup.
The difference between the two queries is about 21 seconds.
So the way Teiid Server converts an Inner Join for MySQL needs to be improved to boost performance.
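As a rough illustration of the enhancement being requested, the sketch below checks that the generated query with the inline view and a flattened join against the base table return the same count. It is only a sketch under stated assumptions: it uses Python's built-in SQLite rather than MySQL, the sample rows and index names are made up, and the flattened query text is a hypothetical rewrite, not output produced by Teiid. It shows semantic equivalence only and does not reproduce the timings reported above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption: illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vodafone_nl_cache (mcc VARCHAR(5), mnc VARCHAR(5));
CREATE INDEX vnl_idx ON vodafone_nl_cache (mcc, mnc);
CREATE TABLE numbering_plan_cache (
    mobile_country_code VARCHAR(5),
    mobile_network_code VARCHAR(5)
);
CREATE INDEX np_idx
    ON numbering_plan_cache (mobile_country_code, mobile_network_code);
""")

# Made-up sample rows: two matching networks (one with two plan rows)
# and one network with no match at all.
conn.executemany(
    "INSERT INTO vodafone_nl_cache VALUES (?, ?)",
    [("204", "04"), ("204", "08"), ("999", "99")],
)
conn.executemany(
    "INSERT INTO numbering_plan_cache VALUES (?, ?)",
    [("204", "04"), ("204", "04"), ("204", "08"), ("310", "26")],
)

# Shape of the query Teiid generates: the right side is an inline view.
NESTED = """
SELECT COUNT(*) AS c_0
FROM vodafone_nl_cache AS g_0
LEFT OUTER JOIN (SELECT g_1.mobile_country_code AS c_0,
                        g_1.mobile_network_code AS c_1
                 FROM numbering_plan_cache AS g_1) AS v_0
  ON v_0.c_0 = g_0.mcc AND v_0.c_1 = g_0.mnc
"""

# Hypothetical flattened form: join the base table directly.
FLAT = """
SELECT COUNT(*) AS c_0
FROM vodafone_nl_cache AS g_0
LEFT OUTER JOIN numbering_plan_cache AS g_1
  ON g_1.mobile_country_code = g_0.mcc
 AND g_1.mobile_network_code = g_0.mnc
"""

nested_count = conn.execute(NESTED).fetchone()[0]
flat_count = conn.execute(FLAT).fetchone()[0]
print(nested_count, flat_count)
```

Because the inline view only renames columns, removing it does not change the result; whether the target database can then use the index on the joined table is what the EXPLAIN plans above are about.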
> Improve Inner Join performance queries when using MySQL 5 Translator
> --------------------------------------------------------------------
>
> Key: TEIID-4766
> URL: https://issues.jboss.org/browse/TEIID-4766
> Project: Teiid
> Issue Type: Enhancement
> Affects Versions: 9.1.2
> Environment: * MySql 5.6.35
> * CentOs 7
> * Teiid 9.1.2
> Reporter: Pedro Inácio
> Assignee: Steven Hawkins
>
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
[JBoss JIRA] (TEIID-4766) Improve Inner Join performance queries when using MySQL 5 Translator
by Pedro Inácio (JIRA)
Pedro Inácio created TEIID-4766:
-----------------------------------
Summary: Improve Inner Join performance queries when using MySQL 5 Translator
Key: TEIID-4766
URL: https://issues.jboss.org/browse/TEIID-4766
Project: Teiid
Issue Type: Enhancement
Affects Versions: 9.1.2
Environment: * MySql 5.6.35
* CentOs 7
* Teiid 9.1.2
Reporter: Pedro Inácio
Assignee: Steven Hawkins
[JBoss JIRA] (TEIID-4398) Write a utility to convert a .VDB with Index file into -vdb.xml or DDL format
by Bram Gadeyne (JIRA)
[ https://issues.jboss.org/browse/TEIID-4398?page=com.atlassian.jira.plugin... ]
Bram Gadeyne commented on TEIID-4398:
-------------------------------------
Hi,
Is it possible that this utility is not yet present in the CR1 release of Teiid 9.2? I cannot find it in the bin directory.
> Write a utility to convert a .VDB with Index file into -vdb.xml or DDL format
> -----------------------------------------------------------------------------
>
> Key: TEIID-4398
> URL: https://issues.jboss.org/browse/TEIID-4398
> Project: Teiid
> Issue Type: Task
> Components: Build/Kits
> Reporter: Ramesh Reddy
> Assignee: Ramesh Reddy
> Labels: CR1
> Fix For: 9.2
>
>
> Write a command line utility, provided in the "bin" directory, to convert a Designer-based .vdb file with index metadata into a -vdb.xml file and/or the newer DDL format.
> This can be used to migrate older VDBs much more easily to the newer formats without the use of Designer tooling.
[JBoss JIRA] (TEIID-4619) left join returns wrong results
by Bram Gadeyne (JIRA)
[ https://issues.jboss.org/browse/TEIID-4619?page=com.atlassian.jira.plugin... ]
Bram Gadeyne commented on TEIID-4619:
-------------------------------------
Hi Steven,
I have good news. I can confirm that this issue is resolved in 9.2.0 CR1.
> left join returns wrong results
> -------------------------------
>
> Key: TEIID-4619
> URL: https://issues.jboss.org/browse/TEIID-4619
> Project: Teiid
> Issue Type: Bug
> Affects Versions: 9.0.4, 9.0.5
> Reporter: Bram Gadeyne
> Assignee: Steven Hawkins
> Priority: Critical
> Fix For: 9.2.1
>
> Attachments: correct_result.txt, enclosed_queryplan.txt, query1_enclosed_plan.txt, query1_plan.txt, query2_plan.txt, teiid_reduced_case.txt, wrong_result.txt
>
>
> I have the following situation.
> I have a temporary table #tmp_admissions that contains 8047 rows.
> In this first query there are 66290 results. However if I only look at the lines for infectionid 880 then there are only 16 lines.
> {code:sql}
> select l.infectionid, l.id as linkid, lc.linkcultureid, lc.responsibleculture, lc.culturealternative, cl.sampleinsertts, cl.specimennumber,
> cl.culturenumber, cl.culturename, cl.quotation, ls.material, ls.sampletime,
> abr.culturenumber as abgram_culturenumber,abr.antibiogrampart, abr.resisttype,
> lc.antibiogramculturenr,lc.antibiogramspecimennr,lc.antibiogramsampleinsertts
> from #tmp_admissions adm
> join cos2_links l on l.admissionid = cast(adm.patientid as string)
> join cos2_link_culture lc on lc.linkid = l.id
> left join cos2_lab_culture cl on cl.culturenumber = lc.culturenr and cl.specimennumber = lc.culturespecimennr and cl.sampleinsertts = lc.culturesampleinsertts
> left join cos2_lab_sample ls on ls.inserttime = cl.sampleinsertts and ls.specimennumber = cl.specimennumber
> left join cos2_antibiogramresistences abr on abr.specimennumber = cl.specimennumber and abr.culturenumber = cl.culturenumber and abr.sampleinsertts = cl.sampleinsertts
> {code}
> This query does almost the same but returns 30 rows (and is correct).
> {code:sql}
> select l.infectionid, l.id as linkid, lc.linkcultureid, lc.responsibleculture, lc.culturealternative, cl.sampleinsertts, cl.specimennumber,
> cl.culturenumber, cl.culturename, cl.quotation, ls.material, ls.sampletime,
> abr.culturenumber as abgram_culturenumber,abr.antibiogrampart, abr.resisttype,
> lc.antibiogramculturenr,lc.antibiogramspecimennr,lc.antibiogramsampleinsertts
> from cos2_links l
> join cos2_link_culture lc on lc.linkid = l.id
> left join cos2_lab_culture cl on cl.culturenumber = lc.culturenr and cl.specimennumber = lc.culturespecimennr and cl.sampleinsertts = lc.culturesampleinsertts
> left join cos2_lab_sample ls on ls.inserttime = cl.sampleinsertts and ls.specimennumber = cl.specimennumber
> left join cos2_antibiogramresistences abr on abr.specimennumber = cl.specimennumber and abr.culturenumber = cl.culturenumber and abr.sampleinsertts = cl.sampleinsertts
> where l.infectionid = 880
> {code}
> cos2_link_culture contains 2 rows for this infectionid. The left joins should produce 15 rows for each of these two rows. However, in the first query the left join results for the first row are null and, to my understanding, ignored. I'll attach the query plans for both queries.
> I should note that there is a one-to-many relation between infection and admission, so the infectionid is for the same admission.
> Strangely enough, if you enclose the first query in a group by query and count the rows, it does indeed return 2 times 15 for the specific groups (see enclosed_queryplan.txt).
[JBoss JIRA] (TEIID-4764) When using importer.tableTypes, Teiid will not create the status table
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-4764?page=com.atlassian.jira.plugin... ]
Steven Hawkins commented on TEIID-4764:
---------------------------------------
I'm still confused. I still cannot see how this relates to tableTypes. Can you say what sources are involved, or provide a complete vdb?
On startup we'll pull/load the metadata for each source. If a status table is referenced from one schema to another, it will exist as long as the schema containing the status table is defined first in the vdb.
> When using importer.tableTypes, Teiid will not create the status table
> ----------------------------------------------------------------------
>
> Key: TEIID-4764
> URL: https://issues.jboss.org/browse/TEIID-4764
> Project: Teiid
> Issue Type: Bug
> Components: Server
> Affects Versions: 8.12.9.6_3
> Reporter: Van Halbert
> Assignee: Steven Hawkins
>
> When using importer.tableTypes, Teiid will not create the status table.
[JBoss JIRA] (TEIID-3624) No way to associate enterprise data types in dynamic vdb constructs
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-3624?page=com.atlassian.jira.plugin... ]
Steven Hawkins commented on TEIID-3624:
---------------------------------------
The initial work will add database-level domains. This is similar to H2 and our existing handling; adding schema scoping will be a much larger undertaking. Also, the initial commit will not allow domains to be used in cast/convert, but that should come later. Even with the types being database-wide, we still have a resolving issue: TEIID-4765.
> No way to associate enterprise data types in dynamic vdb constructs
> -------------------------------------------------------------------
>
> Key: TEIID-3624
> URL: https://issues.jboss.org/browse/TEIID-3624
> Project: Teiid
> Issue Type: Enhancement
> Components: Query Engine
> Reporter: Van Halbert
> Assignee: Steven Hawkins
> Fix For: 9.3
>
>
> I can't find any docs on how to define enterprise data types in a dynamic vdb.
[JBoss JIRA] (TEIID-4765) The notion of resolving order may need to be extended beyond a schema
by Steven Hawkins (JIRA)
Steven Hawkins created TEIID-4765:
-------------------------------------
Summary: The notion of resolving order may need to be extended beyond a schema
Key: TEIID-4765
URL: https://issues.jboss.org/browse/TEIID-4765
Project: Teiid
Issue Type: Quality Risk
Components: Query Engine
Reporter: Steven Hawkins
Assignee: Steven Hawkins
Fix For: 9.3
With TEIID-4629 it is possible to interleave the definition of schema elements, which is not accounted for by the loading order - and will fail upon later validation/resolving.
create virtual schema first;
create virtual schema second;
set schema second;
create view x ...;
set schema first;
create view y as select * from x;
The last statement will fail later, as we'll try to resolve the schemas in order.
A similar issue exists with create domain: TEIID-3624.
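The ordering problem can be sketched in a few lines. The following is a minimal illustration, not Teiid's resolver: the `views` mapping, the `schema_order` list and both helper functions are hypothetical names invented for this example. It models the two views from the DDL above, shows that resolving schema by schema in declaration order fails for `y`, and shows that ordering by dependency across schemas (one possible way to extend the notion of resolving order) does not.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical model of the DDL above:
# (schema, view) -> set of (schema, view) that it selects from.
views = {
    ("second", "x"): set(),
    ("first", "y"): {("second", "x")},
}

# Schemas in the order they were created.
schema_order = ["first", "second"]


def resolve_by_schema(views, schema_order):
    """Resolve one schema at a time; return the views that could not be resolved."""
    resolved, failures = set(), []
    for schema in schema_order:
        for key, deps in views.items():
            if key[0] != schema:
                continue
            if deps <= resolved:
                resolved.add(key)
            else:
                # A dependency lives in a schema that has not been resolved yet.
                failures.append(key)
    return failures


def resolve_by_dependency(views):
    """Order views across all schemas so that dependencies come first."""
    return list(TopologicalSorter(views).static_order())


failures = resolve_by_schema(views, schema_order)
order = resolve_by_dependency(views)
print("schema-order failures:", failures)
print("dependency order:", order)
```

With the sample data, schema-order resolution reports `("first", "y")` as unresolved, while the dependency order places `("second", "x")` before it.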