[JBoss JIRA] (TEIID-3580) Hive 0.13.1 JDBC jars makes queries run slow
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-3580?page=com.atlassian.jira.plugin... ]
Steven Hawkins commented on TEIID-3580:
---------------------------------------
Just to make sure it's the same comparison, is the webapp running "SELECT g_0.code, g_0.description, g_0.total_emp, g_0.salary FROM HDP1.sample_07 AS g_0" using the Hive Statement.executeQuery(String) method?
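For a like-for-like baseline, the same statement can be timed directly against Hive through a plain (non-prepared) Statement. The following is only a minimal sketch: the class name HiveQueryTimer, the command-line arguments, and the default SQL text (which assumes the table is reachable as default.sample_07 with the four columns named in the plan) are illustrative assumptions, not part of the issue.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class HiveQueryTimer {

    /** Row count and wall-clock time for one query execution. */
    static final class Timing {
        final long rows;
        final long elapsedMillis;

        Timing(long rows, long elapsedMillis) {
            this.rows = rows;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Executes sql with a plain (non-prepared) Statement and reads every row. */
    static Timing timeQuery(Connection conn, String sql) throws SQLException {
        long start = System.nanoTime();
        long rows = 0;
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                rows++;
            }
        }
        // Statement and result set are closed before the clock is stopped.
        return new Timing(rows, (System.nanoTime() - start) / 1_000_000L);
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: HiveQueryTimer <jdbc-url> [sql]");
            return;
        }
        // Assumption: the table is visible as default.sample_07 on the Hive side.
        String sql = args.length > 1 ? args[1]
                : "SELECT code, description, total_emp, salary FROM default.sample_07";
        // Driver class as named in the issue's driver snippet.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(args[0])) {
            Timing t = timeQuery(conn, sql);
            System.out.println(t.rows + " rows in " + t.elapsedMillis + " ms");
        }
    }
}
```

Running it with the Hive 0.13.1 JDBC jars on the classpath and the same JDBC URL the datasource uses should show whether the 20-30 seconds is spent in the driver itself or somewhere above it.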
> Hive 0.13.1 JDBC jars makes queries run slow
> --------------------------------------------
>
> Key: TEIID-3580
> URL: https://issues.jboss.org/browse/TEIID-3580
> Project: Teiid
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: 8.7.3
> Reporter: Debbie Steigner
> Assignee: Steven Hawkins
> Attachments: threaddump-1437508266001.tdump
>
>
> When using the JDBC jars for Hive 0.13.1 running on HDP 2.1, queries executed against table 'default.sample_07' take approximately 20-30 seconds to return.
> The Hive JDBC jars for version 0.13.1 can be found here:
> https://github.com/vchintal/hive-jdbc-jars-archive
> Alternatively, a ready-to-go module can be downloaded from here for testing:
> https://drive.google.com/file/d/0BxJhoZ1V34QHSmgzTlBRVktZaGM/
> Use the following driver snippet when using the above-mentioned module:
> <driver name="hive" module="org.apache.hadoop.hive:0.13.1">
>     <driver-class>org.apache.hive.jdbc.HiveDriver</driver-class>
> </driver>
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
9 years, 5 months
[JBoss JIRA] (TEIID-3580) Hive 0.13.1 JDBC jars makes queries run slow
by Vijay Bhaskar Chintalapati (JIRA)
[ https://issues.jboss.org/browse/TEIID-3580?page=com.atlassian.jira.plugin... ]
Vijay Bhaskar Chintalapati commented on TEIID-3580:
---------------------------------------------------
Yes, it's the same data source, used by both the web app and Teiid. The query is 'select * from HDP.sample_07'. HDP here is the source model I am exposing in the VDB, so there is no difference in the query plan. Below is part of the server log:
EXECUTING CalculateCost
AFTER:
Access(groups=[HDP1.sample_07], props={SOURCE_HINT=null, MODEL_ID=Schema name=HDP1, nameInSource=null, uuid=mmuuid:004c32d9-8675-49cb-88ab-9d3635e08885, OUTPUT_COLS=[HDP1.sample_07.code, HDP1.sample_07.description, HDP1.sample_07.total_emp, HDP1.sample_07.salary], EST_CARDINALITY=-1.0})
Project(groups=[HDP1.sample_07], props={PROJECT_COLS=[HDP1.sample_07.code, HDP1.sample_07.description, HDP1.sample_07.total_emp, HDP1.sample_07.salary], OUTPUT_COLS=[HDP1.sample_07.code, HDP1.sample_07.description, HDP1.sample_07.total_emp, HDP1.sample_07.salary], EST_CARDINALITY=-1.0})
Source(groups=[HDP1.sample_07], props={OUTPUT_COLS=[HDP1.sample_07.code, HDP1.sample_07.description, HDP1.sample_07.total_emp, HDP1.sample_07.salary], EST_COL_STATS={HDP1.sample_07.code=[-1.0, -1.0], HDP1.sample_07.description=[-1.0, -1.0], HDP1.sample_07.total_emp=[-1.0, -1.0], HDP1.sample_07.salary=[-1.0, -1.0]}, EST_CARDINALITY=-1.0})
============================================================================
EXECUTING PlanSorts
AFTER:
Access(groups=[HDP1.sample_07])
Project(groups=[HDP1.sample_07])
Source(groups=[HDP1.sample_07])
============================================================================
EXECUTING CollapseSource
AFTER:
Access(groups=[HDP1.sample_07], props={SOURCE_HINT=null, MODEL_ID=Schema name=HDP1, nameInSource=null, uuid=mmuuid:004c32d9-8675-49cb-88ab-9d3635e08885, OUTPUT_COLS=[HDP1.sample_07.code, HDP1.sample_07.description, HDP1.sample_07.total_emp, HDP1.sample_07.salary], EST_CARDINALITY=-1.0, ATOMIC_REQUEST=SELECT HDP1.sample_07.code, HDP1.sample_07.description, HDP1.sample_07.total_emp, HDP1.sample_07.salary FROM HDP1.sample_07})
============================================================================
CONVERTING PLAN TREE TO PROCESS TREE
PROCESS PLAN =
AccessNode(0) output=[HDP1.sample_07.code, HDP1.sample_07.description, HDP1.sample_07.total_emp, HDP1.sample_07.salary] SELECT g_0.code, g_0.description, g_0.total_emp, g_0.salary FROM HDP1.sample_07 AS g_0
============================================================================
----------------------------------------------------------------------------
OPTIMIZATION COMPLETE:
PROCESSOR PLAN:
AccessNode(0) output=[HDP1.sample_07.code, HDP1.sample_07.description, HDP1.sample_07.total_emp, HDP1.sample_07.salary] SELECT g_0.code, g_0.description, g_0.total_emp, g_0.salary FROM HDP1.sample_07 AS g_0
============================================================================
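To confirm what the plan pushes to the source, the statement can be read off the AccessNode line of a debug log like the one above. The sketch below does that with a regular expression; the class name PushdownExtractor and the assumed line shape (AccessNode(n) output=[...] followed by the SQL) are taken from this log excerpt only and may not hold for other plans or Teiid versions.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PushdownExtractor {

    // Matches lines shaped like: AccessNode(0) output=[col, col, ...] SELECT ...
    private static final Pattern ACCESS_NODE =
            Pattern.compile("AccessNode\\(\\d+\\)\\s+output=\\[[^\\]]*\\]\\s+(.+)");

    /** Returns the source query of the first AccessNode line, if one is present. */
    static Optional<String> firstPushedQuery(String planLog) {
        for (String line : planLog.split("\\R")) {
            Matcher m = ACCESS_NODE.matcher(line.trim());
            if (m.matches()) {
                return Optional.of(m.group(1).trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Sample input copied from the PROCESSOR PLAN section of the log.
        String log = String.join("\n",
                "PROCESSOR PLAN:",
                "AccessNode(0) output=[HDP1.sample_07.code, HDP1.sample_07.description, "
                        + "HDP1.sample_07.total_emp, HDP1.sample_07.salary] "
                        + "SELECT g_0.code, g_0.description, g_0.total_emp, g_0.salary "
                        + "FROM HDP1.sample_07 AS g_0",
                "============");
        System.out.println(firstPushedQuery(log).orElse("(no AccessNode found)"));
    }
}
```

For the log above this yields the single SELECT over HDP1.sample_07, which is consistent with the whole query being pushed down in one access node.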
[JBoss JIRA] (TEIID-3580) Hive 0.13.1 JDBC jars makes queries run slow
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-3580?page=com.atlassian.jira.plugin... ]
Steven Hawkins commented on TEIID-3580:
---------------------------------------
Is this the same EAP / datasource you were using for your web app? The thread dump just confirms that it's a normal (non-prepared) query execution from the perspective of Teiid. What is the query you are executing? And, just to make sure, have you confirmed that the query plan is pushing what you expect to the source?
[JBoss JIRA] (TEIID-3580) Hive 0.13.1 JDBC jars makes queries run slow
by Vijay Bhaskar Chintalapati (JIRA)
[ https://issues.jboss.org/browse/TEIID-3580?page=com.atlassian.jira.plugin... ]
Vijay Bhaskar Chintalapati updated TEIID-3580:
----------------------------------------------
Attachment: threaddump-1437508266001.tdump
Attached is a thread dump of the JBoss Data Virtualization runtime taken midway through the query execution. The query took 20 seconds, and the thread dump was taken at around the 10-second mark, while the query was still executing.