... |
* Object Table - Evaluates OBJECTTABLE |
h23. Node Statistics |
Every node has a set of statistics that are output. These can be used to determine the amount of data flowing through the node. Before execution a processor plan will not contain node statistics. Also the statistics are updated as the plan is processed, so typically you'll want the final statistics after all rows have been processed by the client. |
... |
| Data Bytes Sent | The size of the serialized data result (row and lob values) sent to the client | bytes | |
h23. Reading a Processor Plan |
|
The processor plan can be obtained in a plain text or xml format. The plan text format is typically easier to read, while the xml format is easier to process by tooling. When possible tooling should be used to examine the plans as the tree structures can be deeply nested. Data flows from the leafs of the tree to the root. Sub plans for procedure execution can be shown inline, and are differentiated by different indentation. Given a user query of "SELECT pm1.g1.e1, pm1.g2.e2, pm1.g3.e3 from pm1.g1 inner join (pm1.g2 left outer join pm1.g3 on pm1.g2.e1=pm1.g3.e1) on pm1.g1.e1=pm1.g3.e1" the text for a processor plan that does not push down the joins would look like: |
{code} |
ProjectNode + Output Columns: 0: e1 (string) 1: e2 (integer) 2: e3 (boolean) + Cost Estimates:Estimated Node Cardinality: -1.0 + Child 0: JoinNode + Output Columns: 0: e1 (string) 1: e2 (integer) 2: e3 (boolean) + Cost Estimates:Estimated Node Cardinality: -1.0 + Child 0: JoinNode + Output Columns: 0: e1 (string) 1: e1 (string) 2: e3 (boolean) + Cost Estimates:Estimated Node Cardinality: -1.0 + Child 0: AccessNode + Output Columns:e1 (string) + Cost Estimates:Estimated Node Cardinality: -1.0 + Query:SELECT g_0.e1 AS c_0 FROM pm1.g1 AS g_0 ORDER BY c_0 + Model Name:pm1 + Child 1: AccessNode + Output Columns: 0: e1 (string) 1: e3 (boolean) + Cost Estimates:Estimated Node Cardinality: -1.0 + Query:SELECT g_0.e1 AS c_0, g_0.e3 AS c_1 FROM pm1.g3 AS g_0 ORDER BY c_0 + Model Name:pm1 + Join Strategy:MERGE JOIN (ALREADY_SORTED/ALREADY_SORTED) + Join Type:INNER JOIN + Join Criteria:pm1.g1.e1=pm1.g3.e1 + Child 1: AccessNode + Output Columns: 0: e1 (string) 1: e2 (integer) + Cost Estimates:Estimated Node Cardinality: -1.0 + Query:SELECT g_0.e1 AS c_0, g_0.e2 AS c_1 FROM pm1.g2 AS g_0 ORDER BY c_0 + Model Name:pm1 + Join Strategy:ENHANCED SORT JOIN (SORT/ALREADY_SORTED) + Join Type:INNER JOIN + Join Criteria:pm1.g3.e1=pm1.g2.e1 + Select Columns: 0: pm1.g1.e1 1: pm1.g2.e2 2: pm1.g3.e3 {code} Note that the nested join node is using a merge join and expects the source queries from each side to produce the expected ordering for the join. The parent join is an enhanced sort join which can delay the decision to perform sorting based upon the incoming rows. Note that the outer join from the user query has been modified to an inner join since none of the null inner values can be present in the query result. The same plan in xml form looks like: {code} <?xml version="1.0" encoding="UTF-8"?> <node name="ProjectNode"> <property name="Output Columns"> <value>e1 (string)</value> <value>e2 (integer)</value> <value>e3 (boolean)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Child 0"> <node name="JoinNode"> <property name="Output Columns"> <value>e1 (string)</value> <value>e2 (integer)</value> <value>e3 (boolean)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Child 0"> <node name="JoinNode"> <property name="Output Columns"> <value>e1 (string)</value> <value>e1 (string)</value> <value>e3 (boolean)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Child 0"> <node name="AccessNode"> <property name="Output Columns"> <value>e1 (string)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Query"> <value>SELECT g_0.e1 AS c_0 FROM pm1.g1 AS g_0 ORDER BY c_0</value> </property> <property name="Model Name"> <value>pm1</value> </property> </node> </property> <property name="Child 1"> <node name="AccessNode"> <property name="Output Columns"> <value>e1 (string)</value> <value>e3 (boolean)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Query"> <value>SELECT g_0.e1 AS c_0, g_0.e3 AS c_1 FROM pm1.g3 AS g_0 ORDER BY c_0</value> </property> <property name="Model Name"> <value>pm1</value> </property> </node> </property> <property name="Join Strategy"> <value>MERGE JOIN (ALREADY_SORTED/ALREADY_SORTED)</value> </property> <property name="Join Type"> <value>INNER JOIN</value> </property> <property name="Join Criteria"> <value>pm1.g1.e1=pm1.g3.e1</value> </property> </node> </property> <property name="Child 1"> <node name="AccessNode"> <property name="Output Columns"> <value>e1 (string)</value> <value>e2 (integer)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Query"> <value>SELECT g_0.e1 AS c_0, g_0.e2 AS c_1 FROM pm1.g2 AS g_0 ORDER BY c_0</value> </property> <property name="Model Name"> <value>pm1</value> </property> </node> </property> <property name="Join Strategy"> <value>ENHANCED SORT JOIN (SORT/ALREADY_SORTED)</value> </property> <property name="Join Type"> <value>INNER JOIN</value> </property> <property name="Join Criteria"> <value>pm1.g3.e1=pm1.g2.e1</value> </property> </node> </property> <property name="Select Columns"> <value>pm1.g1.e1</value> <value>pm1.g2.e2</value> <value>pm1.g3.e3</value> </property> </node> {code} Note that the same information appears in each of the plan forms. In some cases it can actually be easier to follow the simplified format of the debug plan final processor plan. From the [#Debug Log] the same plan as above would appear as: {code} OPTIMIZATION COMPLETE: PROCESSOR PLAN: |
ProjectNode(0) output=[pm1.g1.e1, pm1.g2.e2, pm1.g3.e3] [pm1.g1.e1, pm1.g2.e2, pm1.g3.e3] JoinNode(1) [ENHANCED SORT JOIN (SORT/ALREADY_SORTED)] [INNER JOIN] criteria=[pm1.g3.e1=pm1.g2.e1] output=[pm1.g1.e1, pm1.g2.e2, pm1.g3.e3] |
... |
XML document model queries and proecedure execution (including instead of triggers) use intermediate and final plan forms that include relational plans. Generally the structure of the xml/procedure plans will closely match their logical forms. It's the nested relational plans that will be of interest when analyzing performance issues. |
h1. Debug Plans Log |
A relational processing plan is created by the optimizer after the logical plan is manipulated by a series of rules. The application of rules is determined both by the query structure and by the rules themselves. The node structure of the debug plan resembles that of the processing plan, but the node types more logically represent SQL operations. |
h23. All Nodes |
* ACCESS - a source access or plan execution. |
... |
Note that the affect of grouping generates what is effectively an inline view, anon_grp0, to handle the projection of values created by the grouping. |
h23. Node Properties |
Each node has a set of applicable properties that are typically shown on the node. |
... |
** EST_SELECTIVITY - represents the selectivity of a criteria node |
h23. Rules |
Plan rule manipulate the plan tree, fire other rules, and drive the optimization process. The structure of the query determines the initial set of rules. Each rule is designed to perform a narrow set of tasks. Some rules can be run multiple times. Some rules require a specific set of precursors to run properly. |
... |
* Validate Where All - ensures criteria is used when required by the source |
h23. Reading a Debug Plan |
As each relational sub plan is optimized, the plan will show what is being optimized and it's canonical form: |
... |
{code} |
h23. XQuery |
XQuery is eligible for specific [optimizations|XQuery Optimization]. Document projection is the most common optimization. It will be shown in the debug plan as an annotation. For example with the user query containing "xmltable('/a/b' passing doc columns x string path '@x', val string path '/.')", the debug plan would show a tree of the document that will effectively be used by the context and path XQuerys: |
... |
When integrating information using a federated query planner it is useful to view the query plans to better understand how information is being accessed and processed, and to troubleshoot problems.
A query plan is a set of instructions created by a query engine for executing a command submitted by a user or application. The purpose of the query plan is to execute the user's query in as efficient a way as possible.
You can get a query plan any time you execute a command. The SQL options available are as follows:
SET SHOWPLAN [ON|DEBUG]- Returns the processing plan or the plan and the full planner debug log. See also the SET Statement.
With the above options, the query plan is available from the Statement object by casting to the org.teiid.jdbc.TeiidStatement interface or by using the "SHOW PLAN" statement.
statement.execute("set showplan on"); ResultSet rs = statement.executeQuery("select ..."); TeiidStatement tstatement = statement.unwrap(TeiidStatement.class); PlanNode queryPlan = tstatement.getPlanDescription(); System.out.println(queryPlan);
The query plan is made available automatically in several of Teiid's tools.
Once a query plan has been obtained you will most commonly be looking for:
All of the above information can be determined from the processing plan. You will typically be interested in analyzing the textual form of the final processing plan. To understand why particular decisions are made for debugging or support you will want to obtain the full debug log which will contain the intermediate planning steps as well as annotations as to why specific pushdown decisions are made.
A query plan consists of a set of nodes organized in a tree structure. If you are executing a procedure or generating an XML document from an XML Document Model, the overall query plan will contain additional information related the surrounding procedural execution.
In a procedural context the ordering of child nodes implies the order of execution. In most other situation, child nodes may be executed in any order even in parallel. Only in specific optimizations, such as dependent join, will the children of a join execute serially.
Relational plans represent the processing plan that is composed of nodes representing building blocks of logical relational operations. Relational processing plans differ from logical debug relational plans in that they will contain additional operations and execution specifics that were chosen by the optimizer.
The nodes for a relational query plan are:
Every node has a set of statistics that are output. These can be used to determine the amount of data flowing through the node. Before execution a processor plan will not contain node statistics. Also the statistics are updated as the plan is processed, so typically you'll want the final statistics after all rows have been processed by the client.
Statistic | Description | Units |
---|---|---|
Node Output Rows | Number of records output from the node | count |
Node Process Time | Time processing in this node only | millisec |
Node Cumulative Process Time | Elapsed time from beginning of processing to end | millisec |
Node Cumulative Next Batch Process Time | Time processing in this node + child nodes | millisec |
Node Next Batch Calls | Number of times a node was called for processing | count |
Node Blocks | Number of times a blocked exception was thrown by this node or a child | count |
In addition to node statistics, some nodes display cost estimates computed at the node.
Cost Estimates | Description | Units |
---|---|---|
Estimated Node Cardinality | Estimated number of records that will be output from the node; -1 if unknown | count |
The root node will display additional information.
Top level Statistics | Description | Units |
---|---|---|
Data Bytes Sent | The size of the serialized data result (row and lob values) sent to the client | bytes |
The processor plan can be obtained in a plain text or xml format. The plan text format is typically easier to read, while the xml format is easier to process by tooling. When possible tooling should be used to examine the plans as the tree structures can be deeply nested.
Data flows from the leafs of the tree to the root. Sub plans for procedure execution can be shown inline, and are differentiated by different indentation. Given a user query of "SELECT pm1.g1.e1, pm1.g2.e2, pm1.g3.e3 from pm1.g1 inner join (pm1.g2 left outer join pm1.g3 on pm1.g2.e1=pm1.g3.e1) on pm1.g1.e1=pm1.g3.e1" the text for a processor plan that does not push down the joins would look like:
ProjectNode + Output Columns: 0: e1 (string) 1: e2 (integer) 2: e3 (boolean) + Cost Estimates:Estimated Node Cardinality: -1.0 + Child 0: JoinNode + Output Columns: 0: e1 (string) 1: e2 (integer) 2: e3 (boolean) + Cost Estimates:Estimated Node Cardinality: -1.0 + Child 0: JoinNode + Output Columns: 0: e1 (string) 1: e1 (string) 2: e3 (boolean) + Cost Estimates:Estimated Node Cardinality: -1.0 + Child 0: AccessNode + Output Columns:e1 (string) + Cost Estimates:Estimated Node Cardinality: -1.0 + Query:SELECT g_0.e1 AS c_0 FROM pm1.g1 AS g_0 ORDER BY c_0 + Model Name:pm1 + Child 1: AccessNode + Output Columns: 0: e1 (string) 1: e3 (boolean) + Cost Estimates:Estimated Node Cardinality: -1.0 + Query:SELECT g_0.e1 AS c_0, g_0.e3 AS c_1 FROM pm1.g3 AS g_0 ORDER BY c_0 + Model Name:pm1 + Join Strategy:MERGE JOIN (ALREADY_SORTED/ALREADY_SORTED) + Join Type:INNER JOIN + Join Criteria:pm1.g1.e1=pm1.g3.e1 + Child 1: AccessNode + Output Columns: 0: e1 (string) 1: e2 (integer) + Cost Estimates:Estimated Node Cardinality: -1.0 + Query:SELECT g_0.e1 AS c_0, g_0.e2 AS c_1 FROM pm1.g2 AS g_0 ORDER BY c_0 + Model Name:pm1 + Join Strategy:ENHANCED SORT JOIN (SORT/ALREADY_SORTED) + Join Type:INNER JOIN + Join Criteria:pm1.g3.e1=pm1.g2.e1 + Select Columns: 0: pm1.g1.e1 1: pm1.g2.e2 2: pm1.g3.e3
Note that the nested join node is using a merge join and expects the source queries from each side to produce the expected ordering for the join. The parent join is an enhanced sort join which can delay the decision to perform sorting based upon the incoming rows. Note that the outer join from the user query has been modified to an inner join since none of the null inner values can be present in the query result.
The same plan in xml form looks like:
<?xml version="1.0" encoding="UTF-8"?> <node name="ProjectNode"> <property name="Output Columns"> <value>e1 (string)</value> <value>e2 (integer)</value> <value>e3 (boolean)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Child 0"> <node name="JoinNode"> <property name="Output Columns"> <value>e1 (string)</value> <value>e2 (integer)</value> <value>e3 (boolean)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Child 0"> <node name="JoinNode"> <property name="Output Columns"> <value>e1 (string)</value> <value>e1 (string)</value> <value>e3 (boolean)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Child 0"> <node name="AccessNode"> <property name="Output Columns"> <value>e1 (string)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Query"> <value>SELECT g_0.e1 AS c_0 FROM pm1.g1 AS g_0 ORDER BY c_0</value> </property> <property name="Model Name"> <value>pm1</value> </property> </node> </property> <property name="Child 1"> <node name="AccessNode"> <property name="Output Columns"> <value>e1 (string)</value> <value>e3 (boolean)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Query"> <value>SELECT g_0.e1 AS c_0, g_0.e3 AS c_1 FROM pm1.g3 AS g_0 ORDER BY c_0</value> </property> <property name="Model Name"> <value>pm1</value> </property> </node> </property> <property name="Join Strategy"> <value>MERGE JOIN (ALREADY_SORTED/ALREADY_SORTED)</value> </property> <property name="Join Type"> <value>INNER JOIN</value> </property> <property name="Join Criteria"> <value>pm1.g1.e1=pm1.g3.e1</value> </property> </node> </property> <property name="Child 1"> <node name="AccessNode"> <property name="Output Columns"> <value>e1 (string)</value> <value>e2 (integer)</value> </property> <property name="Cost Estimates"> <value>Estimated Node Cardinality: -1.0</value> </property> <property name="Query"> <value>SELECT g_0.e1 AS c_0, g_0.e2 AS c_1 FROM pm1.g2 AS g_0 ORDER BY c_0</value> </property> <property name="Model Name"> <value>pm1</value> </property> </node> </property> <property name="Join Strategy"> <value>ENHANCED SORT JOIN (SORT/ALREADY_SORTED)</value> </property> <property name="Join Type"> <value>INNER JOIN</value> </property> <property name="Join Criteria"> <value>pm1.g3.e1=pm1.g2.e1</value> </property> </node> </property> <property name="Select Columns"> <value>pm1.g1.e1</value> <value>pm1.g2.e2</value> <value>pm1.g3.e3</value> </property> </node>
Note that the same information appears in each of the plan forms. In some cases it can actually be easier to follow the simplified format of the debug plan final processor plan. From the Debug Log the same plan as above would appear as:
OPTIMIZATION COMPLETE: PROCESSOR PLAN: ProjectNode(0) output=[pm1.g1.e1, pm1.g2.e2, pm1.g3.e3] [pm1.g1.e1, pm1.g2.e2, pm1.g3.e3] JoinNode(1) [ENHANCED SORT JOIN (SORT/ALREADY_SORTED)] [INNER JOIN] criteria=[pm1.g3.e1=pm1.g2.e1] output=[pm1.g1.e1, pm1.g2.e2, pm1.g3.e3] JoinNode(2) [MERGE JOIN (ALREADY_SORTED/ALREADY_SORTED)] [INNER JOIN] criteria=[pm1.g1.e1=pm1.g3.e1] output=[pm1.g3.e1, pm1.g1.e1, pm1.g3.e3] AccessNode(3) output=[pm1.g1.e1] SELECT g_0.e1 AS c_0 FROM pm1.g1 AS g_0 ORDER BY c_0 AccessNode(4) output=[pm1.g3.e1, pm1.g3.e3] SELECT g_0.e1 AS c_0, g_0.e3 AS c_1 FROM pm1.g3 AS g_0 ORDER BY c_0 AccessNode(5) output=[pm1.g2.e1, pm1.g2.e2] SELECT g_0.e1 AS c_0, g_0.e2 AS c_1 FROM pm1.g2 AS g_0 ORDER BY c_0
XML document model queries and proecedure execution (including instead of triggers) use intermediate and final plan forms that include relational plans. Generally the structure of the xml/procedure plans will closely match their logical forms. It's the nested relational plans that will be of interest when analyzing performance issues.
A relational processing plan is created by the optimizer after the logical plan is manipulated by a series of rules. The application of rules is determined both by the query structure and by the rules themselves. The node structure of the debug plan resembles that of the processing plan, but the node types more logically represent SQL operations.
User SQL statements after rewrite are converted into a cannonical plan form. The connonical plan form most closely resembles the initial SQL structure. For example, a SQL statement such as SELECT max(pm1.g1.e1) FROM pm1.g1 WHERE e2 = 1 creates a logical plan:
Project(groups=[anon_grp0], props={PROJECT_COLS=[anon_grp0.agg0 AS expr1]}) Group(groups=[anon_grp0], props={SYMBOL_MAP={anon_grp0.agg0=MAX(pm1.g1.e1)}}) Select(groups=[pm1.g1], props={SELECT_CRITERIA=e2 = 1}) Source(groups=[pm1.g1])
Here the Source corresponds to the FROM clause, the Select corresponds to the WHERE clause, the Group corresponds to the implied grouping to create the max aggregate, and the Project corresponds to the SELECT clause.
Note that the affect of grouping generates what is effectively an inline view, anon_grp0, to handle the projection of values created by the grouping.
Each node has a set of applicable properties that are typically shown on the node.
Plan rule manipulate the plan tree, fire other rules, and drive the optimization process. The structure of the query determines the initial set of rules. Each rule is designed to perform a narrow set of tasks. Some rules can be run multiple times. Some rules require a specific set of precursors to run properly.
As each relational sub plan is optimized, the plan will show what is being optimized and it's canonical form:
OPTIMIZE: SELECT e1 FROM (SELECT e1 FROM pm1.g1) AS x ---------------------------------------------------------------------------- GENERATE CANONICAL: SELECT e1 FROM (SELECT e1 FROM pm1.g1) AS x CANONICAL PLAN: Project(groups=[x], props={PROJECT_COLS=[e1]}) Source(groups=[x], props={NESTED_COMMAND=SELECT e1 FROM pm1.g1, SYMBOL_MAP={x.e1=e1}}) Project(groups=[pm1.g1], props={PROJECT_COLS=[e1]}) Source(groups=[pm1.g1])
With more complicated user queries, such as a procedure invocation or one containing subqueries, the sub plans may be nested within the overall plan. Each plan ends by showing the final processing plan:
---------------------------------------------------------------------------- OPTIMIZATION COMPLETE: PROCESSOR PLAN: AccessNode(0) output=[e1] SELECT g_0.e1 FROM pm1.g1 AS g_0
The affect of rules can be seen by the state of the plan tree before and after the rule fires. For example, the debug log below shows the application of rule merge virtual, which will remove the "x" inline view layer:
EXECUTING AssignOutputElements AFTER: Project(groups=[x], props={PROJECT_COLS=[e1], OUTPUT_COLS=[e1]}) Source(groups=[x], props={NESTED_COMMAND=SELECT e1 FROM pm1.g1, SYMBOL_MAP={x.e1=e1}, OUTPUT_COLS=[e1]}) Project(groups=[pm1.g1], props={PROJECT_COLS=[e1], OUTPUT_COLS=[e1]}) Access(groups=[pm1.g1], props={SOURCE_HINT=null, MODEL_ID=Schema name=pm1, nameInSource=null, uuid=3335, OUTPUT_COLS=[e1]}) Source(groups=[pm1.g1], props={OUTPUT_COLS=[e1]}) ============================================================================ EXECUTING MergeVirtual AFTER: Project(groups=[pm1.g1], props={PROJECT_COLS=[e1], OUTPUT_COLS=[e1]}) Access(groups=[pm1.g1], props={SOURCE_HINT=null, MODEL_ID=Schema name=pm1, nameInSource=null, uuid=3335, OUTPUT_COLS=[e1]}) Source(groups=[pm1.g1])
Some important planning decisions are shown in the plan as they occur as an annotation. For example the snippet below shows that the access node could not be raised as the parent select node contained an unsupported subquery.
Project(groups=[pm1.g1], props={PROJECT_COLS=[e1], OUTPUT_COLS=null}) Select(groups=[pm1.g1], props={SELECT_CRITERIA=e1 IN /*+ NO_UNNEST */ (SELECT e1 FROM pm2.g1), OUTPUT_COLS=null}) Access(groups=[pm1.g1], props={SOURCE_HINT=null, MODEL_ID=Schema name=pm1, nameInSource=null, uuid=3341, OUTPUT_COLS=null}) Source(groups=[pm1.g1], props={OUTPUT_COLS=null}) ============================================================================ EXECUTING RaiseAccess LOW Relational Planner SubqueryIn is not supported by source pm1 - e1 IN /*+ NO_UNNEST */ (SELECT e1 FROM pm2.g1) was not pushed AFTER: Project(groups=[pm1.g1]) Select(groups=[pm1.g1], props={SELECT_CRITERIA=e1 IN /*+ NO_UNNEST */ (SELECT e1 FROM pm2.g1), OUTPUT_COLS=null}) Access(groups=[pm1.g1], props={SOURCE_HINT=null, MODEL_ID=Schema name=pm1, nameInSource=null, uuid=3341, OUTPUT_COLS=null}) Source(groups=[pm1.g1])
XQuery is eligible for specific optimizations. Document projection is the most common optimization. It will be shown in the debug plan as an annotation. For example with the user query containing "xmltable('/a/b' passing doc columns x string path '@x', val string path '/.')", the debug plan would show a tree of the document that will effectively be used by the context and path XQuerys:
MEDIUM XQuery Planning Projection conditions met for /a/b - Document projection will be used childelement(Q{}a) childelement(Q{}b) attributeattribute(Q{}x) childtext() childtext()