Getting a Query Plan

    Query Plans
    Page edited by             Steven Hawkins

                         Changes (3)

            ...

            h1. Debug Plans 

            h2. All Node Properties 

            A relational processing plan is created by the optimizer after the logical plan is manipulated by a series of rules.  The application of rules is determined both by the query structure and by the rules themselves.  The node structure of the debug plan resembles that of the processing plan, but the node types more logically represent SQL operations. 

            h2. All Nodes 

* ACCESS - a source access or plan execution. 
* DUP_REMOVE - removes duplicate rows 
* JOIN - a join (LEFT OUTER, FULL OUTER, INNER, CROSS, SEMI, etc.) 
* PROJECT - a projection of tuple values 
* SELECT - a filtering of tuples 
* SORT - an ordering operation, which may be inserted to process other operations such as joins 
* SOURCE - any logical source of tuples including an inline view, a source access, XMLTABLE, etc. 
* GROUP - a grouping operation 
* SET_OP - a set operation (UNION/INTERSECT/EXCEPT) 
* NULL - a source of no tuples 
* TUPLE_LIMIT - row offset / limit 

User SQL statements after rewrite are converted into a cannonical plan form.  The connonical plan form most closely resembles the initial SQL structure.  For exampl, a SQL statement such as SELECT max(col) FROM tbl WHERE x = 1 creates a logical plan: 

{code} 
Project(groups=[anon_grp0], props={PROJECT_COLS=[anon_grp0.agg0 AS expr1]}) 
  Group(groups=[anon_grp0], props={SYMBOL_MAP={anon_grp0.agg0=MAX(pm1.g1.e1)}}) 
    Select(groups=[pm1.g1], props={SELECT_CRITERIA=e2 = 1}) 
      Source(groups=[pm1.g1]) 
{code} 

h2. Node Properties 

ATOMIC_REQUEST,      // Command 
        MODEL_ID,            // Object (model ID) 
        PROCEDURE_CRITERIA, 
        PROCEDURE_INPUTS, 
        PROCEDURE_DEFAULTS, 
        IS_MULTI_SOURCE, 
        SOURCE_NAME, 
        CONFORMED_SOURCES, //Set <model id> 

        // Set operation properties  
        SET_OPERATION,      // SetOperation 
        USE_ALL,            // Boolean 

        // Join node properties 
        JOIN_CRITERIA,      // List <CompareCriteria> 
        JOIN_TYPE,          // JoinType 
        JOIN_STRATEGY,      // JoinStrategyType 
        LEFT_EXPRESSIONS,   // List <SingleElementSymbol>  
        RIGHT_EXPRESSIONS,  // List <SingleElementSymbol>  
        DEPENDENT_VALUE_SOURCE, // String 
        NON_EQUI_JOIN_CRITERIA,      // List <CompareCriteria> 
        SORT_LEFT,  // SortOption 
        SORT_RIGHT,     // SortOption 
        IS_OPTIONAL,          // Boolean 
        IS_LEFT_DISTINCT, 	// Boolean 
        IS_RIGHT_DISTINCT, 	// Boolean 
        IS_SEMI_DEP,		// Boolean 
        PRESERVE, 

        // Project node properties 
        PROJECT_COLS,       // List <SingleElementSymbol> 
        INTO_GROUP,         // GroupSymbol 
        HAS_WINDOW_FUNCTIONS,		// Boolean 

        // Select node properties 
        SELECT_CRITERIA,    // Criteria 
        IS_HAVING,          // Boolean 
        //phantom nodes represent the previous position of criteria that has been pushed across a source, group, or union node. 
        //phantom nodes are used by RuleCopyCriteria and removed by RuleCleanCriteria. 
        IS_PHANTOM,         // Boolean 
        IS_TEMPORARY,         // Boolean 
        IS_COPIED,           // Boolean - used in CopyCriteria to mark which selects have already been copied 
        IS_PUSHED,           // true if this node has already been pushed 
        IS_DEPENDENT_SET, // Boolean - only used with dependent joins 

        // Sort node properties 
        SORT_ORDER,         // OrderBy 
        UNRELATED_SORT,     // Boolean 
        IS_DUP_REMOVAL,		// Boolean 

        // Source node properties 
        SYMBOL_MAP,         // SymbolMap 
        PARTITION_INFO,		// Map<ElementSymbol, List<Set<Constant>>> - it will only be consistent in the initial stages of planning 
        VIRTUAL_COMMAND,    // Command 
        MAKE_DEP,           // Option.Makedep 
        PROCESSOR_PLAN,     // ProcessorPlan for non-relational sub plan 
        NESTED_COMMAND,     // Command for nested processor plan 
        TABLE_FUNCTION,     // Table Function 
        CORRELATED_REFERENCES,  // SymbolMap 
        MAKE_NOT_DEP,       // Boolean 
        INLINE_VIEW,        // If the source node represents an inline view 
        NO_UNNEST, 
        MAKE_IND, 

        // Group node properties 
        GROUP_COLS,         // List <Expression> 
        ROLLUP,             // Boolean 

        // Special constant used in converting plan to process for all nodes 
        OUTPUT_COLS,        // List <SingleElementSymbol> 

        // Plan Node Cost Estimate Constants 
        EST_SET_SIZE,        // Integer represents the estimated set size this node would produce for a sibling node as the independent node in a dependent join scenario 
        EST_DEP_CARDINALITY, // Float value that represents the estimated cardinality (amount of rows) produced by this node as the dependent node in a dependent join scenario 
        EST_DEP_JOIN_COST,   // Float value that represents the estimated cost of a dependent join (the join strategy for this could be Nested Loop or Merge) 
        EST_JOIN_COST,       // Float value that represents the estimated cost of a merge join (the join strategy for this could be Nested Loop or Merge) 
        EST_CARDINALITY,     // Float represents the estimated cardinality (amount of rows) produced by this node 
        EST_COL_STATS, 
        EST_SELECTIVITY,     // Float that represents the selectivity of a criteria node 

        // Tuple limit and offset 
        MAX_TUPLE_LIMIT,     // Expression that evaluates to the max number of tuples generated 
        OFFSET_TUPLE_COUNT,  // Expression that evaluates to the tuple offset of the starting tuple 
        IS_IMPLICIT_LIMIT,   // Boolean if the limit is created by the rewriter as part of a subquery optimization 
        IS_NON_STRICT,		 // Boolean if the unordered limit should not be enforced strictly 

        // Common AP Information 
        ACCESS_PATTERNS,     // Collection <List <Object element ID> > 
        ACCESS_PATTERN_USED, // List <Object element ID> 
        REQUIRED_ACCESS_PATTERN_GROUPS,  

        CONSTRAINT, 
        SOURCE_HINT, 
        SUB_PLAN, 
        SUB_PLANS, 

h2. Rules 

Access Pattern Validation 
Apply Security 
Assign Output ElementSymbol 
Calculate Cost 
Choose Dependent 
Choose Join Stategy 
Clean Criteria 
Collapse Source 
Copy Criteria 
Decompose Join 
Implement Join Strategy 
Merge Criteria 
Merge Virtual 
Place Access 
Plan Joins 
Plan Procedures 
Plan Sorts 
Plan Unions 
Plan Aggregates 
Push Limit 
Push Non-Join Criteria 
Push Select Criteria 
Raise Access 
Raise Null 
Remove Optional Joins 
Substitute Expressions 
Validate Where All 

            h2. Reading a Debug Plan 

            ...

                            Full Content

        When integrating information using a federated query planner it is useful to view the query plans to better understand how information is being accessed and processed, and to troubleshoot problems.

A query plan is a set of instructions created by a query engine for executing a command submitted by a user or application. The purpose of the query plan is to execute the user's query in as efficient a way as possible.

Getting a Query Plan

You can get a query plan any time you execute a command. The SQL options available are as follows:

SET SHOWPLAN [ON|DEBUG]- Returns the processing plan or the plan and the full planner debug log.  See also the SET Statement.

With the above options, the query plan is available from the Statement object by casting to the org.teiid.jdbc.TeiidStatement interface or by using the "SHOW PLAN" statement.

Retrieving a Query Plan

statement.execute("set showplan on");
ResultSet rs = statement.executeQuery("select ...");
TeiidStatement tstatement = statement.unwrap(TeiidStatement.class);
PlanNode queryPlan = tstatement.getPlanDescription();
System.out.println(queryPlan);

The query plan is made available automatically in several of Teiid's tools.

Analyzing a Query Plan

Once a query plan has been obtained you will most commonly be looking for:

	Source pushdown -- what parts of the query that got pushed to each source

		Ensure that any predicates especially against indexes are pushed

	Joins - as federated joins can be quite expensive

		Join ordering - typically influenced by costing
		Join criteria type mismatches.
		Join algorithm used - merge, enhanced merge, nested loop, etc.

	Presence of federated optimizations, such as dependent joins.

	Ensure hints have the desired affects - see Hints and Options, hints in the FROM Clause, Subquery Optimization, and source hints below.

All of the above information can be determined from the processing plan. You will typically be interested in analyzing the textual form of the final processing plan.  To understand why particular decisions are made for debugging or support you will want to obtain the full debug log which will contain the intermediate planning steps as well as annotations as to why specific pushdown decisions are made.  

A query plan consists of a set of nodes organized in a tree structure. If you are executing a procedure or generating an XML document from an XML Document Model, the overall query plan will contain additional information related the surrounding procedural execution.

In a procedural context the ordering of child nodes implies the order of execution. In most other situation, child nodes may be executed in any order even in parallel. Only in specific optimizations, such as dependent join, will the children of a join execute serially.

Relational Execution Plans

Relational plans represent the processing plan that is composed of nodes representing building blocks of logical relational operations. Relational processing plans differ from logical debug relational plans in that they will contain additional operations and execution specifics that were chosen by the optimizer.

The nodes for a relational query plan are:

	Access - Access a source. A source query is sent to the connection factory associated with the source. [For a dependent join, this node is called Dependent Access.]

	Dependent Procedure Access - Access a stored procedure on a source using multiple sets of input values.

	Batched Update - Processes a set of updates as a batch.

	Project - Defines the columns returned from the node. This does not alter the number of records returned.

	Project Into - Like a normal project, but outputs rows into a target table.

	Insert Plan Execution - Similar to a project into, but executes a plan rather than a source query.  Typically created when executing an insert into view with a query expression.

	Window Function Project - Like a normal project, but includes window functions.

	Select - Select is a criteria evaluation filter node (WHERE / HAVING).

	Join - Defines the join type, join criteria, and join strategy (merge or nested loop).

	Union All - There are no properties for this node, it just passes rows through from it's children.  Depending upon other factors, such as if there is a transaction or the source query concurrency allowed, not all of the union children will execute in parallel.

	Sort - Defines the columns to sort on, the sort direction for each column, and whether to remove duplicates or not.

	Dup Remove - Removes duplicate rows.  The processing uses a tree structure to detect duplicates so that results will effectively stream at the cost of IO operations.

	Grouping - Groups sets of rows into groups and evaluates aggregate functions.

	Null - A node that produces no rows. Usually replaces a Select node where the criteria is always false (and whatever tree is underneath). There are no properties for this node.

	Plan Execution - Executes another sub plan. Typically the sub plan will be a non-relational plan.

	Dependent Procedure Execution - Executes a sub plan using multiple sets of input values.

	Limit - Returns a specified number of rows, then stops processing. Also processes an offset if present.

	XML Table - Evaluates XMLTABLE. The debug plan will contain more information about the XQuery/XPath with regards to their optimization - see the XQuery section below or XQuery Optimization.

	Text Table - Evaluates TEXTTABLE

	Array Table - Evaluates ARRAYTABLE

	Object Table - Evaluates OBJECTTABLE

Node Statistics

Every node has a set of statistics that are output. These can be used to determine the amount of data flowing through the node.

 Statistic 
 Description 
 Units 

 Node Output Rows 
 Number of records output from the node 
 count 

 Node Process Time 
 Time processing in this node only 
 millisec 

 Node Cumulative Process Time 
 Elapsed time from beginning of processing to end 
 millisec 

 Node Cumulative Next Batch Process Time 
 Time processing in this node + child nodes 
 millisec 

 Node Next Batch Calls 
 Number of times a node was called for processing 
 count 

 Node Blocks 
 Number of times a blocked exception was thrown by this node or a child 
 count 

In addition to node statistics, some nodes display cost estimates computed at the node.

 Cost Estimates 
 Description 
 Units 

 Estimated Node Cardinality 
 Estimated number of records that will be output from the node; -1 if unknown 
 count 

The root node will display additional information.

 Top level Statistics 
 Description 
 Units 

 Data Bytes Sent 
 The size of the serialized data result (row and lob values) sent to the client 
 bytes 

Reading a Processor Plan

Source Hints

Teiid user and transformation queries can contain a meta source hint that can provide additional information to source queries. The source hint has the form:

/*+ sh[[ KEEP ALIASES]:'arg'] source-name[ KEEP ALIASES]:'arg1' ... */

	The source hint is expected to appear after the query (SELECT, INSERT, UPDATE, DELETE) keyword.
	Source hints may appear in any subquery or in views.  All hints applicable to a given source query will be collected and pushed down together as a list.  The order of the hints is not guaranteed.
	The sh arg is optional and is passed to all source queries via the ExecutionContext.getGeneralHints method.  The additional args should have a source-name that matches the source name assigned to the translator in the VDB.  If the source-name matches, the hint values will be supplied via the ExecutionContext.getSourceHints method.  See the Developer's Guide for more on using an ExecutionContext.
	Each of the arg values has the form of a string literal - it must be surrounded in single quotes and a single quote can be escaped with another single quote. Only the Oracle translator does anything with source hints by default.  The Oracle translator will use both the source hint and the general hint (in that order) if available to form an Oracle hint enclosed in /*+ ... */.
	If the KEEP ALIASES option is used either for the general hint or on the applicable source specific hint, then the table/view aliases from the user query and any nested views will be preserved in the push-down query.  This is useful in situations where the source hint may need to reference aliases and the user does not wish to rely on the generated aliases (which can be seen in the query plan in the relevant source queries - see above).  However in some situations this may result in an invalid source query if the preserved alias names are not valid for the source or result in a name collision.  If the usage of KEEP ALIASES results in an error, the query could be modified by preventing view removal with the NO_UNNEST hint, the aliases modified, or the KEEP ALIASES option could be removed and the query plan used to determine the generated alias names.

Sample Source Hints

SELECT /*+ sh:'general hint' */ ... 

SELECT /*+ sh KEEP ALIASES:'general hint' my-oracle:'oracle hint' */ ... 

Debug Plans

A relational processing plan is created by the optimizer after the logical plan is manipulated by a series of rules.  The application of rules is determined both by the query structure and by the rules themselves.  The node structure of the debug plan resembles that of the processing plan, but the node types more logically represent SQL operations.

All Nodes

	ACCESS - a source access or plan execution.
	DUP_REMOVE - removes duplicate rows
	JOIN - a join (LEFT OUTER, FULL OUTER, INNER, CROSS, SEMI, etc.)
	PROJECT - a projection of tuple values
	SELECT - a filtering of tuples
	SORT - an ordering operation, which may be inserted to process other operations such as joins
	SOURCE - any logical source of tuples including an inline view, a source access, XMLTABLE, etc.
	GROUP - a grouping operation
	SET_OP - a set operation (UNION/INTERSECT/EXCEPT)
	NULL - a source of no tuples
	TUPLE_LIMIT - row offset / limit

User SQL statements after rewrite are converted into a cannonical plan form.  The connonical plan form most closely resembles the initial SQL structure.  For exampl, a SQL statement such as SELECT max(col) FROM tbl WHERE x = 1 creates a logical plan:

Project(groups=[anon_grp0], props={PROJECT_COLS=[anon_grp0.agg0 AS expr1]})
  Group(groups=[anon_grp0], props={SYMBOL_MAP={anon_grp0.agg0=MAX(pm1.g1.e1)}})
    Select(groups=[pm1.g1], props={SELECT_CRITERIA=e2 = 1})
      Source(groups=[pm1.g1])

Node Properties

ATOMIC_REQUEST,      // Command

        MODEL_ID,            // Object (model ID)

        PROCEDURE_CRITERIA,

        PROCEDURE_INPUTS,

        PROCEDURE_DEFAULTS,

        IS_MULTI_SOURCE,

        SOURCE_NAME,

        CONFORMED_SOURCES, //Set <model id>

        // Set operation properties 

        SET_OPERATION,      // SetOperation

        USE_ALL,            // Boolean

        // Join node properties

        JOIN_CRITERIA,      // List <CompareCriteria>

        JOIN_TYPE,          // JoinType

        JOIN_STRATEGY,      // JoinStrategyType

        LEFT_EXPRESSIONS,   // List <SingleElementSymbol> 

        RIGHT_EXPRESSIONS,  // List <SingleElementSymbol> 

        DEPENDENT_VALUE_SOURCE, // String

        NON_EQUI_JOIN_CRITERIA,      // List <CompareCriteria>

        SORT_LEFT,  // SortOption

        SORT_RIGHT,     // SortOption

        IS_OPTIONAL,          // Boolean

        IS_LEFT_DISTINCT, 	// Boolean

        IS_RIGHT_DISTINCT, 	// Boolean

        IS_SEMI_DEP,		// Boolean

        PRESERVE,

        // Project node properties

        PROJECT_COLS,       // List <SingleElementSymbol>

        INTO_GROUP,         // GroupSymbol

        HAS_WINDOW_FUNCTIONS,		// Boolean

        // Select node properties

        SELECT_CRITERIA,    // Criteria

        IS_HAVING,          // Boolean

        //phantom nodes represent the previous position of criteria that has been pushed across a source, group, or union node.

        //phantom nodes are used by RuleCopyCriteria and removed by RuleCleanCriteria.

        IS_PHANTOM,         // Boolean

        IS_TEMPORARY,         // Boolean

        IS_COPIED,           // Boolean - used in CopyCriteria to mark which selects have already been copied

        IS_PUSHED,           // true if this node has already been pushed

        IS_DEPENDENT_SET, // Boolean - only used with dependent joins

        // Sort node properties

        SORT_ORDER,         // OrderBy

        UNRELATED_SORT,     // Boolean

        IS_DUP_REMOVAL,		// Boolean

        // Source node properties

        SYMBOL_MAP,         // SymbolMap

        PARTITION_INFO,		// Map<ElementSymbol, List<Set<Constant>>> - it will only be consistent in the initial stages of planning

        VIRTUAL_COMMAND,    // Command

        MAKE_DEP,           // Option.Makedep

        PROCESSOR_PLAN,     // ProcessorPlan for non-relational sub plan

        NESTED_COMMAND,     // Command for nested processor plan

        TABLE_FUNCTION,     // Table Function

        CORRELATED_REFERENCES,  // SymbolMap

        MAKE_NOT_DEP,       // Boolean

        INLINE_VIEW,        // If the source node represents an inline view

        NO_UNNEST,

        MAKE_IND,

        // Group node properties

        GROUP_COLS,         // List <Expression>

        ROLLUP,             // Boolean

        // Special constant used in converting plan to process for all nodes

        OUTPUT_COLS,        // List <SingleElementSymbol>

        // Plan Node Cost Estimate Constants

        EST_SET_SIZE,        // Integer represents the estimated set size this node would produce for a sibling node as the independent node in a dependent join scenario

        EST_DEP_CARDINALITY, // Float value that represents the estimated cardinality (amount of rows) produced by this node as the dependent node in a dependent join scenario

        EST_DEP_JOIN_COST,   // Float value that represents the estimated cost of a dependent join (the join strategy for this could be Nested Loop or Merge)

        EST_JOIN_COST,       // Float value that represents the estimated cost of a merge join (the join strategy for this could be Nested Loop or Merge)

        EST_CARDINALITY,     // Float represents the estimated cardinality (amount of rows) produced by this node

        EST_COL_STATS,

        EST_SELECTIVITY,     // Float that represents the selectivity of a criteria node

        // Tuple limit and offset

        MAX_TUPLE_LIMIT,     // Expression that evaluates to the max number of tuples generated

        OFFSET_TUPLE_COUNT,  // Expression that evaluates to the tuple offset of the starting tuple

        IS_IMPLICIT_LIMIT,   // Boolean if the limit is created by the rewriter as part of a subquery optimization

        IS_NON_STRICT,		 // Boolean if the unordered limit should not be enforced strictly

        // Common AP Information

        ACCESS_PATTERNS,     // Collection <List <Object element ID> >

        ACCESS_PATTERN_USED, // List <Object element ID>

        REQUIRED_ACCESS_PATTERN_GROUPS, 

        CONSTRAINT,

        SOURCE_HINT,

        SUB_PLAN,

        SUB_PLANS,

Rules

Access Pattern Validation

Apply Security

Assign Output ElementSymbol

Calculate Cost

Choose Dependent

Choose Join Stategy

Clean Criteria

Collapse Source

Copy Criteria

Decompose Join

Implement Join Strategy

Merge Criteria

Merge Virtual

Place Access

Plan Joins

Plan Procedures

Plan Sorts

Plan Unions

Plan Aggregates

Push Limit

Push Non-Join Criteria

Push Select Criteria

Raise Access

Raise Null

Remove Optional Joins

Substitute Expressions

Validate Where All

Reading a Debug Plan

XQuery

                        Stop watching space
            |
                Change email notification preferences

        View Online
        |
        View Changes
                |
        Add Comment

Statistic	Description	Units
Node Output Rows	Number of records output from the node	count
Node Process Time	Time processing in this node only	millisec
Node Cumulative Process Time	Elapsed time from beginning of processing to end	millisec
Node Cumulative Next Batch Process Time	Time processing in this node + child nodes	millisec
Node Next Batch Calls	Number of times a node was called for processing	count
Node Blocks	Number of times a blocked exception was thrown by this node or a child	count