[teiid-commits] teiid SVN: r3059 - in trunk: engine/src/main/java/org/teiid/common/buffer and 7 other directories.

teiid-commits at lists.jboss.org teiid-commits at lists.jboss.org
Mon Apr 4 18:35:34 EDT 2011


Author: shawkins
Date: 2011-04-04 18:35:34 -0400 (Mon, 04 Apr 2011)
New Revision: 3059

Modified:
   trunk/build/kits/jboss-container/teiid-releasenotes.html
   trunk/engine/src/main/java/org/teiid/common/buffer/STree.java
   trunk/engine/src/main/java/org/teiid/query/optimizer/relational/PlanToProcessConverter.java
   trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/CapabilitiesUtil.java
   trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/JoinRegion.java
   trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/NewCalculateCostUtil.java
   trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/RuleChooseDependent.java
   trunk/engine/src/main/java/org/teiid/query/processor/relational/AccessNode.java
   trunk/engine/src/main/java/org/teiid/query/processor/relational/DependentCriteriaProcessor.java
   trunk/engine/src/main/java/org/teiid/query/processor/relational/EnhancedSortMergeJoinStrategy.java
   trunk/engine/src/main/java/org/teiid/query/sql/lang/DependentSetCriteria.java
   trunk/engine/src/test/java/org/teiid/common/buffer/TestSTree.java
   trunk/engine/src/test/java/org/teiid/query/processor/TestDependentJoins.java
   trunk/engine/src/test/java/org/teiid/query/processor/TestProcessor.java
   trunk/engine/src/test/java/org/teiid/query/processor/relational/TestAccessNode.java
Log:
TEIID-1533 adding cost based back off of dependent joins

Modified: trunk/build/kits/jboss-container/teiid-releasenotes.html
===================================================================
--- trunk/build/kits/jboss-container/teiid-releasenotes.html	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/build/kits/jboss-container/teiid-releasenotes.html	2011-04-04 22:35:34 UTC (rev 3059)
@@ -41,13 +41,14 @@
 	<LI><B>InterSystems Cache</B> - InterSystems Cache database translator is now available to use as supported source under Teiid.
 	<LI><B>userRequestSourceConcurrency</B> - was added to control the number of concurrent source queries allowed for each user request.
 	<LI><B>Memory Management Improvements</B> - maxReserveBatchColumns and maxProcessingBatchesColumns will be default be determined automatically and will more reliably prevent memory issues.  See the admin guide for more.
-	<LI><B>Subquery optimization control</B> - added the MJ and NO_UNNEST hints and the org.teiid.subqueryUnnestDefault system property to control the optimization of subqueries to traditional joins or to a merge join implemenation of a semijoin or antijoin.
+	<LI><B>Subquery optimization control</B> - added the MJ and NO_UNNEST hints and the org.teiid.subqueryUnnestDefault system property to control the optimization of subqueries to traditional joins or to a merge join implementation of a semijoin or antijoin.
 	<LI><B>Local connection threads</B> - local connection calling threads will be used to process work rather than using an engine thread.  This helps decouple the configuration of maxThreads.
 	<LI><B>Dependent Join Improvements</B> - several major improvements were made to increase performance and develop better plans.
 		<UL>
 			<LI><B>Improved Planning</B> - the decision to create a dependent join is now considered earlier in planning and is much more effective for dependent joins involving multiple unrelated independent tables. 
 			<LI><B>IN predicate splitting</B> - the planner can now split large dependent IN predicates into multiple IN predicates, which is controlled by the translator property MaxDepdendentInPredicates.  This allows for much larger dependent joins to be performed as a single query.
-			<LI><B>Dependent query parallization</B> - when multiple dependent queries are still required, then they will be run in parallel (up to MaxUserSourceRequestConcurrency), rather than sequentially. 
+			<LI><B>Dependent query parallization</B> - when multiple dependent queries are still required, then they will be run in parallel (up to MaxUserSourceRequestConcurrency), rather than sequentially.
+			<LI><B>Cost based back-off</B> - for cost based dependent joins if the number of independent values is too large, then the join will be performed as normal. 
 		</UL>
 	<LI><B>Enhanced Sort Join</B> - the partitioned merge join was replaced with an enhanced sort join.  The enhanced sort join will use the actual row counts from each side of the relation to perform a index based join if one side is small enough, a partial sort of the larger side and a repeated merge join if the tuples are unbalanced but one side is not small enough to form an index, or a standard sort merge join if the tuples are balanced.   
 </UL>

Modified: trunk/engine/src/main/java/org/teiid/common/buffer/STree.java
===================================================================
--- trunk/engine/src/main/java/org/teiid/common/buffer/STree.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/main/java/org/teiid/common/buffer/STree.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -195,8 +195,12 @@
 		}
 		List key = extractKey(tuple);
 		int level = 0;
-		if (mode != InsertMode.ORDERED || sizeHint == -1) {
-			level = randomLevel(); 
+		if (mode != InsertMode.ORDERED) {
+			if (sizeHint > -1) {
+				level = Math.min(sizeHint, randomLevel());
+			} else {
+				level = randomLevel();
+			}
 		} else if (!places.isEmpty() && places.getLast().values.getTuples().size() == pageSize) {
 			int row = rowCount.get();
 			while (row != 0 && row%pageSize == 0) {
@@ -228,6 +232,18 @@
 		return null;
 	}
 	
+	public int getExpectedHeight(int sizeHint) {
+		if (sizeHint == 0) {
+			return 0;
+		}
+		int logSize = 1;
+		while (sizeHint > this.pageSize) {
+			logSize++;
+			sizeHint/=this.pageSize;
+		}
+		return logSize;
+	}
+
 	List extractKey(List tuple) {
 		if (tuple.size() > keyLength) {
 			return new ArrayList(tuple.subList(0, keyLength));

Modified: trunk/engine/src/main/java/org/teiid/query/optimizer/relational/PlanToProcessConverter.java
===================================================================
--- trunk/engine/src/main/java/org/teiid/query/optimizer/relational/PlanToProcessConverter.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/main/java/org/teiid/query/optimizer/relational/PlanToProcessConverter.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -307,7 +307,7 @@
                             
                             if(modelID != null){
                                 depAccessNode.setMaxSetSize(CapabilitiesUtil.getMaxInCriteriaSize(modelID, metadata, capFinder));
-                                depAccessNode.setMaxPredicates(CapabilitiesUtil.getMaxDependentPredicatesSize(modelID, metadata, capFinder));   
+                                depAccessNode.setMaxPredicates(CapabilitiesUtil.getMaxDependentPredicates(modelID, metadata, capFinder));   
                             }
                             processNode = depAccessNode;
                             aNode = depAccessNode;

Modified: trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/CapabilitiesUtil.java
===================================================================
--- trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/CapabilitiesUtil.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/CapabilitiesUtil.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -321,7 +321,7 @@
         return value;
     }
     
-    public static int getMaxDependentPredicatesSize(Object modelID, QueryMetadataInterface metadata, CapabilitiesFinder capFinder) 
+    public static int getMaxDependentPredicates(Object modelID, QueryMetadataInterface metadata, CapabilitiesFinder capFinder) 
     throws QueryMetadataException, TeiidComponentException {
     	return getProperty(Capability.MAX_DEPENDENT_PREDICATES, modelID, metadata, capFinder);
     }

Modified: trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/JoinRegion.java
===================================================================
--- trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/JoinRegion.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/JoinRegion.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -365,7 +365,7 @@
 		if (leftExpressions.isEmpty()) {
 			return null;
 		}
-		return NewCalculateCostUtil.computeCostForDepJoin(indNode, depNode, leftExpressions, rightExpressions, metadata, capFinder, context);
+		return NewCalculateCostUtil.computeCostForDepJoin(indNode, depNode, leftExpressions, rightExpressions, metadata, capFinder, context).expectedCardinality;
 	}
     
     /**

Modified: trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/NewCalculateCostUtil.java
===================================================================
--- trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/NewCalculateCostUtil.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/NewCalculateCostUtil.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -97,6 +97,12 @@
     	NDV,
     	NNV
     }
+    
+    public static class DependentCostAnalysis {
+    	Float[] maxNdv;
+    	Float[] expectedNdv;
+    	Float expectedCardinality;
+    }
 
     @SuppressWarnings("serial")
 	private static class ColStats extends LinkedHashMap<Expression, float[]> {
@@ -1112,7 +1118,7 @@
      * @throws QueryPlannerException 
      * 
      */
-    public static Float computeCostForDepJoin(PlanNode joinNode, boolean leftIndependent, QueryMetadataInterface metadata, CapabilitiesFinder capFinder, CommandContext context) 
+    public static DependentCostAnalysis computeCostForDepJoin(PlanNode joinNode, boolean leftIndependent, QueryMetadataInterface metadata, CapabilitiesFinder capFinder, CommandContext context) 
         throws TeiidComponentException, QueryMetadataException, QueryPlannerException {
         
         PlanNode independentNode = leftIndependent?joinNode.getFirstChild():joinNode.getLastChild();
@@ -1125,8 +1131,8 @@
 				independentExpressions, dependentExpressions, metadata,
 				capFinder, context);
     }
-
-	public static Float computeCostForDepJoin(PlanNode independentNode,
+    
+	public static DependentCostAnalysis computeCostForDepJoin(PlanNode independentNode,
 			PlanNode dependentNode, List independentExpressions,
 			List dependentExpressions, QueryMetadataInterface metadata,
 			CapabilitiesFinder capFinder, CommandContext context)
@@ -1134,9 +1140,13 @@
 
         float independentCardinality = computeCostForTree(independentNode, metadata);
         float dependentCardinality = computeCostForTree(dependentNode, metadata);
-        
+
+        DependentCostAnalysis dca = new DependentCostAnalysis();
+        dca.maxNdv = new Float[independentExpressions.size()];
+        dca.expectedNdv = new Float[independentExpressions.size()];
+
         if (independentCardinality == UNKNOWN_VALUE || dependentCardinality == UNKNOWN_VALUE) {
-        	return null;
+        	return dca; //no cost information to be determined
         }
         
         float processorBatchSize = BufferManager.DEFAULT_PROCESSOR_BATCH_SIZE;
@@ -1153,56 +1163,9 @@
 			float indSymbolNDV = getNDVEstimate(independentNode, metadata, independentCardinality, indElements, true);
 			Expression depExpr = (Expression)dependentExpressions.get(i);
 			
-			LinkedList<PlanNode> critNodes = new LinkedList<PlanNode>();
-			LinkedList<PlanNode> initialTargets = new LinkedList<PlanNode>();
-			
 			LinkedList<Expression> depExpressions = new LinkedList<Expression>();
-			LinkedList<PlanNode> targets = new LinkedList<PlanNode>();
-			critNodes.add(RelationalPlanner.createSelectNode(new DependentSetCriteria(depExpr, null), false));
-			initialTargets.add(dependentNode);
-			while (!critNodes.isEmpty()) {
-				PlanNode critNode = critNodes.remove();
-				PlanNode initial = initialTargets.remove();
-				if (critNode.getGroups().isEmpty()) {
-					//TODO: we need to project constants up through a plan to avoid this case
-					continue;
-				}
-				PlanNode sourceNode = FrameUtil.findOriginatingNode(initial, critNode.getGroups());
-				PlanNode target = sourceNode;
-				if (initial != sourceNode) {
-					target = rpsc.examinePath(initial, sourceNode, metadata, capFinder);					
-				}
-				if (target != sourceNode || (sourceNode.getType() == NodeConstants.Types.SOURCE && sourceNode.getChildCount() == 0)) {
-					targets.add(target);
-					DependentSetCriteria dsc = (DependentSetCriteria)critNode.getProperty(Info.SELECT_CRITERIA);
-					depExpressions.add(dsc.getExpression());
-					continue;
-				}
-				if (sourceNode.getType() == NodeConstants.Types.SOURCE) {
-					PlanNode child = sourceNode.getFirstChild();
-		            child = FrameUtil.findOriginatingNode(child, child.getGroups());
-		            if (child != null && child.getType() == NodeConstants.Types.SET_OP) {
-		            	targets.add(target);
-						DependentSetCriteria dsc = (DependentSetCriteria)critNode.getProperty(Info.SELECT_CRITERIA);
-						depExpressions.add(dsc.getExpression());
-						//TODO: we need better handling for set op situations
-						continue;
-		            }
-					if (!rpsc.pushAcrossFrame(sourceNode, critNode, metadata)) {
-						targets.add(target);
-						DependentSetCriteria dsc = (DependentSetCriteria)critNode.getProperty(Info.SELECT_CRITERIA);
-						depExpressions.add(dsc.getExpression());
-					}
-					List<PlanNode> createdNodes = rpsc.getCreatedNodes();
-					for (PlanNode planNode : createdNodes) {
-						critNodes.add(planNode);
-						initialTargets.add(planNode.getFirstChild());
-						NodeEditor.removeChildNode(planNode.getParent(), planNode);
-					}
-					rpsc.getCreatedNodes().clear();
-				} 
-				//the source must be a null or project node, which we don't care about
-			}
+			LinkedList<PlanNode> targets = determineTargets(dependentNode,
+					metadata, capFinder, rpsc, depExpr, depExpressions);
 			
 			Iterator<Expression> exprIter = depExpressions.iterator(); 
 			for (Iterator<PlanNode> targetIter = targets.iterator(); targetIter.hasNext();) {
@@ -1216,6 +1179,11 @@
 		            setCriteriaBatchSize = CapabilitiesUtil.getMaxInCriteriaSize(RuleRaiseAccess.getModelIDFromAccess(accessNode, metadata), metadata, capFinder);
 		            if (setCriteriaBatchSize < 1) {
 		                setCriteriaBatchSize = indSymbolNDV;
+		            } else {
+		            	int numberOfSets = CapabilitiesUtil.getMaxDependentPredicates(RuleRaiseAccess.getModelIDFromAccess(accessNode, metadata), metadata, capFinder);
+		            	if (numberOfSets > 0) {
+		            		setCriteriaBatchSize *= Math.max(1, numberOfSets /dependentExpressions.size()); //scale down to be conservative 
+		            	}
 		            }
 		        } else if (indSymbolNDV > processorBatchSize) {
 	            	//don't bother making a virtual join dependent if they are likely to be large
@@ -1259,27 +1227,112 @@
 				if (!usesKey && accessNode != null && target.getType() == NodeConstants.Types.SOURCE && target.getChildCount() == 0) {
 					usesIndex = usesKey(depElems, target.getGroups(), metadata, false);
 				}
-		        float dependentAccessCardinality = Math.min(depTargetCardinality, depTargetCardinality * indSymbolNDV / depSymbolNDV);
-		        float scaledCardinality = Math.min(dependentCardinality, dependentCardinality * indSymbolNDV / depSymbolNDV);
-		        float numberComparisons = (usesIndex?safeLog(depTargetCardinality):depTargetCardinality) * (usesIndex?indSymbolNDV:safeLog(indSymbolNDV));
-		        float newDependentQueries = accessNode == null?0:(float)Math.ceil(indSymbolNDV / setCriteriaBatchSize);
-		        
-		        float relativeCost = newDependentQueries*procNewRequestTime;
-		        float relativeComparisonCost = (numberComparisons - safeLog(scaledCardinality) /*no longer needed by the join*/
-		            /*sort cost reduction, however it's always true if its on the source and using an index
-		              TODO: there are other cost reductions, which we could get by checking the other parent nodes */
-		        	+ (scaledCardinality*safeLog(scaledCardinality) - dependentCardinality*safeLog(dependentCardinality))) 
-		        	* compareTime;
-		        float relativeReadCost = (dependentAccessCardinality - depTargetCardinality)*readTime; //cardinality reductions
-		        
-		        if (relativeCost + relativeComparisonCost + relativeReadCost < 0) {
-		        	return scaledCardinality;
+		        float[] estimates = estimateCost(accessNode, setCriteriaBatchSize, usesIndex, depTargetCardinality, indSymbolNDV, dependentCardinality, depSymbolNDV);
+		        if (estimates[1] < 0) {
+		        	if (dca.expectedCardinality == null) {
+		        		dca.expectedCardinality = estimates[0];
+		        	} else {
+		        		dca.expectedCardinality = Math.min(dca.expectedCardinality, estimates[0]);
+		        	}
 		        }
+		        dca.expectedNdv[i] = indSymbolNDV;
+		        //use a quick binary search to find the max ndv
+		        float min = 0;
+		        float max = Math.max(Integer.MAX_VALUE, indSymbolNDV);
+		        for (int j = 0; j < 10; j++) {
+		        	if (estimates[1] > 1) {
+		        		max = indSymbolNDV;
+		        		indSymbolNDV = (indSymbolNDV + min)/2;
+		        	} else if (estimates[1] < 0) {
+		        		min = indSymbolNDV;
+		        		//we assume that values should be closer to the min side
+		        		indSymbolNDV = Math.min(indSymbolNDV * 8 + 1, (indSymbolNDV + max)/2);
+		        	} else {
+		        		break;
+		        	}
+		        	estimates = estimateCost(accessNode, setCriteriaBatchSize, usesIndex, depTargetCardinality, indSymbolNDV, dependentCardinality, depSymbolNDV);
+		        }
+		        dca.maxNdv[i] = indSymbolNDV;
 			}
 		}
+        return dca;
+	}
+	
+	private static float[] estimateCost(PlanNode accessNode, float setCriteriaBatchSize, boolean usesIndex, float depTargetCardinality, 
+			float indSymbolNDV, float dependentCardinality, float depSymbolNDV) {
+        float dependentAccessCardinality = Math.min(depTargetCardinality, depTargetCardinality * indSymbolNDV / depSymbolNDV);
+        float scaledCardinality = Math.min(dependentCardinality, dependentCardinality * indSymbolNDV / depSymbolNDV);
+		float numberComparisons = (usesIndex?safeLog(depTargetCardinality):depTargetCardinality) * (usesIndex?indSymbolNDV:safeLog(indSymbolNDV));
+        float newDependentQueries = accessNode == null?0:(float)Math.ceil(indSymbolNDV / setCriteriaBatchSize);
         
-        return null;
+        float relativeCost = newDependentQueries*procNewRequestTime;
+        float relativeComparisonCost = (numberComparisons - safeLog(scaledCardinality) /*no longer needed by the join*/
+            /*sort cost reduction, however it's always true if its on the source and using an index
+              TODO: there are other cost reductions, which we could get by checking the other parent nodes */
+        	+ (scaledCardinality*safeLog(scaledCardinality) - dependentCardinality*safeLog(dependentCardinality))) 
+        	* compareTime;
+        float relativeReadCost = (dependentAccessCardinality - depTargetCardinality)*readTime; //cardinality reductions
+        return new float[] {scaledCardinality, relativeCost + relativeComparisonCost + relativeReadCost};
 	}
+	
+	/**
+	 * For now we only consider a single target. In the future we may consider multiple.
+	 */
+	private static LinkedList<PlanNode> determineTargets(
+			PlanNode dependentNode, QueryMetadataInterface metadata,
+			CapabilitiesFinder capFinder, RulePushSelectCriteria rpsc,
+			Expression depExpr, LinkedList<Expression> depExpressions)
+			throws QueryPlannerException, TeiidComponentException {
+		LinkedList<PlanNode> targets = new LinkedList<PlanNode>();
+		LinkedList<PlanNode> critNodes = new LinkedList<PlanNode>();
+		critNodes.add(RelationalPlanner.createSelectNode(new DependentSetCriteria(depExpr, null), false));
+		LinkedList<PlanNode> initialTargets = new LinkedList<PlanNode>();
+		initialTargets.add(dependentNode);
+		while (!critNodes.isEmpty()) {
+			PlanNode critNode = critNodes.remove();
+			PlanNode initial = initialTargets.remove();
+			if (critNode.getGroups().isEmpty()) {
+				//TODO: we need to project constants up through a plan to avoid this case
+				continue;
+			}
+			PlanNode sourceNode = FrameUtil.findOriginatingNode(initial, critNode.getGroups());
+			PlanNode target = sourceNode;
+			if (initial != sourceNode) {
+				target = rpsc.examinePath(initial, sourceNode, metadata, capFinder);					
+			}
+			if (target != sourceNode || (sourceNode.getType() == NodeConstants.Types.SOURCE && sourceNode.getChildCount() == 0)) {
+				targets.add(target);
+				DependentSetCriteria dsc = (DependentSetCriteria)critNode.getProperty(Info.SELECT_CRITERIA);
+				depExpressions.add(dsc.getExpression());
+				continue;
+			}
+			if (sourceNode.getType() == NodeConstants.Types.SOURCE) {
+				PlanNode child = sourceNode.getFirstChild();
+		        child = FrameUtil.findOriginatingNode(child, child.getGroups());
+		        if (child != null && child.getType() == NodeConstants.Types.SET_OP) {
+		        	targets.add(target);
+					DependentSetCriteria dsc = (DependentSetCriteria)critNode.getProperty(Info.SELECT_CRITERIA);
+					depExpressions.add(dsc.getExpression());
+					//TODO: we need better handling for set op situations
+					continue;
+		        }
+				if (!rpsc.pushAcrossFrame(sourceNode, critNode, metadata)) {
+					targets.add(target);
+					DependentSetCriteria dsc = (DependentSetCriteria)critNode.getProperty(Info.SELECT_CRITERIA);
+					depExpressions.add(dsc.getExpression());
+				}
+				List<PlanNode> createdNodes = rpsc.getCreatedNodes();
+				for (PlanNode planNode : createdNodes) {
+					critNodes.add(planNode);
+					initialTargets.add(planNode.getFirstChild());
+					NodeEditor.removeChildNode(planNode.getParent(), planNode);
+				}
+				rpsc.getCreatedNodes().clear();
+			} 
+			//the source must be a null or project node, which we don't care about
+		}
+		return targets;
+	}
 
 	static float getNDVEstimate(PlanNode indNode,
 			QueryMetadataInterface metadata, float cardinality,

Modified: trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/RuleChooseDependent.java
===================================================================
--- trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/RuleChooseDependent.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/main/java/org/teiid/query/optimizer/relational/rules/RuleChooseDependent.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -41,6 +41,7 @@
 import org.teiid.query.optimizer.relational.plantree.NodeConstants;
 import org.teiid.query.optimizer.relational.plantree.NodeEditor;
 import org.teiid.query.optimizer.relational.plantree.PlanNode;
+import org.teiid.query.optimizer.relational.rules.NewCalculateCostUtil.DependentCostAnalysis;
 import org.teiid.query.sql.lang.DependentSetCriteria;
 import org.teiid.query.sql.lang.JoinType;
 import org.teiid.query.sql.symbol.ElementSymbol;
@@ -85,31 +86,31 @@
             
             PlanNode chosenNode = chooseDepWithoutCosting(sourceNode, bothCandidates?siblingNode:null, analysisRecord);
             if(chosenNode != null) {
-                pushCriteria |= markDependent(chosenNode, joinNode, metadata);
+                pushCriteria |= markDependent(chosenNode, joinNode, metadata, null);
                 continue;
             }   
             
-            boolean useDepJoin = NewCalculateCostUtil.computeCostForDepJoin(joinNode, !entry.leftCandidate, metadata, capFinder, context) != null;
+            DependentCostAnalysis dca = NewCalculateCostUtil.computeCostForDepJoin(joinNode, !entry.leftCandidate, metadata, capFinder, context);
             PlanNode dependentNode = sourceNode;
             
-            if (bothCandidates && !useDepJoin) {
-                useDepJoin = NewCalculateCostUtil.computeCostForDepJoin(joinNode, true, metadata, capFinder, context) != null;
-                if (useDepJoin) {
+            if (bothCandidates && dca.expectedCardinality == null) {
+                dca = NewCalculateCostUtil.computeCostForDepJoin(joinNode, true, metadata, capFinder, context);
+                if (dca.expectedCardinality != null) {
                     dependentNode = siblingNode;
                 }
             }
             
-            if (useDepJoin) {
-                pushCriteria |= markDependent(dependentNode, joinNode, metadata);
+            if (dca.expectedCardinality != null) {
+                pushCriteria |= markDependent(dependentNode, joinNode, metadata, dca);
             } else {
             	float sourceCost = NewCalculateCostUtil.computeCostForTree(sourceNode, metadata);
             	float siblingCost = NewCalculateCostUtil.computeCostForTree(siblingNode, metadata);
             	
                 if (bothCandidates && sourceCost != NewCalculateCostUtil.UNKNOWN_VALUE && sourceCost < RuleChooseDependent.DEFAULT_INDEPENDENT_CARDINALITY 
                 		&& (sourceCost < siblingCost || siblingCost == NewCalculateCostUtil.UNKNOWN_VALUE)) {
-                    pushCriteria |= markDependent(siblingNode, joinNode, metadata);
+                    pushCriteria |= markDependent(siblingNode, joinNode, metadata, null);
                 } else if (siblingCost != NewCalculateCostUtil.UNKNOWN_VALUE && siblingCost < RuleChooseDependent.DEFAULT_INDEPENDENT_CARDINALITY) {
-                    pushCriteria |= markDependent(sourceNode, joinNode, metadata);
+                    pushCriteria |= markDependent(sourceNode, joinNode, metadata, null);
                 }
             }
         }
@@ -260,10 +261,11 @@
     /**
      * Mark the specified access node to be made dependent
      * @param sourceNode Node to make dependent
+     * @param dca 
      * @throws TeiidComponentException 
      * @throws QueryMetadataException 
      */
-    boolean markDependent(PlanNode sourceNode, PlanNode joinNode, QueryMetadataInterface metadata) throws QueryMetadataException, TeiidComponentException {
+    boolean markDependent(PlanNode sourceNode, PlanNode joinNode, QueryMetadataInterface metadata, DependentCostAnalysis dca) throws QueryMetadataException, TeiidComponentException {
 
         boolean isLeft = joinNode.getFirstChild() == sourceNode;
         
@@ -279,7 +281,7 @@
         // Create DependentValueSource and set on the independent side as this will feed the values
         joinNode.setProperty(NodeConstants.Info.DEPENDENT_VALUE_SOURCE, id);
 
-        List<PlanNode> crits = getDependentCriteriaNodes(id, independentExpressions, dependentExpressions, isLeft?joinNode.getLastChild():joinNode.getFirstChild(), metadata);
+        List<PlanNode> crits = getDependentCriteriaNodes(id, independentExpressions, dependentExpressions, isLeft?joinNode.getLastChild():joinNode.getFirstChild(), metadata, dca);
         
         PlanNode newRoot = sourceNode;
         
@@ -303,21 +305,30 @@
      * @since 4.3
      */
     private List<PlanNode> getDependentCriteriaNodes(String id, List independentExpressions,
-                                           List dependentExpressions, PlanNode indNode, QueryMetadataInterface metadata) throws QueryMetadataException, TeiidComponentException {
+                                           List dependentExpressions, PlanNode indNode, QueryMetadataInterface metadata, DependentCostAnalysis dca) throws QueryMetadataException, TeiidComponentException {
         
         List<PlanNode> result = new LinkedList<PlanNode>();
         
-        Iterator depIter = dependentExpressions.iterator();
-        Iterator indepIter = independentExpressions.iterator();
+        Float cardinality = null;
         
-        float cardinality = NewCalculateCostUtil.computeCostForTree(indNode, metadata);
-        
-        while(depIter.hasNext()) {
-            Expression depExpr = (Expression) depIter.next();
-            Expression indepExpr = (Expression) indepIter.next();
+        for (int i = 0; i < dependentExpressions.size(); i++) {
+            Expression depExpr = (Expression) dependentExpressions.get(i);
+            Expression indepExpr = (Expression) independentExpressions.get(i);
             DependentSetCriteria crit = new DependentSetCriteria(SymbolMap.getExpression(depExpr), id);
-            Collection<ElementSymbol> elems = ElementCollectorVisitor.getElements(indepExpr, true);
-            float ndv = NewCalculateCostUtil.getNDVEstimate(indNode, metadata, cardinality, elems, true);
+            float ndv = NewCalculateCostUtil.UNKNOWN_VALUE;
+            if (dca != null && dca.expectedNdv[i] != null) {
+            	if (dca.expectedNdv[i] > 4*dca.maxNdv[i]) {
+            		continue; //not necessary to use
+            	}
+            	ndv = dca.expectedNdv[i];
+            	crit.setMaxNdv(dca.maxNdv[i]);
+            } else { 
+	            Collection<ElementSymbol> elems = ElementCollectorVisitor.getElements(indepExpr, true);
+	            if (cardinality == null) {
+	            	cardinality = NewCalculateCostUtil.computeCostForTree(indNode, metadata);
+	            }
+	            ndv = NewCalculateCostUtil.getNDVEstimate(indNode, metadata, cardinality, elems, true);
+            }
             crit.setNdv(ndv);
             crit.setValueExpression(indepExpr);
             

Modified: trunk/engine/src/main/java/org/teiid/query/processor/relational/AccessNode.java
===================================================================
--- trunk/engine/src/main/java/org/teiid/query/processor/relational/AccessNode.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/main/java/org/teiid/query/processor/relational/AccessNode.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -31,13 +31,16 @@
 import org.teiid.api.exception.query.QueryValidatorException;
 import org.teiid.client.plan.PlanNode;
 import org.teiid.common.buffer.BlockedException;
+import org.teiid.common.buffer.BufferManager;
 import org.teiid.common.buffer.TupleBatch;
 import org.teiid.common.buffer.TupleSource;
+import org.teiid.common.buffer.BufferManager.BufferReserveMode;
 import org.teiid.core.TeiidComponentException;
 import org.teiid.core.TeiidProcessingException;
 import org.teiid.query.QueryPlugin;
 import org.teiid.query.eval.Evaluator;
 import org.teiid.query.metadata.QueryMetadataInterface;
+import org.teiid.query.processor.ProcessorDataManager;
 import org.teiid.query.rewriter.QueryRewriter;
 import org.teiid.query.sql.lang.Command;
 import org.teiid.query.util.CommandContext;
@@ -57,6 +60,8 @@
 	private boolean isUpdate = false;
     private boolean returnedRows = false;
     protected Command nextCommand;
+    private int reserved;
+    private int schemaSize;
     
     protected AccessNode() {
 		super();
@@ -65,6 +70,13 @@
 	public AccessNode(int nodeID) {
 		super(nodeID);
 	}
+	
+	@Override
+	public void initialize(CommandContext context, BufferManager bufferManager,
+			ProcessorDataManager dataMgr) {
+		super.initialize(context, bufferManager, dataMgr);
+    	this.schemaSize = getBufferManager().getSchemaSize(getOutputElements());
+	}
 
     public void reset() {
         super.reset();
@@ -181,6 +193,10 @@
                 	//end of source
                     tupleSource.closeSource();
                     tupleSources.remove(i--);
+            		if (reserved > 0) {
+                    	reserved -= schemaSize;
+                    	getBufferManager().releaseBuffers(schemaSize);
+            		}
                     if (!processCommandsIndividually()) {
                     	registerNext();
                     }
@@ -238,6 +254,9 @@
 			}
 		}
 		tupleSources.add(getDataManager().registerRequest(getContext(), atomicCommand, modelName, connectorBindingId, getID(), limit));
+		if (tupleSources.size() > 1) {
+        	reserved += getBufferManager().reserveBuffers(schemaSize, BufferReserveMode.FORCE);
+		}
 	}
 	
 	protected boolean processCommandsIndividually() {
@@ -249,6 +268,8 @@
     }
     
 	public void closeDirect() {
+    	getBufferManager().releaseBuffers(reserved);
+    	reserved = 0;
 		super.closeDirect();
         closeSources();            
 	}

Modified: trunk/engine/src/main/java/org/teiid/query/processor/relational/DependentCriteriaProcessor.java
===================================================================
--- trunk/engine/src/main/java/org/teiid/query/processor/relational/DependentCriteriaProcessor.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/main/java/org/teiid/query/processor/relational/DependentCriteriaProcessor.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -38,6 +38,7 @@
 import org.teiid.common.buffer.BlockedException;
 import org.teiid.core.TeiidComponentException;
 import org.teiid.core.TeiidProcessingException;
+import org.teiid.query.optimizer.relational.rules.NewCalculateCostUtil;
 import org.teiid.query.processor.relational.SortUtility.Mode;
 import org.teiid.query.rewriter.QueryRewriter;
 import org.teiid.query.sql.lang.AbstractSetCriteria;
@@ -66,6 +67,10 @@
         Object nextValue;
 
         boolean isNull;
+        
+        float maxNdv = NewCalculateCostUtil.UNKNOWN_VALUE;
+        
+        boolean overMax;
     }
 
     class TupleState {
@@ -96,6 +101,21 @@
             	dvs = new DependentValueSource(sortUtility.sort());
             	for (SetState setState : dependentSetStates) {
                     setState.valueIterator = dvs.getValueIterator(setState.valueExpression);
+                    if (setState.maxNdv > 0 && setState.maxNdv < dvs.getTupleBuffer().getRowCount()) {
+                    	ValueIterator vi = dvs.getValueIterator(setState.valueExpression);
+                    	Comparable last = null;
+                    	int distinctCount = 0;
+                    	while (vi.hasNext()) {
+                    		Comparable next = (Comparable) vi.next();
+                    		if (last == null || next.compareTo(last) != 0) {
+                    			distinctCount++;
+                    		}
+                    		last = next;
+                    	}
+                    	if (distinctCount > setState.maxNdv) {
+                    		setState.overMax = true;
+                    	}
+                    }
     			}
             }
         }
@@ -177,6 +197,7 @@
                     sources.add(ts.getDepedentSetStates());
                 }
                 ts.getDepedentSetStates().add(state);
+                state.maxNdv = dsc.getMaxNdv();
             } 
         }        
     }
@@ -276,21 +297,25 @@
 
 	        	if (i == currentIndex++) {
 		
-		            boolean done = false;
+		            int doneCount = 0;
 		
-		            while (!done) {
+		            while (doneCount < source.size()) {
 		
 		                boolean isNull = false;
 		                boolean lessThanMax = true;
 		
 		                for (SetState state : source) {
+		                	if (state.overMax) {
+		                		doneCount++;
+		                		continue;
+		                	}
 		                    if (state.nextValue == null && !state.isNull) {
 		                        if (state.valueIterator.hasNext()) {
 		                            state.nextValue = state.valueIterator.next();
 		                            state.isNull = state.nextValue == null;
 		                        } else {
 		                            state.valueIterator.reset();
-		                            done = true; // should be true for each iterator from this source
+		                            doneCount++; // should be true for each iterator from this source
 		                            continue;
 		                        }
 		                    }
@@ -299,7 +324,7 @@
 		                    lessThanMax &= state.replacement.size() < maxSize * (run + 1);
 		                }
 		
-		                if (done) {
+		                if (doneCount == source.size()) {
 		                    if (!restartIndexes.isEmpty() && restartIndexes.getLast().intValue() == i) {
 		                        restartIndexes.removeLast();
 		                    }
@@ -316,7 +341,7 @@
 		                    }
 		                } else {
 		                    restartIndexes.add(i);
-		                    done = true;
+		                    break;
 		                }
 		            }
 	        	}
@@ -339,6 +364,9 @@
     }
     
     public Criteria replaceDependentCriteria(AbstractSetCriteria crit, SetState state) {
+    	if (state.overMax) {
+    		return QueryRewriter.TRUE_CRITERIA;
+    	}
     	if (state.replacement.isEmpty()) {
             // No values - return criteria that is always false
             return QueryRewriter.FALSE_CRITERIA;

Modified: trunk/engine/src/main/java/org/teiid/query/processor/relational/EnhancedSortMergeJoinStrategy.java
===================================================================
--- trunk/engine/src/main/java/org/teiid/query/processor/relational/EnhancedSortMergeJoinStrategy.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/main/java/org/teiid/query/processor/relational/EnhancedSortMergeJoinStrategy.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -136,6 +136,7 @@
     	int rowId = 0;
     	List<?> lastTuple = null;
     	boolean sortedDistinct = sorted && !state.isDistinct();
+    	int sizeHint = index.getExpectedHeight(state.getTupleBuffer().getRowCount());
     	outer: while (its.hasNext()) {
     		//detect if sorted and distinct
     		List<?> originalTuple = its.nextTuple();
@@ -153,7 +154,7 @@
     		if (!state.isDistinct()) {
     			tuple.add(keyLength - 1, rowId++);
     		}
-    		index.insert(tuple, sorted?InsertMode.ORDERED:InsertMode.NEW, state.getTupleBuffer().getRowCount());
+    		index.insert(tuple, sorted?InsertMode.ORDERED:InsertMode.NEW, sizeHint);
     	}
     	if (!sorted) {
     		index.compact();

Modified: trunk/engine/src/main/java/org/teiid/query/sql/lang/DependentSetCriteria.java
===================================================================
--- trunk/engine/src/main/java/org/teiid/query/sql/lang/DependentSetCriteria.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/main/java/org/teiid/query/sql/lang/DependentSetCriteria.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -24,6 +24,7 @@
 
 import org.teiid.core.util.EquivalenceUtil;
 import org.teiid.core.util.HashCodeUtil;
+import org.teiid.query.optimizer.relational.rules.NewCalculateCostUtil;
 import org.teiid.query.sql.LanguageVisitor;
 import org.teiid.query.sql.symbol.ContextReference;
 import org.teiid.query.sql.symbol.Expression;
@@ -46,7 +47,8 @@
     /**
      * The estimated number of distinct values for the value Expression
      */
-    private float ndv;
+    private float ndv = NewCalculateCostUtil.UNKNOWN_VALUE;
+    private float maxNdv = NewCalculateCostUtil.UNKNOWN_VALUE;
     
     /** 
      * Construct with the left expression 
@@ -61,6 +63,14 @@
     	return id;
     }
     
+    public float getMaxNdv() {
+		return maxNdv;
+	}
+    
+    public void setMaxNdv(float maxNdv) {
+		this.maxNdv = maxNdv;
+	}
+    
     public float getNdv() {
 		return ndv;
 	}
@@ -144,6 +154,7 @@
         }
         criteriaCopy.id = this.id;
         criteriaCopy.ndv = this.ndv;
+        criteriaCopy.maxNdv = this.maxNdv;
         return criteriaCopy;
     }
     

Modified: trunk/engine/src/test/java/org/teiid/common/buffer/TestSTree.java
===================================================================
--- trunk/engine/src/test/java/org/teiid/common/buffer/TestSTree.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/test/java/org/teiid/common/buffer/TestSTree.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -58,6 +58,25 @@
 		assertNull(map.insert(Arrays.asList(1, String.valueOf(1)), InsertMode.NEW, -1));
 	}
 	
+	@Test public void testUnOrderedInsert() throws TeiidComponentException {
+		BufferManagerImpl bm = BufferManagerFactory.createBufferManager();
+		bm.setProcessorBatchSize(16);
+		
+		ElementSymbol e1 = new ElementSymbol("x");
+		e1.setType(Integer.class);
+		List elements = Arrays.asList(e1);
+		STree map = bm.createSTree(elements, "1", 1);
+		
+		int size = (1<<16)+(1<<4)+1;
+		int logSize = map.getExpectedHeight(size);
+
+		for (int i = 0; i < size; i++) {
+			assertNull(map.insert(Arrays.asList(i), InsertMode.NEW, logSize));
+			assertEquals(i + 1, map.getRowCount());
+		}
+		assertTrue(5 >= map.getHeight());
+	}
+
 	@Test public void testOrderedInsert() throws TeiidComponentException {
 		BufferManagerImpl bm = BufferManagerFactory.createBufferManager();
 		bm.setProcessorBatchSize(16);

Modified: trunk/engine/src/test/java/org/teiid/query/processor/TestDependentJoins.java
===================================================================
--- trunk/engine/src/test/java/org/teiid/query/processor/TestDependentJoins.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/test/java/org/teiid/query/processor/TestDependentJoins.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -25,12 +25,14 @@
 import static org.junit.Assert.*;
 
 import java.util.Arrays;
+import java.util.HashSet;
 import java.util.List;
 
 import org.junit.Test;
 import org.teiid.core.TeiidComponentException;
 import org.teiid.core.TeiidProcessingException;
 import org.teiid.query.optimizer.TestOptimizer;
+import org.teiid.query.optimizer.TestOptimizer.ComparisonMode;
 import org.teiid.query.optimizer.capabilities.BasicSourceCapabilities;
 import org.teiid.query.optimizer.capabilities.FakeCapabilitiesFinder;
 import org.teiid.query.optimizer.capabilities.SourceCapabilities.Capability;
@@ -810,4 +812,46 @@
        TestProcessor.helpProcess(plan, dataManager, expected);
    }    
     
+    @Test public void testDependentJoinBackoff() throws Exception {
+        // Create query 
+        String sql = "SELECT pm1.g1.e1 FROM pm1.g1, pm6.g1 WHERE pm1.g1.e1=pm6.g1.e1"; //$NON-NLS-1$
+
+        // Construct data manager with data
+        FakeDataManager dataManager = new FakeDataManager();
+        sampleData4(dataManager);
+
+        FakeMetadataFacade fakeMetadata = FakeMetadataFactory.example1();
+
+        FakeMetadataFactory.setCardinality("pm1.g1", 1, fakeMetadata);
+        FakeMetadataFactory.setCardinality("pm6.g1", 1000, fakeMetadata);
+        // Plan query
+        FakeCapabilitiesFinder capFinder = new FakeCapabilitiesFinder();
+        BasicSourceCapabilities depcaps = new BasicSourceCapabilities();
+        depcaps.setCapabilitySupport(Capability.CRITERIA_IN, true);
+        depcaps.setSourceProperty(Capability.MAX_IN_CRITERIA_SIZE, new Integer(1));
+        depcaps.setCapabilitySupport(Capability.QUERY_ORDERBY, true);
+
+        BasicSourceCapabilities caps = new BasicSourceCapabilities();
+        caps.setCapabilitySupport(Capability.CRITERIA_IN, true);
+
+        capFinder.addCapabilities("pm1", caps); //$NON-NLS-1$
+        capFinder.addCapabilities("pm6", depcaps); //$NON-NLS-1$
+
+        List[] expected = new List[] {
+            Arrays.asList(new Object[] {
+                new String("b")})}; //$NON-NLS-1$
+
+        ProcessorPlan plan = TestOptimizer.helpPlan(sql, fakeMetadata, new String[] {
+        		"SELECT pm6.g1.e1 FROM pm6.g1 WHERE pm6.g1.e1 IN (<dependent values>) ORDER BY pm6.g1.e1",
+        		"SELECT pm1.g1.e1 FROM pm1.g1"
+        }, capFinder, ComparisonMode.EXACT_COMMAND_STRING);
+
+        // Run query
+        TestProcessor.helpProcess(plan, dataManager, expected);
+        
+        //note that the dependent join was not actually performed
+        assertEquals(new HashSet<String>(Arrays.asList("SELECT pm1.g1.e1 FROM pm1.g1", "SELECT pm6.g1.e1 FROM pm6.g1 ORDER BY pm6.g1.e1")), 
+        		new HashSet<String>(dataManager.getQueries()));
+    }
+    
 }

Modified: trunk/engine/src/test/java/org/teiid/query/processor/TestProcessor.java
===================================================================
--- trunk/engine/src/test/java/org/teiid/query/processor/TestProcessor.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/test/java/org/teiid/query/processor/TestProcessor.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -7650,5 +7650,5 @@
         helpProcess(plan, dataManager, expected);
     }
     
-    private static final boolean DEBUG = false;
+    private static final boolean DEBUG = true;
 }

Modified: trunk/engine/src/test/java/org/teiid/query/processor/relational/TestAccessNode.java
===================================================================
--- trunk/engine/src/test/java/org/teiid/query/processor/relational/TestAccessNode.java	2011-04-04 22:34:47 UTC (rev 3058)
+++ trunk/engine/src/test/java/org/teiid/query/processor/relational/TestAccessNode.java	2011-04-04 22:35:34 UTC (rev 3059)
@@ -61,7 +61,7 @@
         BufferManager bm = BufferManagerFactory.getStandaloneBufferManager();
         FakeDataManager dataManager = new FakeDataManager(); 
         TestProcessor.sampleData1(dataManager);
-        
+        node.setElements(command.getProjectedSymbols());
         node.initialize(context, bm, dataManager);
         node.setShouldEvaluateExpressions(true);
         // Call open()
@@ -99,6 +99,7 @@
         BufferManager bm = BufferManagerFactory.getStandaloneBufferManager();
         FakeDataManager dataManager = new FakeDataManager(); 
         TestProcessor.sampleData1(dataManager);
+        node.setElements(query.getProjectedSymbols());
         node.initialize(context, bm, dataManager);
         // Call open()
         node.open();



More information about the teiid-commits mailing list