[teiid-commits] teiid SVN: r2470 - in trunk: documentation/client-developers-guide/src/main/docbook/en-US/content and 3 other directories.

Tue Aug 17 12:26:30 EDT 2010

Author: shawkins
Date: 2010-08-17 12:26:28 -0400 (Tue, 17 Aug 2010)
New Revision: 2470

Modified:
   trunk/documentation/caching-guide/src/main/docbook/en-US/content/cachehint.xml
   trunk/documentation/caching-guide/src/main/docbook/en-US/content/matviews.xml
   trunk/documentation/caching-guide/src/main/docbook/en-US/content/overview.xml
   trunk/documentation/client-developers-guide/src/main/docbook/en-US/content/jdbc-connection.xml
   trunk/documentation/reference/src/main/docbook/en-US/content/system_schema.xml
   trunk/engine/src/main/java/org/teiid/dqp/internal/datamgr/ConnectorWorkItem.java
   trunk/engine/src/test/java/org/teiid/common/queue/TestThreadReuseExecutor.java
Log:
TEIID-168 completed the matview doc.

Modified: trunk/documentation/caching-guide/src/main/docbook/en-US/content/cachehint.xml
===================================================================

--- trunk/documentation/caching-guide/src/main/docbook/en-US/content/cachehint.xml	2010-08-17 16:09:54 UTC (rev 2469)
+++ trunk/documentation/caching-guide/src/main/docbook/en-US/content/cachehint.xml	2010-08-17 16:26:28 UTC (rev 2470)
@@ -54,7 +54,16 @@
 	 		<para>When the ttl is specified in the cache hint, a full refresh of the materialized view will be triggered automatically after the specified time interval.  
 	 		The refresh is equivalent to <code>CALL SYS.refreshMatView('view name', false)</code>, but performed asynchronously so that user queries do not block on the load.
 	 		</para>
- 			<para>The automatic loading is not intended for complex loading scenarios, as nested materialized views will be used by the refresh query.</para>
+	 		<section>
+	 			<title>Limitations</title>
+	 			<itemizedlist>
+			   		<listitem><para>The automatic ttl refresh is not intended for complex loading scenarios, as nested materialized views will be used by the refresh query.
+			   		</para></listitem>
+			   		<listitem><para>The automatic ttl refresh is performed lazily; that is, it is only triggered by using the table after the ttl has expired.  
+			   		For infrequently used tables with long load times, this means that data may be used well past the intended ttl. 
+			   		</para></listitem>
+			  	</itemizedlist>
+	 		</section>
  		</section>
  		<section>
  			<title>Updatable</title>

Modified: trunk/documentation/caching-guide/src/main/docbook/en-US/content/matviews.xml
===================================================================
--- trunk/documentation/caching-guide/src/main/docbook/en-US/content/matviews.xml	2010-08-17 16:09:54 UTC (rev 2469)
+++ trunk/documentation/caching-guide/src/main/docbook/en-US/content/matviews.xml	2010-08-17 16:26:28 UTC (rev 2470)
@@ -51,8 +51,8 @@
 			<programlisting>SELECT * from vg1, vg2, vg3 WHERE … OPTION NOCACHE vg1, vg3</programlisting>
 			<para>Only the vg1 and vg3 caches will be skipped vg2 or any materialized views nested under vg1 and vg3 will be used.</para>
 		</example>
-		<para>The materialization override option may be specified in virtual
-			group transformation definitions.  In that way, transformations can
+		<para>Option NOCACHE may be specified in virtual
+			group transformation queries.  In that way, transformations can
 			specify to always use real-time data obtained directly from a source.
 			 The use of caching and non-caching can be mixed in transformation
 			definitions, just as with user queries.</para>
@@ -67,9 +67,9 @@
 		<note><para>It is important to ensure that all key/index information is present as these will be used by the materialization process to enhance the performance of the materialized table.</para></note>
 		<para>The target materialized table may also be set in the properties.  If the value is left blank, the default, then internal materialization will be used.  
 		Otherwise for external materialization, the value should reference the fully qualified name of a table (or possibly view) with the same columns as the materialized view.
-		The choice between internal and external materialization should be made based upon:
+		For most basic scenarios the simplicity of internal materialization makes it the more appealing option. Other considerations for choosing between internal and external materialization are:
 		<itemizedlist>
-			<listitem><para>Does the cached data need to be fully persistent?  If yes, then external materialization should be used.  
+			<listitem><para>Does the cached data need to be fully durable?  If yes, then external materialization should be used.  
 			Internal materialization should not survive a cluster restart.</para>
 			</listitem>
 			<listitem><para>Is full control needed of loading and refresh?  If yes, then external materialzation should be used.  
@@ -77,7 +77,6 @@
 			</listitem>
 		</itemizedlist>
 		</para>
-		
 	</section>
 	<section>
 		<title>External Materialization</title>
@@ -88,6 +87,74 @@
 			policy for clearing and managing the cache.  These policies will be
 			defined and enforced by administrators of the Teiid system.
 		</para>
-		<para></para>
+		<orderedlist>
+			<title>Typical Usage Steps</title>
+			<listitem>
+				<para>Create materialized views and corresponding physical materialized target tables in Designer.  This can be done through setting the materialized and target table manually, 
+				or by selecting the desired views, right clicking, then selecting Modeling->"Create Materialized Views"</para>
+			</listitem>
+			<listitem>
+				<para>Generate the DDL for your physical model materialization target tables.  This can be done by selecting the model, right clicking, then choosing Export->"Metadata Modeling"->"Data Definition Language (DDL) File".
+				This script can be used to create the desired schema for your materialization target on whatever source you choose.
+				</para>
+			</listitem>
+			<listitem>
+				<para>Determine a load and refresh strategy.  With the schema created the most simplistic approach is to just load the data.  
+				The load can even be done through Teiid with <code>insert into target_table select * from matview option nocache</code>.
+				That however may be too simplistic because your index creation may be more performant if deferred until after the table has been created.  
+				Also, full snapshot refreshes are best done to a staging table, which is then swapped for the existing physical table, to ensure that the refresh
+				 does not impact user queries and to ensure that the table is valid prior to use.</para>
+			</listitem>
+		</orderedlist>
 	</section>
+	<section>
+		<title>Internal Materialization</title>
+		<para>Internal materialization creates Teiid temporary tables to hold the materialized table.  While these tables are not fully durable, they perform 
+		well in most circumstances and the data is present at each Teiid instance which removes the single point of failure and network overhead of an external database. 
+		Internal materialization also provides more built-in facilities for refreshing and monitoring.</para>
+		<section>
+			<title>Loading And Refreshing</title>
+			<para>An internal materialized view table is initially in an invalid state (there is no data).  The first user query will trigger an implicit loading of the data.  
+			All other queries against the materialized view will block until the load completes.
+			In some situations administrators may wish to better control when the cache is loaded with a call to <code>SYS.refreshMatView</code>.  The initial load may itself trigger the initial load
+			of dependent materialized views.  After the initial load user queries against the materialized view table will only block if it is in an invalid state.
+			The valid state may also be controlled through the <code>SYS.refreshMatView</code> procedure.
+			<example>
+				<title>Invalidating Refresh</title>
+				<programlisting>CALL SYS.refreshMatView(viewname=>'schema.matview', invalidate=>true)</programlisting>
+				<para>matview will be refreshed and user queries will block until the refresh is complete (or fails).</para>
+			</example>
+			While the initial load may trigger a transitive loading of dependent materialized views, 
+			subsequent refreshes performed with <code>refreshMatView</code> will use dependent materialized view tables if they exist.  Only one load may occur at a time.  If a load is already in progress when
+			the <code>SYS.refreshMatView</code> procedure is called, it will return -1 immediately rather than preempting the current load.
+			</para>
+			<para>The <link linkend="cache-hint">cache hint</link> may be used to automatically trigger a full snapshot refresh after a specified time to live.
+			<example>
+				<title>Auto-refresh Transformation Query</title>
+				<programlisting>/*+ cache(ttl:3600000) */ select t.col, t1.col from t, t1 where t.id = t1.id</programlisting>
+			</example>
+			</para>
+			<para>In advanced use-cases the <link linkend="cache-hint">cache hint</link> may also be used to mark an internal materialized view as updatable.
+			An updatable internal materialized view may use the <code>SYS.refreshMatViewRow</code> procedure to update a single row in the materialized table.
+			To be updatable the materialized view must have a single column primary key.  Composite keys are not yet supported by <code>SYS.refreshMatViewRow</code>.
+			<example>
+				<title>Updatable Scenario</title>
+				<para>Transformation Query: <programlisting>/*+ cache(updatable) */ select t.col, t1.col from t, t1 where t.id = t1.id</programlisting></para>
+				<para>Update: <programlisting>CALL SYS.refreshMatViewRow(viewname=>'schema.matview', key=>5)</programlisting></para>
+				<para>Given that the schema.matview defines integer column col as its primary key, the update will check the live source(s) for the row values.  
+				If it exists, the materialized view table row will be updated.  If it does not exist, the corresponding row will be deleted.</para>
+			</example>
+			The update query will not use dependent materialized view tables, so care should be taken to ensure that getting a single 
+			row from this transformation query performs well.  This may require the use of dependent join hints.
+			When the updatable option is not specified, accessing the materialized view table is more efficient because modifications do not need to be considered.  
+ 			Therefore, only specify the updatable option if row based incremental updates are needed.  Even when performing row updates, full snapshot refreshes may be needed to ensure consistency.
+			</para>
+		</section>
+		<section>
+			<title>Limitations</title>
+			<itemizedlist>
+				<listitem><para>Secondary index information is currently not used.  An index is only created for the primary key.</para></listitem>
+			</itemizedlist>
+		</section>
+	</section>
 </chapter>
\ No newline at end of file

Modified: trunk/documentation/caching-guide/src/main/docbook/en-US/content/overview.xml
===================================================================
--- trunk/documentation/caching-guide/src/main/docbook/en-US/content/overview.xml	2010-08-17 16:09:54 UTC (rev 2469)
+++ trunk/documentation/caching-guide/src/main/docbook/en-US/content/overview.xml	2010-08-17 16:26:28 UTC (rev 2470)
@@ -9,5 +9,6 @@
 		 These techniques can be used to significantly improve performance in many
 		situations.</para>
 	<para>With the exception of external materialized views, the cached data is accessed through the BufferManager.  
-	For better performance, the BufferManager setting should be adjusted to the memory constraints of your installation.</para>
+	For better performance the BufferManager setting should be adjusted to the memory constraints of your installation.  
+	See the Admin Guide for more on parameter tuning.</para>
 </chapter>

Modified: trunk/documentation/client-developers-guide/src/main/docbook/en-US/content/jdbc-connection.xml
===================================================================
--- trunk/documentation/client-developers-guide/src/main/docbook/en-US/content/jdbc-connection.xml	2010-08-17 16:09:54 UTC (rev 2469)
+++ trunk/documentation/client-developers-guide/src/main/docbook/en-US/content/jdbc-connection.xml	2010-08-17 16:26:28 UTC (rev 2470)
@@ -486,27 +486,29 @@
               
     </section>
     
-    <!-- <section id="multiple_hosts">
+    <section id="multiple_hosts">
         <title>Using Multiple Hosts</title>
-        <para>When Teiid Server is deployed on multiple servers for scalbility, then your application that using
-        Teiid JDBC API can automatically use all Teiid Servers in that group. To enable this feature the client needs 
-        to specify multiple host name and port number combinations on the URL connection string. The client will randomly pick any one the Teiid server from the list and will have session 
-        established with that server.  If the "autoFailover" connection property is set to "true", a failure with the connected server will cause the client to automatically failover 
+        <para>A group of Teiid Servers in the same AS cluster may be connected using connection time failover and load-balancing.
+        To enable this feature the client needs to specify multiple host name and port number combinations on the URL connection string. 
+        The client will randomly pick one of the Teiid servers from the list and establish a session with that server.  
+        If that server cannot be contacted, then a connection will be attempted to each of the remaining servers in random order.</para>   
+        
+        <!-- If the "autoFailover" connection property is set to "true", a failure with the connected server will cause the client to automatically failover 
         to other available servers.  Even if autoFailover is not set, when using a managed DataSource based connection, the connection will randomly select a new server instance when it is returned to the pool.</para>
-        
+        -->
         <example><title>Example URL connection string</title><programlisting><![CDATA[jdbc:teiid:&lt;vdb-name&gt;@mm://host1:31000, host1:31001, host2:31000;version=2]]></programlisting></example>        
         
-        <para>Currently when the fail over happens, the user is re-authenticated with the new server. The clustering 
+        <!-- <para>Currently when the fail over happens, the user is re-authenticated with the new server. The clustering 
         feature coming up in the Teiid 7.1 release will define how the transparent session fail over will occur with out the 
         re-authentication.</para>
         
         <para>You can also use this feature to distribute the query load among various avaialble Teiid Servers available. 
         Load balancing happens automatically, when you are using a data source along with connection pooling. Each time a connection is 
         grabbed from the pool, it will randomly select a Teiid Server to distribute the load. Note, that load balacing feature 
-        is not avaialble if you are using Teiid Driver to make your connection.</para>
+        is not avaialble if you are using Teiid Driver to make your connection.</para>-->
         
         <para>If you are using DataSource to connect to Teiid Server, use "AlternateServers" property/method to define the failover servers.
         Check out the Javadoc on the format of the string.</para>
-    </section> -->
+    </section>
     
 </chapter>
\ No newline at end of file

Modified: trunk/documentation/reference/src/main/docbook/en-US/content/system_schema.xml
===================================================================
--- trunk/documentation/reference/src/main/docbook/en-US/content/system_schema.xml	2010-08-17 16:09:54 UTC (rev 2469)
+++ trunk/documentation/reference/src/main/docbook/en-US/content/system_schema.xml	2010-08-17 16:26:28 UTC (rev 2470)
@@ -1776,7 +1776,7 @@
 							<para>(string ViewName, boolean Invalidate)</para>
 						</entry>
 						<entry>
-							<para>An return return value, RowsUpdated.  See the Caching Guide for more.</para>
+							<para>A return value, RowsUpdated. -1 indicates a load is in progress, otherwise the cardinality of the table is returned.  See the Caching Guide for more.</para>
 						</entry>
 					</row>
 					<row>
@@ -1787,7 +1787,7 @@
 							<para>(string ViewName, object Key)</para>
 						</entry>
 						<entry>
-							<para>An return return value, RowsUpdated.  See the Caching Guide for more.</para>
+							<para>A return value, RowsUpdated. -1 indicates the materialized table is currently invalid. 0 indicates that the specified row did not exist in the live data query or in the materialized table.  See the Caching Guide for more.</para>
 						</entry>
 					</row>
 				</tbody>

Modified: trunk/engine/src/main/java/org/teiid/dqp/internal/datamgr/ConnectorWorkItem.java
===================================================================
--- trunk/engine/src/main/java/org/teiid/dqp/internal/datamgr/ConnectorWorkItem.java	2010-08-17 16:09:54 UTC (rev 2469)
+++ trunk/engine/src/main/java/org/teiid/dqp/internal/datamgr/ConnectorWorkItem.java	2010-08-17 16:26:28 UTC (rev 2470)
@@ -287,7 +287,9 @@
             		this.lastBatch = true;
             		break;
             	}
-            	
+            	if (row.size() != this.schema.length) {
+            		throw new AssertionError("Improper results returned.  Expected " + this.schema.length + " columns, but was " + row.size()); //$NON-NLS-1$ //$NON-NLS-2$
+        		}
             	this.rowCount += 1;
             	batchSize++;
             	if (this.procedureBatchHandler != null) {
@@ -358,6 +360,7 @@
 				}
 				if (value == result && !DataTypeManager.DefaultDataClasses.OBJECT.equals(this.schema[index])) {
 					convertToRuntimeType.remove(i);
+					continue;
 				}
 				row.set(index, result);
 			}
@@ -375,6 +378,7 @@
 					}
 					if (value == result) {
 						convertToDesiredRuntimeType[i] = false;
+						continue;
 					}
 					row.set(i, result);
 				}

Modified: trunk/engine/src/test/java/org/teiid/common/queue/TestThreadReuseExecutor.java
===================================================================
--- trunk/engine/src/test/java/org/teiid/common/queue/TestThreadReuseExecutor.java	2010-08-17 16:09:54 UTC (rev 2469)
+++ trunk/engine/src/test/java/org/teiid/common/queue/TestThreadReuseExecutor.java	2010-08-17 16:26:28 UTC (rev 2470)
@@ -73,7 +73,7 @@
         	pool.execute(new FakeWorkItem(SINGLE_WAIT));
             
             try {
-                Thread.sleep(SINGLE_WAIT*2);
+                Thread.sleep(SINGLE_WAIT*3);
             } catch(InterruptedException e) {                
             }
         }
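
The connection-time failover described in the multiple_hosts section of jdbc-connection.xml above (pick a server at random, then try each remaining server in random order) can be sketched roughly as follows. This is an illustration only, not the Teiid client implementation: the class name FailoverSketch, the method pickHost, and the canConnect predicate are all hypothetical stand-ins for whatever the driver actually does when it opens a socket.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Hypothetical sketch of connection-time failover; not Teiid code.
public class FailoverSketch {

    /**
     * Tries the given host:port entries in random order and returns the
     * first one the canConnect predicate accepts, or null if none does.
     * canConnect is an assumed stand-in for a real connection attempt.
     */
    public static String pickHost(List<String> hosts, Predicate<String> canConnect, Random random) {
        // Copy before shuffling so the caller's list is left untouched.
        List<String> candidates = new ArrayList<>(hosts);
        Collections.shuffle(candidates, random);
        for (String host : candidates) {
            if (canConnect.test(host)) {
                return host;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Host list taken from the example URL in the diff above.
        List<String> hosts = Arrays.asList("host1:31000", "host1:31001", "host2:31000");
        // Pretend that only host2 is reachable.
        String chosen = pickHost(hosts, h -> h.startsWith("host2"), new Random());
        System.out.println(chosen);
    }
}
```

Shuffling the whole list once covers both the random first pick and the random order of the remaining attempts, which is why no separate retry step appears in the sketch.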