... |
h3. The Multi-source Column |
When a model is marked as multi-source, the engine will add or use an existing column on each table to represent the source name values. In the above vdb.xml the column would return {{chicago}}, {{la}}, {{newyork}} for each of the respective sources. The name of the column defaults to SOURCE_NAME, but is configurable by setting the vdb model property multisource.columnName. If a column already exists on the table (or an IN procedure parameter) with the same name, the engine will assume that it should represent the multi-source column and it will not be used to retrieve physical data. If the multi-source column is added, it not present, the generated column will be treated as a pseudo column which is not selectable via wildcards (\* nor tbl.*). |
This allows queries like the following: |
... |
{code} |
{note}If you would like to use "SOURCE_NAME" in your transformations to control which sources are accessed or updated, you would manually need to add this extra column on your view table in the Designer. This column will not be automatically added on the source table, when you import the medata from source. A column or IN procedure parameter named source_name in multi-source mode will always be treated as the explicit form of the multi-source source_name column and will no longer be treated as an actual physical column or procedure parameter. |
{note}The pseudo column is not present in your actual metadata, and is therefore not actually added on source tables/procedures when you import the metadata. If you would like to use multi-source column in your transformations to control which sources are accessed or updated and/or want the column reported via metadata facilities, there are several options. If using Designer, you would manually add this extra column. If using Dynamic VDBs, the pseduo-column will already be available. However to make it present in the metadata, you may set the model property multisource.addColumn to true on a multi-source model or if using DDL and want to be more selective, you can manually add it to the DDL. |
{note} |
... |
h3. Stored Procedures |
A physical stored procedures requires the manual addition of a string source_name parameter to allow for specifying which source the procedure is executed on. If the source_name parameter is not added or if named parameters are used and the source_name parameter is allowed to default to a null value, then the procedure will be executed on each source and the results unioned together. |
|
It is not possible to execute procedures that required to return IN/OUT, OUT, or RETURN parameters values on more than 1 source at a time. |
A physical stored procedures requires the addition of a string in parameter matching the multi-source column name to specify which source the procedure is executed on. If the parameter is not present and defaults to a null value, then the procedure will be executed on each source. It is not possible to execute procedures that are required to return IN/OUT, OUT, or RETURN parameters values on more than 1 source. |
{code:lang=SQL|title=Example DDL} |
... |
Multi-source models can be used to quickly access data in multiple sources with homogeneous metadata. When you have multiple instances using identical schema (horizontal sharding), Teiid can help you gather data across all the instances, using "multi-source" models. In this scenario, instead of creating/importing a model for every data source, one source model is defined to represents the schema and is configured with multiple data "sources" underneath it. During runtime when a query issued against this model, the query engine analyzes the information and gathers the required data from all sources configured and gathers the results and provides in a single result. Since all sources utilize the same physical metadata, this feature is most appropriate for accessing the same source type with multiple instances.
To mark a model as multi-source, the model property supports-multi-source-bindings needs set to true in the "vdb.xml" file. The model is then allowed to define more than one source. Here is a code example showing dynamic vdb with single model with multiple sources defined.
<vdb name="vdbname" version="1"> <model visible="true" type="PHYSICAL" name="Customers" path="/Test/Customers.xmi"> <property name="supports-multi-source-bindings" value="true"/> <source name="chicago" translator-name="oracle" connection-jndi-name="chicago-customers"/> <source name="newyork" translator-name="oracle" connection-jndi-name="newyork-customers"/> <source name="la" translator-name="oracle" connection-jndi-name="la-customers"/> </model> </vdb>
Currently the tooling support for managing the multi-source feature is limited, so if you need to use this feature build the VDB as usual in the Teiid Designer and then edit the "vdb.xml" file in the VDB archive using a Text editor to add the additional sources as defined above. You must deploy a separate data source for each source defined in the xml file. |
In the above example, the VDB has a single model called Customers, that has multiple sources (chicago, newyork, and la) that define different instances of data.
When a model is marked as multi-source, the engine will add or use an existing column on each table to represent the source name values. In the above vdb.xml the column would return chicago, la, newyork for each of the respective sources. The name of the column defaults to SOURCE_NAME, but is configurable by setting the model property multisource.columnName. If a column already exists on the table (or an IN procedure parameter) with the same name, the engine will assume that it should represent the multi-source column and it will not be used to retrieve physical data. If the multi-source column is not present, the generated column will be treated as a pseudo column which is not selectable via wildcards (* nor tbl.*).
This allows queries like the following:
select * from table where SOURCE_NAME = 'newyork' update table column=value where SOURCE_NAME='chicago' delete from table where column = x and SOURCE_NAME='la' insert into table (column, SOURCE_NAME) VALUES ('value', 'newyork')
The pseudo column is not present in your actual metadata, and is therefore not actually added on source tables/procedures when you import the metadata. If you would like to use multi-source column in your transformations to control which sources are accessed or updated and/or want the column reported via metadata facilities, there are several options. If using Designer, you would manually add this extra column. If using Dynamic VDBs, the pseduo-column will already be available. However to make it present in the metadata, you may set the model property multisource.addColumn to true on a multi-source model or if using DDL and want to be more selective, you can manually add it to the DDL. |
The planner logically treats a multi-source table as if it were a view containing the union all of the respective source tables. More complex partitioning scenarios, such as heterogeneous sources or list partitioning will require the use of a Federated Optimizations#Partitioned Union.
Most of the federated optimizations available over unions are still applicable in multi-source mode. This includes aggregation pushdown/decomposition, limit pushdown, join partitioning, etc.
A physical stored procedures requires the addition of a string in parameter matching the multi-source column name to specify which source the procedure is executed on. If the parameter is not present and defaults to a null value, then the procedure will be executed on each source. It is not possible to execute procedures that are required to return IN/OUT, OUT, or RETURN parameters values on more than 1 source.
CREATE FOREIGN PROCEDURE PROC (arg1 IN STRING NOT NULL, arg2 IN STRING, SOURCE_NAME IN STRING)
CALL PROC(arg1=>'x', SOURCE_NAME=>'sourceA') EXEC PROC('x', 'y', 'sourceB')
CALL PROC(arg1=>'x') EXEC PROC('x', 'y')