... |
MongoDB is a document based "schema-less" database with it own query language - it does not map perfectly with relational concepts or the SQL query language. More and more systems are using a MongoDB kind of NOSQL store for scalability and performance. For example, applications like storing audit logs or managing web site data fits well with MongoDB, and does not require using a structural database like Oracle, Postgres ect. MongoDB uses JSON documents as its primary storage unit, and it can have additional embedded documents inside the parent document. By using embedded documents it co-locates the related information to achieves de-normalization that typically requires either duplicate data or joins to achieve in a relational database. |
To make MongoDB work with Teiid the challenge for the MongoDB translator is "How best one can to design a MongoDB store that can achieve the balance between relational and document based storage?" In our opinion the advantages of "schema-less" design are great at development time. "Schema-less" can also be a problem with migration of application versions and the ability to query and make use of returned information effectively. |
Since it is hard and may be impossible in certain situations to derive a schema based on existing the MongoDB collection(s), Teiid approaches the problem in reverse compared to other translators. When working with MongoDB, Teiid requires the user to define the MongoDB schema upfront using Teiid metadata. Since Teiid only allows relational schema as its metadata, the user needs to define their MongoDB schema in relational terms using tables, procedures, and functions. For the purposes of MongoDB, the Teiid metadata has been extended to support extension properties that can be defined on the table to convert it into a MongoDB based document. These extension properties let users define, how a MongoDB document is structured and stored. Based on the relationships (primary-key, foreign-key) defined on a table, and the cardinality (ONE-to-ONE, ONE-to-MANY, MANY-to-ONE) relations between tables are mapped such that related information can be embedded along with the parent document for co-location (see the de-normalization comment above). Thus a relational schema based design, but document based storage in MongoDB. Currently direct mapping of MANY-to-MANY is not supported. |
... |
{warning:title=Usage} |
Please note a given table can contain "teiid_mongo:EMBEDDABLE" property or "teiid_mongo:MERGE" defining the type of nesting in the MongoDB. Table is not allowed to have both properties at sametime. |
Please note a given table can contain either the "teiid_mongo:EMBEDDABLE" property or the "teiid_mongo:MERGE" property defining the type of nesting in MongoDB. A table is not allowed to have both properties. |
{warning} |
... |
The MongoDB translator provides a relational view of data that resides in a MongoDB database. This translator is capable of converting Teiid SQL queries into MongoDB based queries. It supports a full range of SELECT, INSERT, UPDATE and DELETE calls.
MongoDB is a document based "schema-less" database with it own query language - it does not map perfectly with relational concepts or the SQL query language. More and more systems are using a MongoDB kind of NOSQL store for scalability and performance. For example, applications like storing audit logs or managing web site data fits well with MongoDB, and does not require using a structural database like Oracle, Postgres ect. MongoDB uses JSON documents as its primary storage unit, and it can have additional embedded documents inside the parent document. By using embedded documents it co-locates the related information to achieves de-normalization that typically requires either duplicate data or joins to achieve in a relational database.
To make MongoDB work with Teiid the challenge for the MongoDB translator is "How to design a MongoDB store that can achieve the balance between relational and document based storage?" In our opinion the advantages of "schema-less" design are great at development time. "Schema-less" can also be a problem with migration of application versions and the ability to query and make use of returned information effectively.
Since it is hard and may be impossible in certain situations to derive a schema based on existing the MongoDB collection(s), Teiid approaches the problem in reverse compared to other translators. When working with MongoDB, Teiid requires the user to define the MongoDB schema upfront using Teiid metadata. Since Teiid only allows relational schema as its metadata, the user needs to define their MongoDB schema in relational terms using tables, procedures, and functions. For the purposes of MongoDB, the Teiid metadata has been extended to support extension properties that can be defined on the table to convert it into a MongoDB based document. These extension properties let users define, how a MongoDB document is structured and stored. Based on the relationships (primary-key, foreign-key) defined on a table, and the cardinality (ONE-to-ONE, ONE-to-MANY, MANY-to-ONE) relations between tables are mapped such that related information can be embedded along with the parent document for co-location (see the de-normalization comment above). Thus a relational schema based design, but document based storage in MongoDB. Currently direct mapping of MANY-to-MANY is not supported.
The above may not satisfy every user's needs. The document structure in MongoDB can be more complex than what Teiid can currently define. We hope this will eventually catch up in future versions of Teiid. This is currently designed for:
1. Users that are using relational databases and would like to move/migrate their data to MongoDB to take advantages scaling and performance with out modifying the end user applications currently running.
2. Users that are starting out with MongoDB and do not have experience with MongoDB, but are seasoned SQL developers. This provides a low barrier of entry compared to using MongoDB directly as an application developer.
3. Integrate other enterprise data sources with MongoDB based data.
The name of the translator to use in vdb.xml is "mongodb".
For example:
<vdb name="nothwind" version="1"> <model name="northwind"> <source name="local" translator-name="mongodb" connection-jndi-name="java:/mongoDS"/> </model> <vdb>
The translator does not provide a connection to the MongoDB. For that purpose, Teiid has a JCA adapter that provides a connection to MongoDB using the MongoDB Java Driver. To define such connector, use the following XML fragment in standalone-teiid.xml. See a example in "<jboss-as>/docs/teiid/datasources/mongodb"
<resource-adapters> <resource-adapter id="mongodb"> <module slot="main" id="org.jboss.teiid.resource-adapter.mongodb"/> <transaction-support>NoTransaction</transaction-support> <connection-definitions> <connection-definition class-name="org.teiid.resource.adapter.mongodb.MongoDBManagedConnectionFactory" jndi-name="java:/mongoDS" enabled="true" use-java-context="true" pool-name="teiid-mongodb-ds"> <!-- MongoDB server list (host:port[;host:port...]) --> <config-property name="RemoteServerList">localhost:27017</config-property> <!-- Database Name in the MongoDB --> <config-property name="Database">test</config-property> <!-- Uncomment these properties to supply user name and password <config-property name="Username">user</config-property> <config-property name="Password">user</config-property> --> </connection-definition> </connection-definitions> </resource-adapter> </resource-adapters>
The above defines the translator and connector. However as mentioned the MongoDB translator can NOT derive the metadata based on existing document collections - the user MUST define the metadata. For example, you can define a schema using DDL:
<vdb name="nothwind" version="1"> <model name="northwind"> <source name="local" translator-name="mongodb" connection-jndi-name="java:/mongoDS"/> <metadata type="DDL"><![CDATA[ CREATE FOREIGN TABLE Customer ( customer_id integer, FirstName varchar(25), LastName varchar(25), ) OPTIONS(UPDATABLE 'TRUE'); ]]> </metadata> </model> <vdb>
when INSERT operation below executed against table using Teiid,
INSERT INTO Customer(customer_id, FirstName, LastName) VALUES (1, 'John', 'Doe');
MongoDB translator will create a below document in the MongoDB
{ _id: ObjectID("509a8fb2f3f4948bd2f983a0"), customer_id: 1, FirstName: "John", LastName: "Doe" }
If a PRIMARY KEY is defined on the table as
CREATE FOREIGN TABLE Customer ( customer_id integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25), ) OPTIONS(UPDATABLE 'TRUE');
then that column name is automatically used as "_id" field in the MongoDB collection, then document structure is stored in the MongoDB as
{ _id: 1, FirstName: "John", LastName: "Doe" }
If you defined the composite PRIMARY KEY on Customer table as
CREATE FOREIGN TABLE Customer ( customer_id integer, FirstName varchar(25), LastName varchar(25), PRIMARY KEY (FirstName, LastName) ) OPTIONS(UPDATABLE 'TRUE');
the document structure will be
{ _id: { FirstName: "John", LastName: "Doe" }, customer_id: 1, }
MongoDB translator supports automatic mapping of Teiid data types into MongoDB data types, including the support for Blobs, Clobs and XML. The LOB support is based on GridFS in MonogoDB. MongoDB Arrays, Regular Expressions, MongoDB::Code, MongoDB::MinKey, MongoDB::MaxKey, MongoDB::OID is not currently supported.
Using the above DDL or any other metadata facility, a user can map a table in a relational store into a document in MongoDB, however to make effective use of MongoDB, you need to be able to build complex documents, that can co-locate related information, so that data can queried in a single MongoDB query. Otherwise, since MongoDB does not support join relationships like relational database, you need to issue multiple queries to retrieve and join data manually. The power of MongoDB comes from its "embedded" documents and its support of complex data types like arrays and use of the aggregation framework to be able to query them. This translator provides way to achieve that goals.
When you do not define the complex embedded documents in MongoDB, Teiid can step in for join processing and provide that functionality, however if you want to make use of the power of MongoDB itself in querying the data and avoid bringing the unnecessary data and improve performance, you need to look into building these complex documents.
MongoDB translator defines two additional metadata properties along with other Teiid metadata properties to aid in building the complex "embedded" documents. You can use the following metadata properties in your DDL.
Using the above properties and FOREIGN KEY relationships, we will illustrate how to build complex documents in MongoDB.
DBRefs All the key references are automatically stored as "DBRef" values. The examples below were stripped for clarity reasons. |
Usage Please note a given table can contain either the "teiid_mongo:EMBEDDABLE" property or the "teiid_mongo:MERGE" property defining the type of nesting in MongoDB. A table is not allowed to have both properties. |
If your current DDL structure representing ONE-2-ONE relationship is like
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25), ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Address ( CustomerId integer, Street varchar(50), City varchar(25), State varchar(25), Zipcode varchar(6), FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId), ) OPTIONS(UPDATABLE 'TRUE');
by default, this will produce two different collections in MongoDB, like with sample data it will look like
Customer { _id: 1, FirstName: "John", LastName: "Doe" } Address { _id: ObjectID("..."), CustomerId: 1, Street: "123 Lane" City: "New York", State: "NY" Zipcode: "12345" }
You can enhance the storage in MongoDB to a single collection by using "teiid_mongo:MERGE' extension property on the table's OPTIONS clause
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25), ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Address ( CustomerId integer, Street varchar(50), City varchar(25), State varchar(25), Zipcode varchar(6), FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId), ) OPTIONS(UPDATABLE 'TRUE', teiid_mongo:MERGE 'Customer');
this will produce single collection in MongoDB, like
Customer { _id: 1, FirstName: "John", LastName: "Doe", Address: { _id: ObjectID("..."), CustomerId: 1, Street: "123 Lane" City: "New York", State: "NY" Zipcode: "12345" } }
With the above both tables are merged into a single collection that can be queried together using the JOIN clause in the SQL command. Since the existence of child/additional record has no meaning with out parent table using the "teiid_mongo:MERGE" extension property is right choice in this situation.
Typically there can be more than two (2) tables involved in this relationship. If MANY side is only associated single table, then use "teiid_mongo:MERGE" property on MANY side of table and define it as the parent. If associated with with more than single table then use "teiid_mongo:EMBEDDABLE".
For example if you have DDL like
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25), ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, CustomerId integer, OrderDate date, Status integer, FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) ) OPTIONS(UPDATABLE 'TRUE');
in the above a Single Customer can have MANY Orders. There are two options to define the how we store the MongoDB document. If in your schema, the Customer table's CustomerId is only referenced in Order table (i.e. Customer information used for only Order purposes), you can use
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25), ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, CustomerId integer, OrderDate date, Status integer, FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) ) OPTIONS(UPDATABLE 'TRUE', teiid_mongo:MERGE 'Customer');
that will produce a single document for Customer table like
{ _id: 1, FirstName: "John", LastName: "Doe", Order: [ { _id: 100, CustomerId: 1, OrderDate: ISODate("2000-01-01T06:00:00Z") Status: 2 }, { _id: 101, CustomerId: 1, OrderDate: ISODate("2001-03-06T06:00:00Z") Status: 5 } ... ] }
If Customer table is referenced in more tables other than Order table, then use "teiid_mongo:EMBEDDABLE" property
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25), ) OPTIONS(UPDATABLE 'TRUE', teiid_mongo:EMBEDDABLE 'TRUE'); CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, CustomerId integer, OrderDate date, Status integer, FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Comments ( CommentID integer PRIMARY KEY, CustomerId integer, Comment varchar(140), FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) ) OPTIONS(UPDATABLE 'TRUE');
This creates three different collections in MongoDB.
Customer { _id: 1, FirstName: "John", LastName: "Doe" } Order { _id: 100, CustomerId: 1, OrderDate: ISODate("2000-01-01T06:00:00Z") Status: 2 Customer: { _id: 1, FirstName: "John", LastName: "Doe" } } Comment { _id: 12, CustomerId: 1, Comment: "This works!!!" Customer: { _id: 1, FirstName: "John", LastName: "Doe" } }
Here as you can see the Customer table contents are embedded along with other table's data where they were referenced. This creates duplicated data where multiple of these embedded documents are managed automatically in the MongoDB.
All the SELECT, INSERT, DELETE operations that are generated against the tables with "teiid_mongo:EMBEDDABLE" property are atomic, except for UPDATES, as there can be multiple operations involved to update all the copies. Since there are no transactions in MongoDB, Teiid plans to provide automatic compensating transaction framework around this in future releases. |
This is same as ONE-2-MANY, see above to define relationships.
A parent table can have multiple "embedded" and as well as "merge" documents inside it, it not limited so either one or other. the document size is only limited by the MongoDB document size, which is 16MB. |
This can also mapped with combination of "teiid_mongo:MERGE" and "teiid_mongo:EMBEDDABLE" properties (partially). For example if DDL looks like
CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, OrderDate date, Status integer, ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE OrderDetail ( OrderID integer, ProductID integer, PRIMARY KEY (OrderID,ProductID), FOREIGN KEY (OrderID) REFERENCES Order (OrderID), FOREIGN KEY (ProductID) REFERENCES Product (ProductID), ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Products ( ProductID integer PRIMARY KEY, ProductName varchar(40), ) OPTIONS(UPDATABLE 'TRUE');
you modify the DDL like below, to have
CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, OrderDate date, Status integer, ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE OrderDetail ( OrderID integer, ProductID integer, PRIMARY KEY (OrderID,ProductID), FOREIGN KEY (OrderID) REFERENCES Order (OrderID), FOREIGN KEY (ProductID) REFERENCES Product (ProductID), ) OPTIONS(UPDATABLE 'TRUE', teiid_mongo:MERGE 'Order'); CREATE FOREIGN TABLE Products ( ProductID integer PRIMARY KEY, ProductName varchar(40), ) OPTIONS(UPDATABLE 'TRUE', teiid_mongo:EMBEDDABLE 'TRUE');
That will produce a document like
{ _id : 10248, OrderDate : ISODate("1996-07-04T05:00:00Z"), Status : 5 OrderDetails : [ { _id : { OrderID : 10248, ProductID : 11 Products : { ProductID: 11 ProductName: "Hammer" } } }, { _id : { OrderID : 10248, ProductID : 14 Products : { ProductID: 14 ProductName: "Screw Driver" } } } ] } Products { { ProductID: 11 ProductName: "Hammer" } { ProductID: 14 ProductName: "Screw Driver" } }
Limitations
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25), ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, CustomerId integer, OrderDate date, Status integer, FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) OPTIONS (allows-join 'FALSE') ) OPTIONS(UPDATABLE 'TRUE'); with the example above, Teiid will create two collections, however when user issues query such as SELECT OrderID, LastName FROM Order JOIN Customer ON Order.CustomerId = Customer.CustomerId; instead of resulting in error, the JOIN processing will happen in the Teiid engine, without the above property it will result in an error. |
When you use above properties and carefully design the MongoDB document structure, Teiid translator can intelligently collate data based on their co-location and take advantage of it while querying.
MongoDB translator designed on top of the MongoDB aggregation framework, use of MongoDB version that supports this framework is mandatory. Apart from SELECT queries, this translator also supports INSERT, UPDATE and DELETE queries.
This translator supports
example For a full example see https://github.com/teiid/teiid/blob/master/connectors/translator-mongodb/src/test/resources/northwind.ddl |
MongoDB source procedures may be created using the teiid_rel:native-query extension - see Parameterizable Native Queries. The procedure will invoke the native-query similar to a direct procedure call with the benefits that the query is predetermined and that result column types are known, rather than requiring the use of ARRAYTABLE or similar functionality.
This feature is turned off by default because of the security risk this exposes to execute any command against the source. To enable this feature, override the execution property called SupportsDirectQueryProcedure to true. |
By default the name of the procedure that executes the queries directly is called native. Override the execution property DirectQueryProcedureName to change it to another name. |
The MongoDB translator provides a procedure to execute any ad-hoc aggregate query directly against the source without Teiid parsing or resolving. Since the metadata of this procedure's results are not known to Teiid, they are returned as an object array containing single blob at array location one(1). This blob contains the JSON document. XMLTABLE can be used construct tabular output for consumption by client applications.
select x.* from TABLE(call native('city;{$match:{"city":"FREEDOM"}}')) t, xmltable('/city' PASSING JSONTOXML('city', cast(array_get(t.tuple, 1) as BLOB)) COLUMNS city string, state string) x
In the above example, a collection called "city" is looked up with filter that matches the "city" name with "FREEDOM", using "native" procedure and then using the nested tables feature the output is passed to a XMLTABLE construct, where the output from the procedure is sent to a JSONTOXML function to construct a XML then the results of that are exposed in tabular form.
The direct query MUST be in the format
"collectionName;{$pipeline instr}+" |