[ https://issues.jboss.org/browse/TEIID-2884?page=com.atlassian.jira.plugin... ]
Kylin Soong edited comment on TEIID-2884 at 8/3/15 5:30 AM:
------------------------------------------------------------
Amazon EMR is a platform that integrates the Hadoop ecosystem products into Amazon's
cloud for Big Data processing, analysis, ETL, etc.
For Hadoop ecosystem products like HBase, Hive, Spark, Impala, etc., we already have
relevant translators, and Amazon EMR also supplies compatible drivers, downloadable
images, documentation, etc.:
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HiveJDB...
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/impala-...
I think the Teiid translators can work well against these products in Amazon EMR (not
tested so far).
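A rough way to check this against a running cluster could look like the sketch below. It is not tested against EMR. The host (localhost, assuming an SSH tunnel to the master node), port 10000, the "default" database, the "hadoop" user, and the class and method names are all assumptions for illustration, and the Hive JDBC driver has to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

class EmrHiveCheck {

    // Builds a HiveServer2 JDBC URL; host, port and database are placeholders.
    static String hiveUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Returns true if a connection opens and a trivial statement executes.
    // Returns false on any SQLException, including a missing JDBC driver.
    static boolean canConnect(String url, String user) {
        try (Connection c = DriverManager.getConnection(url, user, "");
             Statement s = c.createStatement()) {
            s.execute("SHOW TABLES");
            return true;
        } catch (SQLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed endpoint: an SSH tunnel forwarding local port 10000 to the cluster.
        String url = hiveUrl("localhost", 10000, "default");
        System.out.println(url + " reachable: " + canConnect(url, "hadoop"));
    }
}
```

If this reports the endpoint as reachable, the same URL and driver could then be tried in a data source used by the existing Hive translator.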
Besides integrating the Hadoop ecosystem products, Amazon EMR adds some more features;
for example, EMR can use Amazon S3 to store input data, log files, and output data.
For S3, I do wonder whether we need to develop a translator. What is mainly stored in S3
is large picture files, video files and big text files, so it seems meaningless to develop
a JDBC-based translator on top of it. Furthermore, IMO, the work of integrating S3 with
enterprise applications should belong to the ESB; there are already some implementations:
* Camel S3 Component (http://camel.apache.org/aws-s3.html)
* Mulesoft S3 Connector (https://www.mulesoft.org/connectors/amazon-simple-storage-servi...)
* RSSBus/CData S3 (https://www.rssbus.com/solutions/s3/)
Any ideas or advice on this issue?
Support for Amazon Elastic MapReduce
------------------------------------
Key: TEIID-2884
URL: https://issues.jboss.org/browse/TEIID-2884
Project: Teiid
Issue Type: Feature Request
Components: Misc. Connectors
Reporter: Van Halbert
Assignee: Kylin Soong
Fix For: 8.12
Amazon Elastic MapReduce
from http://en.wikipedia.org/wiki/Amazon_Elastic_MapReduce#Amazon_Elastic_MapR...
Elastic MapReduce (EMR) was introduced by Amazon in April 2009. Provisioning of the Hadoop
cluster, running and terminating jobs, and handling data transfer between EC2 and S3 are
automated by Elastic MapReduce. Apache Hive, which is built on top of Hadoop for providing
data warehouse services, is also offered in Elastic MapReduce.
...
In June 2012, premium options for EMR were added that replace ordinary Hadoop with
MapR's M3 and M5 versions. These options provide additional capabilities over and
above what the default EMR offering provides.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)