[JBoss JIRA] (TEIID-2884) Support for Amazon Elastic MapReduce

Monday, 3 August 2015

    [ https://issues.jboss.org/browse/TEIID-2884?page=com.atlassian.jira.plugin... ]

Kylin Soong edited comment on TEIID-2884 at 8/3/15 5:30 AM:
------------------------------------------------------------

Amazon EMR is a platform that integrates Hadoop ecosystem products into Amazon Web
Services for Big Data processing, analysis, ETL, etc.

For Hadoop ecosystem products such as HBase, Hive, Spark, and Impala, we already have
relevant translators, and Amazon EMR supplies compatible drivers, downloadable images,
documentation, etc.:
	http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HiveJDB...
	http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/impala-...
I think the Teiid translators can work well against these products on Amazon EMR (not
tested so far).

Besides integrating Hadoop ecosystem products, Amazon EMR adds some features of its own;
for example, EMR can use Amazon S3 to store input data, log files, and output data.

For S3, I do wonder whether we need to develop a translator. What is mainly stored in S3
is large image files, video files, and big text files, so it seems meaningless to develop
a JDBC-based translator on top of it. Furthermore, IMO, the work of integrating S3 with
enterprise applications should belong to the ESB; in fact, some implementations already
exist:

* Camel S3 Component (http://camel.apache.org/aws-s3.html)
* Mulesoft S3 Connector (https://www.mulesoft.org/connectors/amazon-simple-storage-servi...
* RSSBus/CData S3 (https://www.rssbus.com/solutions/s3/)

Any ideas or advice on this issue?

...
 Support for Amazon Elastic MapReduce
 ------------------------------------

                 Key: TEIID-2884
                 URL: https://issues.jboss.org/browse/TEIID-2884
             Project: Teiid
          Issue Type: Feature Request
          Components: Misc. Connectors
            Reporter: Van Halbert
            Assignee: Kylin Soong
             Fix For: 8.12

 Amazon Elastic MapReduce
 from http://en.wikipedia.org/wiki/Amazon_Elastic_MapReduce#Amazon_Elastic_MapR...
 Elastic MapReduce (EMR) was introduced by Amazon in April 2009. Provisioning of the Hadoop
cluster, running and terminating jobs, and handling data transfer between EC2 and S3 are
automated by Elastic MapReduce. Apache Hive, which is built on top of Hadoop for providing
data warehouse services, is also offered in Elastic MapReduce.
 ...
 In June 2012, premium options for EMR were added that replace ordinary Hadoop with
MapR's M3 and M5 versions. These options provide additional capabilities over and
above what the default EMR offering provides. 

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
