[JBoss JIRA] (TEIID-3601) support larger row counts
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-3601?page=com.atlassian.jira.plugin... ]
Steven Hawkins commented on TEIID-3601:
---------------------------------------
https://github.com/shawkins/teiid/commit/a5f0f8689b95324e31221564a7cc308d... adds support for larger row counts for everything except JDBC, which is excluded because of its row numbering limitation; that can be worked around with an extension and a new protocol. Beyond that, new facilities would be needed to retrieve update counts and row counts larger than max int. In this change set those values default to returning max int on overflow.
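The overflow behavior mentioned above can be illustrated with a minimal sketch. This is not the code from the commit: the class and method names (RowCounts, toJdbcCount) are hypothetical, and the only assumption is that the engine tracks counts as long while the JDBC-facing value must fit in an int.

```java
// Hypothetical helper, not taken from the Teiid commit: clamps a long
// row/update count to the int range expected by int-based JDBC methods.
final class RowCounts {
    private RowCounts() {}

    /** Returns count as an int, or Integer.MAX_VALUE if it does not fit. */
    static int toJdbcCount(long count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative: " + count);
        }
        // Math.min promotes the int bound to long, so the cast is always safe.
        return (int) Math.min(count, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(toJdbcCount(42L));
        System.out.println(toJdbcCount(1L << 31)); // one past the int range
    }
}
```

A caller that sees Integer.MAX_VALUE cannot tell an exact count from a clamped one, which is why a separate facility would be needed to read the true value.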
> support larger row counts
> -------------------------
>
> Key: TEIID-3601
> URL: https://issues.jboss.org/browse/TEIID-3601
> Project: Teiid
> Issue Type: Feature Request
> Components: JDBC Driver, Query Engine
> Reporter: Steven Hawkins
> Assignee: Steven Hawkins
> Fix For: 9.0
>
>
> In some extreme cases intermediate and final result sizes can exceed 2^31 - 1 rows. To support this we would need to make extensive changes:
> In the engine the tuplebuffer and logic related to indexing would need to change to long rather than int - this also touches things like join and insert processing.
> A new protocol version would be needed as resultmessages would need to use long rather than int indexing - however JDBC implicitly assumes int indexing such as with ResultSet.getRow.
> Temp table handling would need to be updated to support table sizes greater than max int.
> From a processing side, although not just related to row counts, we would consider increasing the parallelism of the plan. The most fundamental way to do this is to partition source queries such that more data can be read in parallel from the source. This would require extension metadata to indicate the partitioning scheme. To take full advantage of such a change the plan itself would have to be parallelized such that as much processing as possible is performed on each partition (rather than the simple case in multi-source where the data is simply unioned back together in the parent node).
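As a rough illustration of the partitioning idea in the description, the sketch below splits a numeric key range into contiguous sub-ranges and renders one predicate per partition. Everything here is an assumption made for the example: the class name RangePartitioner, the range-based scheme, and the column name are hypothetical and are not part of Teiid or of the proposed extension metadata.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: splits a numeric key range into contiguous
// partitions and renders one source query predicate per partition.
final class RangePartitioner {
    private RangePartitioner() {}

    /** Splits the inclusive range [min, max] into at most 'parts' inclusive sub-ranges. */
    static List<long[]> split(long min, long max, int parts) {
        if (parts < 1 || max < min) {
            throw new IllegalArgumentException("need parts >= 1 and max >= min");
        }
        List<long[]> ranges = new ArrayList<>();
        long total = max - min + 1;   // assumes the range size fits in a long
        long base = total / parts;    // minimum rows of key space per partition
        long extra = total % parts;   // the first 'extra' partitions get one more
        long start = min;
        for (int i = 0; i < parts && start <= max; i++) {
            long size = base + (i < extra ? 1 : 0);
            long end = start + size - 1;
            ranges.add(new long[] {start, end});
            start = end + 1;
        }
        return ranges;
    }

    /** Renders one BETWEEN predicate per partition for the given column. */
    static List<String> predicates(String column, long min, long max, int parts) {
        List<String> result = new ArrayList<>();
        for (long[] r : split(min, max, parts)) {
            result.add(column + " BETWEEN " + r[0] + " AND " + r[1]);
        }
        return result;
    }
}
```

Each predicate could be appended to a separate source query so the partitions are read in parallel; how much of the remaining plan runs per partition is the harder question raised in the description.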
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
10 years, 4 months
[JBoss JIRA] (TEIID-3594) Change the default for command logging to write User query (start/end)
by Steven Hawkins (JIRA)
[ https://issues.jboss.org/browse/TEIID-3594?page=com.atlassian.jira.plugin... ]
Steven Hawkins resolved TEIID-3594.
-----------------------------------
Fix Version/s: 8.12
Resolution: Done
Changed user events to the info level. However, I updated the default config to preserve the 8.11 behavior: by default we'll not log any command events, and the scripts are set to update to the debug level. It seems better to keep continuity and let it be a product decision whether the scripts should be updated to default to INFO or provide multiple / parameterized options.
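The resulting behavior can be modeled with a small sketch. This is not Teiid code: the class CommandLogPolicy and its event names are hypothetical, java.util.logging levels stand in for the server's logging levels (FINE playing the role of debug), and the default threshold used in the usage note (WARNING) is an assumption, since the message only says that no command events are logged by default.

```java
import java.util.logging.Level;

// Hypothetical model of the behavior described above, not Teiid code.
final class CommandLogPolicy {
    private CommandLogPolicy() {}

    enum Event { USER_START, USER_END, SOURCE_START, SOURCE_END }

    /** Level at which an event is emitted: user events at INFO, source events at FINE (debug). */
    static Level levelOf(Event e) {
        switch (e) {
            case USER_START:
            case USER_END:
                return Level.INFO;
            default:
                return Level.FINE;
        }
    }

    /** True if the event is written when the command log threshold is 'configured'. */
    static boolean isLogged(Event e, Level configured) {
        return levelOf(e).intValue() >= configured.intValue();
    }
}
```

With a threshold above INFO (for example WARNING) nothing is written, which matches the preserved 8.11 default; INFO writes only the user query start/end events; FINE writes user and source events.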
> Change the default for command logging to write User query (start/end)
> ----------------------------------------------------------------------
>
> Key: TEIID-3594
> URL: https://issues.jboss.org/browse/TEIID-3594
> Project: Teiid
> Issue Type: Enhancement
> Components: Server
> Reporter: Van Halbert
> Assignee: Steven Hawkins
> Fix For: 8.12
>
>
> The command logging currently writes the user query and the source queries. Would like to have multiple levels for command logging, for example:
> - default, log the user query start and end entry
> - debug, log the user query and the source queries
[JBoss JIRA] (TEIID-2884) Support for Amazon Elastic MapReduce
by Ramesh Reddy (JIRA)
[ https://issues.jboss.org/browse/TEIID-2884?page=com.atlassian.jira.plugin... ]
Ramesh Reddy commented on TEIID-2884:
-------------------------------------
There was a request for a Redis/Memcache based store; is there any product like that?
> Support for Amazon Elastic MapReduce
> ------------------------------------
>
> Key: TEIID-2884
> URL: https://issues.jboss.org/browse/TEIID-2884
> Project: Teiid
> Issue Type: Feature Request
> Components: Misc. Connectors
> Reporter: Van Halbert
> Assignee: Kylin Soong
> Fix For: 8.12
>
>
> Amazon Elastic MapReduce
> from http://en.wikipedia.org/wiki/Amazon_Elastic_MapReduce#Amazon_Elastic_MapR...
> Elastic MapReduce (EMR) was introduced by Amazon in April 2009. Provisioning of the Hadoop cluster, running and terminating jobs, and handling data transfer between EC2 and S3 are automated by Elastic MapReduce. Apache Hive, which is built on top of Hadoop for providing data warehouse services, is also offered in Elastic MapReduce.
> ...
> In June 2012, premium options for EMR were added that replace ordinary Hadoop with MapR's M3 and M5 versions. These options provide additional capabilities over and above what the default EMR offering provides.
[JBoss JIRA] (TEIID-2884) Support for Amazon Elastic MapReduce
by Kylin Soong (JIRA)
[ https://issues.jboss.org/browse/TEIID-2884?page=com.atlassian.jira.plugin... ]
Kylin Soong edited comment on TEIID-2884 at 8/3/15 5:31 AM:
------------------------------------------------------------
Amazon EMR is a platform that integrates all Hadoop ecosystem products into Amazon for Big Data processing, analysis, ETL, etc.
For Hadoop ecosystem products like HBase, Hive, Spark, Impala, etc., we already have relevant translators. Amazon EMR also supplies compatible drivers, downloadable images, documentation, etc.
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HiveJDB...
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/impala-...
I think the Teiid translators can work well against these products in Amazon EMR (not tested so far).
Besides integrating the Hadoop ecosystem products, Amazon EMR adds some more features; for example, EMR can use Amazon S3 to store input data, log files, and output data.
For S3, I do wonder whether we need to develop a translator. What is mainly stored in S3 is large picture files, video files, and big text files, so it seems meaningless to develop a JDBC based translator. Furthermore, IMO, the work of integrating S3 into enterprise applications should belong to the ESB; there are actually already some implementations:
* Camel S3 Component (http://camel.apache.org/aws-s3.html)
* Mulesoft S3 Connector (https://www.mulesoft.org/connectors/amazon-simple-storage-servi...
* RSSBus/CData S3 (https://www.rssbus.com/solutions/s3/)
Any ideas or advice on this issue?
was (Author: kylin):
Amazon EMR is a platform that integrates all Hadoop ecosystem products into Amazon for Big Data processing, analysis, ETL, etc.
For Hadoop ecosystem products like HBase, Hive, Spark, Impala, etc., we already have relevant translators. Amazon EMR also supplies compatible drivers, downloadable images, documentation, etc.
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HiveJDB...
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/impala-...
I think the Teiid translators can work well against these products in Amazon EMR (not tested so far).
Besides integrating the Hadoop ecosystem products, Amazon EMR adds some more features; for example, EMR can use Amazon S3 to store input data, log files, and output data.
For S3, I do wonder whether we need to develop a translator. What is mainly stored in S3 is large picture files, video files, and big text files, so it seems meaningless to develop a JDBC based translator. Furthermore, IMO, the work of integrating S3 into enterprise applications should belong to the ESB; there are actually already some implementations:
* Camel S3 Component (http://camel.apache.org/aws-s3.html)
* Mulesoft S3 Connector (https://www.mulesoft.org/connectors/amazon-simple-storage-servi...
* RSSBus/CData S3 (https://www.rssbus.com/solutions/s3/)
Any ideas or advice on this issue?
[JBoss JIRA] (TEIID-2884) Support for Amazon Elastic MapReduce
by Kylin Soong (JIRA)
[ https://issues.jboss.org/browse/TEIID-2884?page=com.atlassian.jira.plugin... ]
Kylin Soong edited comment on TEIID-2884 at 8/3/15 5:31 AM:
------------------------------------------------------------
Amazon EMR is a platform that integrates all Hadoop ecosystem products into Amazon for Big Data processing, analysis, ETL, etc.
For Hadoop ecosystem products like HBase, Hive, Spark, Impala, etc., we already have relevant translators. Amazon EMR also supplies compatible drivers, downloadable images, documentation, etc.
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HiveJDB...
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/impala-...
I think the Teiid translators can work well against these products in Amazon EMR (not tested so far).
Besides integrating the Hadoop ecosystem products, Amazon EMR adds some more features; for example, EMR can use Amazon S3 to store input data, log files, and output data.
For S3, I do wonder whether we need to develop a translator. What is mainly stored in S3 is large picture files, video files, and big text files, so it seems meaningless to develop a JDBC based translator. Furthermore, IMO, the work of integrating S3 into enterprise applications should belong to the ESB; there are actually already some implementations:
* Camel S3 Component (http://camel.apache.org/aws-s3.html)
* Mulesoft S3 Connector (https://www.mulesoft.org/connectors/amazon-simple-storage-servi...
* RSSBus/CData S3 (https://www.rssbus.com/solutions/s3/)
Any ideas or advice on this issue?
was (Author: kylin):
Amazon EMR is a platform that integrates all Hadoop ecosystem products into Amazon for Big Data processing, analysis, ETL, etc.
For Hadoop ecosystem products like HBase, Hive, Spark, Impala, etc., we already have relevant translators. Amazon EMR also supplies compatible drivers, downloadable images, documentation, etc.
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HiveJDB...
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/impala-...
I think the Teiid translators can work well against these products in Amazon EMR (not tested so far).
Besides integrating the Hadoop ecosystem products, Amazon EMR adds some more features; for example, EMR can use Amazon S3 to store input data, log files, and output data.
For S3, I do wonder whether we need to develop a translator. What is mainly stored in S3 is large picture files, video files, and big text files, so it seems meaningless to develop a JDBC based translator on top of it. Furthermore, IMO, the work of integrating S3 into enterprise applications should belong to the ESB; there are actually already some implementations:
* Camel S3 Component (http://camel.apache.org/aws-s3.html)
* Mulesoft S3 Connector (https://www.mulesoft.org/connectors/amazon-simple-storage-servi...
* RSSBus/CData S3 (https://www.rssbus.com/solutions/s3/)
Any ideas or advice on this issue?
[JBoss JIRA] (TEIID-2884) Support for Amazon Elastic MapReduce
by Kylin Soong (JIRA)
[ https://issues.jboss.org/browse/TEIID-2884?page=com.atlassian.jira.plugin... ]
Kylin Soong edited comment on TEIID-2884 at 8/3/15 5:30 AM:
------------------------------------------------------------
Amazon EMR is a platform that integrates all Hadoop ecosystem products into Amazon for Big Data processing, analysis, ETL, etc.
For Hadoop ecosystem products like HBase, Hive, Spark, Impala, etc., we already have relevant translators. Amazon EMR also supplies compatible drivers, downloadable images, documentation, etc.
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HiveJDB...
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/impala-...
I think the Teiid translators can work well against these products in Amazon EMR (not tested so far).
Besides integrating the Hadoop ecosystem products, Amazon EMR adds some more features; for example, EMR can use Amazon S3 to store input data, log files, and output data.
For S3, I do wonder whether we need to develop a translator. What is mainly stored in S3 is large picture files, video files, and big text files, so it seems meaningless to develop a JDBC based translator on top of it. Furthermore, IMO, the work of integrating S3 into enterprise applications should belong to the ESB; there are actually already some implementations:
* Camel S3 Component (http://camel.apache.org/aws-s3.html)
* Mulesoft S3 Connector (https://www.mulesoft.org/connectors/amazon-simple-storage-servi...
* RSSBus/CData S3 (https://www.rssbus.com/solutions/s3/)
Any ideas or advice on this issue?
was (Author: kylin):
Amazon EMR is a platform that integrates all Hadoop ecosystem products into Amazon for Big Data processing, analysis, ETL, etc.
For Hadoop ecosystem products like HBase, Hive, Spark, Impala, etc., we already have relevant translators. Amazon EMR also supplies compatible drivers, downloadable images, documentation, etc.
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HiveJDB...
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/impala-...
I think the Teiid translators can work well against these products in Amazon EMR (not tested so far).
Besides integrating the Hadoop ecosystem products, Amazon EMR adds some more features; for example, EMR can use Amazon S3 to store input data, log files, and output data.
For S3, I do wonder whether we need to develop a translator. What is mainly stored in S3 is large picture files, video files, and big text files, so it seems meaningless to develop a JDBC based translator on top of it. Furthermore, IMO, the work of integrating S3 into enterprise applications should belong to the ESB; there are actually already some implementations:
* Camel S3 Component (http://camel.apache.org/aws-s3.html)
* Mulesoft S3 Connector (https://www.mulesoft.org/connectors/amazon-simple-storage-servi...
* RSSBus/CData S3 (https://www.rssbus.com/solutions/s3/)
Any ideas or advice on this issue?
[JBoss JIRA] (TEIID-2884) Support for Amazon Elastic MapReduce
by Kylin Soong (JIRA)
[ https://issues.jboss.org/browse/TEIID-2884?page=com.atlassian.jira.plugin... ]
Kylin Soong commented on TEIID-2884:
------------------------------------
Amazon EMR is a platform that integrates all Hadoop ecosystem products into Amazon for Big Data processing, analysis, ETL, etc.
For Hadoop ecosystem products like HBase, Hive, Spark, Impala, etc., we already have relevant translators. Amazon EMR also supplies compatible drivers, downloadable images, documentation, etc.
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HiveJDB...
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/impala-...
I think the Teiid translators can work well against these products in Amazon EMR (not tested so far).
Besides integrating the Hadoop ecosystem products, Amazon EMR adds some more features; for example, EMR can use Amazon S3 to store input data, log files, and output data.
For S3, I do wonder whether we need to develop a translator. What is mainly stored in S3 is large picture files, video files, and big text files, so it seems meaningless to develop a JDBC based translator on top of it. Furthermore, IMO, the work of integrating S3 into enterprise applications should belong to the ESB; there are actually already some implementations:
* Camel S3 Component (http://camel.apache.org/aws-s3.html)
* Mulesoft S3 Connector (https://www.mulesoft.org/connectors/amazon-simple-storage-servi...
* RSSBus/CData S3 (https://www.rssbus.com/solutions/s3/)
Any ideas or advice on this issue?