[teiid-issues] [JBoss JIRA] (TEIID-3442) Apache Spark support via SparkSQL and DataFrames

Fri Apr 24 12:20:52 EDT 2015

    [ https://issues.jboss.org/browse/TEIID-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062556#comment-13062556 ] 

Ramesh Reddy commented on TEIID-3442:
-------------------------------------

[~blue666man] Is this something you have the bandwidth to contribute to Teiid? To me, using the Thrift JDBC driver seems like a good idea. Also, I want to make sure: are you primarily intending to use Spark as a source?

> Apache Spark support via SparkSQL and DataFrames
> ------------------------------------------------
>
>                 Key: TEIID-3442
>                 URL: https://issues.jboss.org/browse/TEIID-3442
>             Project: Teiid
>          Issue Type: Feature Request
>          Components: Misc. Connectors
>    Affects Versions: 8.10
>            Reporter: John Muller
>              Labels: Connectors, Spark, Translators
>             Fix For: Open To Community
>
>   Original Estimate: 20 weeks
>  Remaining Estimate: 20 weeks
>
> Soliciting comments on Apache Spark support.  With the release of pandas-like DataFrames, it is a little more feasible to translate directly to SparkSQL:
> https://spark.apache.org/docs/latest/sql-programming-guide.html
> Options in order of complexity:
> 1. Use the existing Hive connector / translator.  Spark still uses the Hive metastore.
> 2. Use the Thrift JDBC driver.  This is what MicroStrategy, Tableau, QlikView and others use; it is the most rudimentary API for accessing Spark.
> 3. Use native SparkSQL by building Spark jobs and submitting them to a running Spark driver.
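
For illustration only, option 2 might look roughly like the Java sketch below. It is not Teiid translator code. It assumes the Spark Thrift server is reached through the HiveServer2-compatible "jdbc:hive2://" URL scheme with the Hive JDBC driver on the classpath; the class name, the helper names, and the host, port, and database values are hypothetical placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical sketch: querying Spark through its Thrift JDBC server.
final class SparkThriftSketch {

    // Builds a HiveServer2-style JDBC URL (assumed scheme for the Spark Thrift server).
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs a query and prints the first column of every row.
    // Needs a running Thrift server and the Hive JDBC driver on the classpath.
    static void printFirstColumn(String url, String sql) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        // Placeholder connection details; adjust for a real deployment.
        String url = jdbcUrl("localhost", 10000, "default");
        System.out.println(url);
        // Only attempt a connection when a SQL statement is passed as an argument.
        if (args.length > 0) {
            printFirstColumn(url, args[0]);
        }
    }
}
```

Because this path is plain JDBC, a source integration built on it could presumably follow the same pattern as other JDBC-backed sources, which is part of why it sits between options 1 and 3 in complexity.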

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)