On Mon, Jun 30, 2014 at 3:47 PM, Vladimir Blagojevic <vblagoje@redhat.com> wrote:

On 2014-06-26, 10:12 AM, Galder Zamarreño wrote:
> On 23 Jun 2014, at 11:04, Gustavo Fernandes <gustavonalle@gmail.com> wrote:
>
>> - I read with great interest the Spark paper [9]. Spark provides a DSL with functional language constructs like map, flatMap and filter to process distributed data in memory. In this scenario, Map Reduce is just a special case achieved by chaining functions [10]. As Spark is much more than Map Reduce, and can run many machine learning algorithms efficiently, I was wondering if we should shift attention to Spark rather than focusing too much on Map Reduce. Thoughts?
> I’m not an expert on these topics, but I like the look and the approach of Spark :). The fact that it’s not tied to a single paradigm is particularly interesting, as is the fact that it tries to make the most of functional constructs, which seem to provide more elegant ways of dealing with data.
>
>

Gustavo, thanks for your email and the references. I like Spark as well!
I read the Spark paper over the weekend. It is definitely not an easy
read, and I will keep reading about this topic, but this seems to be the
direction we should steer ourselves in: a data analytics platform!
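
To make the "Map Reduce is just a special case achieved by chaining
functions" point concrete, here is a minimal word-count sketch. It uses
plain Java 8 streams on a local list, not Spark or Infinispan, and the
class and method names (WordCountSketch, wordCount) are made up for
illustration only. The flatMap/filter/map calls play the role of the map
phase, and the grouping collector plays the role of the reduce phase.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration: word count expressed as a chain of functions.
// This runs locally on an in-memory list; it is not distributed.
class WordCountSketch {

    static Map<String, Long> wordCount(List<String> lines) {
        return lines.stream()
                // "map" phase: split every line into words
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                // drop empty tokens produced by leading whitespace
                .filter(word -> !word.isEmpty())
                .map(String::toLowerCase)
                // "reduce" phase: group equal words and count each group
                .collect(Collectors.groupingBy(
                        word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or", "not to be");
        System.out.println(wordCount(lines));
    }
}
```

In a Spark-like DSL the same chain would run over a distributed dataset
instead of a local list, but the shape of the program stays the same.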

As for the Hadoop implementation, I am not sure that it makes sense to
implement/support Hadoop v1.x unless it is super easy and low
maintenance. How hard would it be to implement YARN?

From a Map Reduce perspective, v2 is binary compatible with v1, so the same jar containing the job can run on both Map Reduce 1.x and YARN Map Reduce.

It should also be straightforward to support the YARN API directly.

Gustavo

Regards,
Vladimir

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev