Pedro/Gustavo,
How do you plan to benchmark our Hadoop implementation? The TeraSort
benchmark suite [1] seems like an interesting option. Maybe not with a 1 TB
data set right away, but eventually, why not? Especially now that we can
easily run a 500-node cluster on GCE. When you guys start benchmarking our
Hadoop impl, I would love to see TeraSort run on a regular Map/Reduce
implementation as well.
What do you think?
Vladimir
[1] http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testi...