Pedro/Gustavo,
How do you plan to benchmark our Hadoop implementation? The
TeraSort benchmark suite [1] seems like an interesting option.
Maybe not with a 1 TB data set right away, but eventually, why
not? Especially now that we can easily run a 500-node cluster on
GCE. When you guys start benchmarking our Hadoop implementation,
I would love to see TeraSort run on a regular Map/Reduce
implementation as well.
What do you think?
Vladimir
[1] http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/#terasort-benchmark-suite