On 10 Oct 2014, at 18:06, Dan Berindei <dan.berindei(a)gmail.com> wrote:
> The biggest downside I see is that it would be horribly slow if the cache store
> doesn't support efficient iteration of a single segment. So we might want to implement
> a full retry strategy as well, if some cache stores can't support that.

My understanding from a discussion with Pedro (in a hard, cold and sinister place, but
that's another story) is that *today* M/R is already kind of horrible for global cache
stores, which have to do the per-node key filtering dance anyway. So it's not
significantly worse.
Plus, I said we should do the work per segment, but in reality, if you send the Map work
for 5 segments to the same node, you can optimize and do a single loop, only making it
feel like they are separate pieces of work.
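
To make the single-loop idea concrete, here is a rough sketch, not actual Infinispan code. Everything in it is made up for illustration: the class SegmentBatching, the segmentOf() function (a plain hash modulo standing in for the real consistent hash), and a plain Map standing in for the cache store. The node iterates its store once and drops each mapped entry into the bucket of the segment it belongs to, so the caller still gets one result per requested segment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;

public class SegmentBatching {

    // Hypothetical segment function: a stand-in for the grid's consistent hash.
    static int segmentOf(Object key, int numSegments) {
        return Math.floorMod(key.hashCode(), numSegments);
    }

    // Runs the mapper for every requested segment in ONE pass over the store,
    // while still returning the results grouped per segment.
    static <K, V, R> Map<Integer, List<R>> mapSegments(
            Map<K, V> store, Set<Integer> requested, int numSegments,
            BiFunction<K, V, R> mapper) {
        Map<Integer, List<R>> perSegment = new HashMap<>();
        for (int segment : requested) {
            perSegment.put(segment, new ArrayList<>());
        }
        for (Map.Entry<K, V> entry : store.entrySet()) {
            // Entries in segments that were not requested have no bucket and are skipped.
            List<R> bucket = perSegment.get(segmentOf(entry.getKey(), numSegments));
            if (bucket != null) {
                bucket.add(mapper.apply(entry.getKey(), entry.getValue()));
            }
        }
        return perSegment;
    }
}
```

Because the results stay grouped per segment, the caller could in principle still treat each segment as its own unit of work (for example, to redo only one of them), even though the node did a single iteration.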