On 25 Jan 2012, at 13:25, Sanne Grinovero wrote:
[cut]
>> I agree, we should not ask all replicas for the same information.
>> Asking only one is the opposite though: I think this should be a
>> configuration option to ask for any value between (1 and numOwner).
>> That's because I understand it might be beneficial to ask to more than
>> one node immediately,
> why is it more beneficial to ask multiple members than a single one? I guess it
> doesn't have to do with consistency, as in that case it would be required (vs
> beneficial).
> Is it because one of the nodes might reply faster? I'm not that sure that
> compensates the burden of numOwner-1 additional RPCs, but a benchmark will tell us just
> that.
One node might be busy doing GC and stay unresponsive for a whole
second or longer; another one might have actually crashed without you
knowing it yet. These are unlikely but possible.
All these are possible, but I would
rather consider them exceptional situations, possibly handled by retry logic. We
should *not* optimise for these situations IMO.
Thinking about our last performance results, we have avg 26k gets per second. Now with
numOwners = 2, this means that each node handles 26k *redundant* gets every second.
I'm not concerned about the network load: as Bela mentioned in a previous mail, the
network link should not be the bottleneck. But there's a huge amount of unnecessary
activity in OOB threads, which should rather be used for releasing locks or whatever else
is needed. On top of that, this activity strongly encourages GC pauses, as the effort for
a get is practically numOwners times higher than it should be.
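
To make the arithmetic above concrete, here is a back-of-the-envelope sketch in Python. It rests on assumptions of mine rather than on measurements: that the 26k gets/s figure is the rate each node issues, that every get is sent to all numOwners owners, that keys are spread evenly across the cluster, and that only one reply per get is actually needed. The function name is made up for illustration.

```python
def redundant_gets_per_node(gets_per_node_per_sec, requests_per_get):
    """Remote gets per second a node serves beyond the one reply each get needs.

    Assumes a symmetric cluster with evenly spread keys, so every node
    receives on average as many requests as it sends.
    """
    if requests_per_get < 1:
        raise ValueError("each get needs at least one request")
    # Each get triggers requests_per_get RPCs, but only one answer is used.
    return gets_per_node_per_sec * (requests_per_get - 1)


# Current behaviour with numOwners = 2: every get goes to both owners.
print(redundant_gets_per_node(26_000, 2))  # 26000
# Asking a single owner removes the redundant work entirely.
print(redundant_gets_per_node(26_000, 1))  # 0
```

Under these assumptions the redundant load grows linearly with the number of owners asked, which is why it lands on the OOB threads rather than on the network link.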
More likely, a rehash is in progress, and you could then be asking a
node which doesn't have the value yet (or doesn't have it anymore).
This is a consistency issue, and I think we can find some other way to handle it.
All good reasons for which imho it makes sense to send out "a couple"
of requests in parallel, but I'd be unlikely to want to send more than 2,
and I agree often 1 might be enough.
Maybe it should even optimize for the most common case: send out just
one, have a more aggressive timeout and in case of trouble ask for the
next node.
+1
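
A minimal sketch of that "ask one owner first, fall back to the next on timeout" idea, written in Python only to keep it short. Everything named here is hypothetical: `remote_get`, `OwnerTimeout`, and the two timeout values are placeholders for illustration, not Infinispan or JGroups APIs, and the numbers are not tuned.

```python
class OwnerTimeout(Exception):
    """Raised by the (hypothetical) transport when an owner does not answer in time."""


def staggered_get(key, owners, remote_get, first_timeout=0.05, retry_timeout=0.5):
    """Ask owners one at a time, using a short timeout for the first attempt.

    remote_get(owner, key, timeout) is a hypothetical blocking call that
    returns the value or raises OwnerTimeout.
    Returns (value, number_of_rpcs_sent).
    """
    if not owners:
        raise ValueError("no owners for key %r" % (key,))
    rpcs = 0
    for attempt, owner in enumerate(owners):
        # Aggressive timeout on the common path, a more patient one on retries.
        timeout = first_timeout if attempt == 0 else retry_timeout
        rpcs += 1
        try:
            return remote_get(owner, key, timeout), rpcs
        except OwnerTimeout:
            continue  # owner busy (e.g. GC pause) or crashed: try the next one
    raise OwnerTimeout("none of %d owners answered for key %r" % (len(owners), key))


# Toy transport: node "A" is stuck in a GC pause, node "B" answers normally.
def fake_remote_get(owner, key, timeout):
    if owner == "A":
        raise OwnerTimeout(owner)
    return "value-of-" + key


print(staggered_get("k1", ["B", "A"], fake_remote_get))  # ('value-of-k1', 1)
print(staggered_get("k1", ["A", "B"], fake_remote_get))  # ('value-of-k1', 2)
```

In the common case this costs a single RPC and a single remote OOB thread; the second request is only paid for when the first owner is slow or gone. The sketch does not cover the rehash case, where an owner answers promptly but does not hold the value.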
In addition, sending a single request might spare us some Future and
await+notify overhead, in terms of the CPU cost of sending the request.
It's the remote OOB thread that's the most costly resource, imo.