[infinispan-dev] DIST.retrieveFromRemoteSource

Wed Jan 25 03:51:39 EST 2012

Hi Sanne

On Wed, Jan 25, 2012 at 1:22 AM, Sanne Grinovero <sanne at infinispan.org> wrote:
> Hello,
> in the method:
> org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(Object,
> InvocationContext, boolean)
>
> we have:
>
>      List<Address> targets = locate(key);
>      // if any of the recipients has left the cluster since the command
>      // was issued, just don't wait for its response
>      targets.retainAll(rpcManager.getTransport().getMembers());
>
> But then we use ResponseMode.WAIT_FOR_VALID_RESPONSE, which means
> we're not going to wait for all responses anyway, and I think we can
> assume we'll get a reply from a node which actually is in the cluster.
>
> So the retainAll method is unneeded and can be removed? I'm wondering,
> because it's not safe anyway, actually it seems very unlikely to me
> that just between a locate(key) and the retainAll the view is being
> changed, so not something we should be relying on anyway.
> I'd rather assume that such a get method might be checked and
> eventually dropped by the receiver.
>

The locate method will return a list of owners based on the
"committed" cache view, so there is a non-zero probability that one of
the owners has already left.
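
To make that concrete, here is a minimal sketch in plain Java. It is not
the actual DistributionManagerImpl code: strings stand in for Address,
and locate() is a hypothetical stub that returns owners from a stale
"committed" view.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StaleOwnerFilterSketch {

    // Hypothetical stand-in for locate(key): the owners according to the
    // last committed view, which may include a node that has since left.
    static List<String> locate(String key) {
        return new ArrayList<>(Arrays.asList("nodeA", "nodeB"));
    }

    // Drops owners that are no longer in the current membership, so the
    // caller never waits for a response from a node that is gone.
    static List<String> liveTargets(String key, List<String> currentMembers) {
        List<String> targets = locate(key);
        targets.retainAll(currentMembers);
        return targets;
    }

    public static void main(String[] args) {
        // nodeA has left; nodeC joined but is not an owner in the committed view.
        List<String> members = Arrays.asList("nodeB", "nodeC");
        System.out.println(liveTargets("k", members)); // prints [nodeB]
    }
}
```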

If I remember correctly, I added the retainAll call because otherwise
ClusteredGetResponseFilter.needMoreResponses() would keep returning
true if one of the targets left the cluster. Coupled with the fact
that null responses are not considered valid (unless *all* responses
are null), this meant that a remote get for a key that doesn't exist
would throw a TimeoutException after 15 seconds instead of returning
null immediately.
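
A simplified model of that behaviour, as I remember it (this is a
sketch, not the real ClusteredGetResponseFilter; the class, its fields
and onResponse() are made up for illustration, and strings stand in for
Address):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class GetResponseFilterSketch {
    // Targets that have not answered yet.
    private final Set<String> pending;
    private boolean validResponseSeen;

    GetResponseFilterSketch(Set<String> targets) {
        this.pending = new HashSet<>(targets);
    }

    // Records a response; a null value is not considered valid.
    void onResponse(String sender, Object value) {
        pending.remove(sender);
        if (value != null) {
            validResponseSeen = true;
        }
    }

    // Keep waiting until a valid response arrives or every target has answered.
    boolean needMoreResponses() {
        return !validResponseSeen && !pending.isEmpty();
    }

    public static void main(String[] args) {
        // Without retainAll: nodeA has left and will never answer.
        GetResponseFilterSketch stale =
                new GetResponseFilterSketch(new HashSet<>(Arrays.asList("nodeA", "nodeB")));
        stale.onResponse("nodeB", null);
        System.out.println(stale.needMoreResponses()); // prints true

        // With retainAll: only the live owner is a target.
        GetResponseFilterSketch live =
                new GetResponseFilterSketch(new HashSet<>(Arrays.asList("nodeB")));
        live.onResponse("nodeB", null);
        System.out.println(live.needMoreResponses()); // prints false
    }
}
```

In the first case the caller keeps waiting on nodeA until the timeout
fires; in the second, all targets have answered with null, so null can
be returned right away.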

We could revisit the decision to make null responses invalid, and then
as long as there is still one of the old owners left in the cluster
you'll get the null result immediately. You may still get an exception
if all the old owners left the cluster, but I'm not sure. I wish I had
added a test for this...

We may be able to add a workaround in FutureCollator as well. Just
remember that we use the same FutureCollator for writes in REPL mode,
so it needs to work with GET_ALL as well as with GET_FIRST.

Slightly related, I wonder if Manik's comment is still true:

    if at all possible, try not to use JGroups' ANYCAST for now.
    Multiple (parallel) UNICASTs are much faster.)

Intuitively it shouldn't be true: unicasts+FutureCollator do basically
the same thing as anycast+GroupRequest.

Cheers
Dan

> Cheers,
> Sanne