Ok, I've read the code now. (It's been a long time!)
The partial updates should only be sent if there is a mismatch between the
current CH and the topology cache (i.e. one of the owners in the CH doesn't
have an endpoint address in the topology cache) and the client has a really
old CH (i.e. client topology id + 1 < server topology id, e.g. because this
is the client's first request). In this case, we send a topology update to
the client, even though we know it will be updated soon, but the server
must prune all the owners without a valid endpoint address from the CH sent
to the client (as per your second proposal).
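
To make the rule above concrete, here is a minimal sketch in Java. It is not the actual Infinispan server code: the class name, the method names, and the representation of the CH as a list of owner lists per segment are all assumptions made for illustration. Only the two conditions and the pruning step come from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the real Hot Rod server classes.
public class PartialTopologySketch {

    // True if at least one owner in the CH has no endpoint address
    // in the topology (address) cache.
    static boolean hasMismatch(List<List<String>> ownersPerSegment,
                               Map<String, String> endpoints) {
        for (List<String> owners : ownersPerSegment) {
            for (String owner : owners) {
                if (!endpoints.containsKey(owner)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Send a partial update only when there is a mismatch AND the client's
    // CH is really old (client topology id + 1 < server topology id).
    static boolean shouldSendPartialUpdate(int clientTopologyId,
                                           int serverTopologyId,
                                           boolean mismatch) {
        return mismatch && clientTopologyId + 1 < serverTopologyId;
    }

    // Drop every owner without a valid endpoint address, so the segment
    // information never refers to a server the client was not told about.
    static List<List<String>> pruneOwners(List<List<String>> ownersPerSegment,
                                          Map<String, String> endpoints) {
        List<List<String>> pruned = new ArrayList<>();
        for (List<String> owners : ownersPerSegment) {
            List<String> kept = new ArrayList<>();
            for (String owner : owners) {
                if (endpoints.containsKey(owner)) {
                    kept.add(owner);
                }
            }
            pruned.add(kept);
        }
        return pruned;
    }
}
```

With owners A and B in the CH but only A in the address cache, pruneOwners leaves one owner per segment, which matches the single server the client reads from the update.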
Cheers
Dan
On Thu, Aug 28, 2014 at 3:31 PM, Dan Berindei <dan.berindei(a)gmail.com>
wrote:
Do we really need to send those partial topology updates? What topology id
do they have?
When the coordinator sees the leaver, it updates the consistent hashes on
all the members and increases the cache topology id. Normally this is
immediately followed by a new topology update that starts a rebalance, but
if there is just one node left in the cluster there is nothing to rebalance
and this will be the last topology sent to the client. If we already sent a
partial topology to the client with that id, we'll never update the CH on
the client.
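
A small Java sketch of that failure mode, under the assumption that the server only attaches a topology to a response when the client's topology id differs from its own. The class and method names are hypothetical and this is not the real client or server code.

```java
import java.util.List;

// Hypothetical sketch of a client that is updated only when topology ids differ.
public class StaleTopologySketch {

    static final class Client {
        int topologyId = 0;
        List<String> servers = List.of();
    }

    // Assumed server-side step: reply with a topology only if the ids differ.
    static void handleRequest(Client client, int serverTopologyId,
                              List<String> serversKnownToServer) {
        if (client.topologyId != serverTopologyId) {
            client.topologyId = serverTopologyId;
            client.servers = serversKnownToServer;
        }
    }
}
```

If the client first receives a partial server list under topology id 5, and the address cache is completed later without the id changing, a second handleRequest call with id 5 leaves the client on the partial list.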
Cheers
Dan
On Thu, Aug 28, 2014 at 3:20 PM, Galder Zamarreño <galder(a)redhat.com>
wrote:
> Hey Dan,
>
> Re: https://issues.jboss.org/browse/ISPN-4674
>
> If you remember, the topology updates that we send to clients are
> sometimes partial. This happens when at the JGroups level we have a new
> view, but the HR address cache has not yet been updated with the JGroups
> address to endpoint address. This logic works well with HR protocol 1.x.
>
> With HR 2.x, there’s a slight problem with this. We now write segment
> information in the topology, and with this partial setup a call to
> locateOwnersForSegment() on a partial cluster of 2 can quite possibly
> return 2 owners.
>
> The problem comes when the client reads the number of servers and finds
> one, but then reads a segment that says there are two owners. That’s
> where the ArrayIndexOutOfBoundsException comes from.
>
> The question is: how shall we deal with this segment information in the
> event of a partial topology update?
>
> From a client perspective, one option might be to just ignore those
> segment positions for which there’s no cluster member. IOW, if the number
> of owners is bigger than the cluster view, it could just decide to create a
> smaller segment array, of only cluster view size, and then ignore the index
> of a node that’s not present in the cluster view.
>
> Would this be the best way to solve it? Or could we just avoid sending
> segment information that’s not right? IOW, directly send from the server
> segment information with all this filtered.
>
> Thoughts?
>
> Cheers,
> --
> Galder Zamarreño
> galder(a)redhat.com
> twitter.com/galderz