Hello,
It doesn't look like what you have currently is safe for rehashes, though, since the owners would change nodes. You would need to move the listener between nodes in this case. Also, you removed the handling of an edge case where a listener might not be installed if a CH change occurs right while sending to nodes (talked about later).
Indeed, this modification is not safe in the presence of cluster changes. In fact, I missed that the current implementation was ensuring elasticity.
The problem was that there is an overlap: if a node joins while you are sending the initial requests, it wouldn't install the listener.
Cluster -> Node A, B, C
1. User installs listener on Node C
2. Node C is sending listeners to Nodes A + B
3. Node D joins in the meantime and asks the coordinator for the listener, but it isn't fully installed yet and so cannot be retrieved
4. Node C finishes installing listeners on Nodes A + B
In this case Node D would never have gotten the listener, so Node C also checks whether anyone else has joined.
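For illustration, a minimal sketch of that final re-check step. The Cluster interface and its members()/installListener() methods are hypothetical placeholders for this sketch, not an existing API:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ListenerInstaller {

    // Hypothetical cluster abstraction used only for this sketch.
    interface Cluster {
        List<String> members();                        // current membership view
        void installListener(String node, Object l);   // install the listener on one node
    }

    static void installEverywhere(Cluster cluster, Object listener) {
        Set<String> covered = new HashSet<>();
        List<String> view = cluster.members();
        while (true) {
            // Install on every member not covered yet.
            for (String node : view) {
                if (covered.add(node)) {
                    cluster.installListener(node, listener);
                }
            }
            // Re-read the view: if nobody joined while we were sending,
            // every member is covered and we are done; otherwise loop again
            // to cover the newcomers (the "also checks if anyone else has
            // joined" step).
            List<String> latest = cluster.members();
            if (covered.containsAll(latest)) {
                return;
            }
            view = latest;
        }
    }
}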
I understood this was necessary for cache.addListener() atomicity, but I erroneously thought that elasticity was not implemented (I also needed a quick fix). In my view, and please correct me if I am wrong, the architecture you describe still has an issue because the coordinator can fail. It is in fact necessary to re-execute the installation code until a stable view is obtained (the same view twice in a row). Going back to your example, consider a step 5 where the coordinator, say A, fails while C is installing the listener on D, and some node E is joining. If D is the newly elected coordinator, E will never retrieve the listener. What do you think of this scenario?
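To make the idea concrete, here is a minimal sketch of re-executing the installation until two consecutive identical views are observed. The Cluster type with view() and installListener() is a hypothetical placeholder, and the install call is assumed to be idempotent:

import java.util.List;

public class StableViewInstall {

    // Hypothetical cluster abstraction used only for this sketch.
    interface Cluster {
        List<String> view();                           // current membership view
        void installListener(String node, Object l);   // assumed idempotent
    }

    static void installUntilStable(Cluster cluster, Object listener) {
        List<String> before;
        List<String> after = cluster.view();
        do {
            before = after;
            // Re-execute the whole installation against the view just read.
            for (String node : before) {
                cluster.installListener(node, listener);
            }
            // Stop only once the view read after installing is identical to
            // the one we installed against, i.e. the same view twice in a row.
            after = cluster.view();
        } while (!after.equals(before));
    }
}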
The difference is that Node D only sends 1 message to the coordinator to ask for listeners instead of sending N messages to all nodes (which would be required on every JOIN from any node). This should scale better in the long run, especially since in most cases this shouldn't happen.
In fact, D needs to retrieve the filters from the nodes that were holding the keys it is now in charge of replicating. At first glance, this requires sending a JOIN message to all nodes. If you believe that this key-based (or key-range-based) listener functionality is a direction of interest, I can try amending my code to ensure elasticity.
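As a rough sketch of that first-glance approach (the joining node asking every member for the filters covering the key segments it now replicates), where Transport, FilterResponse and the segment set are hypothetical placeholders rather than an existing API:

import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class JoinBroadcastFetch {

    // Hypothetical messaging abstraction used only for this sketch.
    interface Transport {
        List<String> members();
        FilterResponse requestFilters(String node, Set<Integer> segments);
    }

    interface FilterResponse {
        List<Object> filters();
    }

    static List<Object> fetchOnJoin(Transport transport, String self,
                                    Set<Integer> newlyOwnedSegments) {
        List<Object> collected = new ArrayList<>();
        // First-glance approach: ask every other node, since the joiner does
        // not know which of them were holding keys in its new segments.
        for (String node : transport.members()) {
            if (!node.equals(self)) {
                collected.addAll(transport.requestFilters(node, newlyOwnedSegments).filters());
            }
        }
        return collected;
    }
}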
Pierre