On Tue, 2011-10-11 at 10:59 -0500, David M. Lloyd wrote:
There are at least two basic paths we can follow for clustered
invocation based on the current architecture. The right choice is going
to depend primarily upon the expected use cases, which I am not in a
position to properly judge.
Option 1: Clustered Invocation Transport
----------------------------------------
In this option, we introduce a new "LAN" transport type for invocation
on the cluster. The transport would use direct TCP connections or UDP
messages (or both, depending on request size) to convey the invocation.
The characteristics of this option are as follows:
- Security: reliance on physical network security only (no TLS or
authentication)
- Latency is very low, even to new nodes
- Topology changes can be conveyed as separate asynchronous messages
- Invocations from external networks would happen through a proxy node,
with Remoting being bridged to the LAN, to perform security functions
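To make the "TCP or UDP depending on request size" idea concrete, here is a minimal sketch of how a LAN transport might pick between the two. The 8 KB threshold, the port number, and the class/method names are all illustrative assumptions, not anything that exists today:

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.Socket;

// Sketch: send small invocation payloads as a single UDP datagram,
// fall back to a direct TCP connection for large ones.
public class LanTransport {
    // Illustrative threshold; a real transport would derive this from
    // the network MTU / configuration.
    static final int MAX_DATAGRAM = 8 * 1024;
    static final int CLUSTER_PORT = 7600;

    static boolean useDatagram(byte[] payload) {
        return payload.length <= MAX_DATAGRAM;
    }

    static void send(InetAddress node, byte[] payload) throws IOException {
        if (useDatagram(payload)) {
            try (DatagramSocket s = new DatagramSocket()) {
                s.send(new DatagramPacket(payload, payload.length, node, CLUSTER_PORT));
            }
        } else {
            try (Socket s = new Socket(node, CLUSTER_PORT)) {
                s.getOutputStream().write(payload);
            }
        }
    }
}
```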
Option 2: Load-balanced Remoting Connections
--------------------------------------------
In this option, we rely on the client to establish one or more Remoting
connection(s) to one or more of the nodes of the cluster. Logic in the
client will be used to determine what connection(s) to use for what
clusters. We have the option of automatically connecting as topology
changes or requiring the user to set up the connections in advance.
Note that automatic connection cannot work in the case of
user-interactive authentication. Characteristics:
- Security: full authentication and TLS supported
- Latency is low once a connection is established; however, there is
some overhead involved in authentication and security negotiation
- Topology changes should be asynchronous notifications
- Each connection has to be separately authenticated
- Automatically establishing connections is not presently supported, so
we'd need a bit of infrastructure for that: handling user-interactive
authentication, connection lifecycle management, and configuration.
This will be a point of fragility
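For Option 2, the "logic in the client to determine what connection(s) to use for what clusters" could start as little more than a registry keyed by cluster name. A minimal sketch, with connections represented as plain strings rather than real Remoting connections:

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch: client-side registry mapping cluster names to the Remoting
// connections the user has established for them.
public class ClusterConnectionRegistry {
    private final ConcurrentMap<String, List<String>> connectionsByCluster =
            new ConcurrentHashMap<>();

    void register(String cluster, String connection) {
        connectionsByCluster
                .computeIfAbsent(cluster, c -> new CopyOnWriteArrayList<>())
                .add(connection);
    }

    // Returns the connections eligible for invocations on the given
    // cluster, or an empty list if none were set up in advance (no
    // automatic connection is attempted here, per the caveat about
    // user-interactive authentication).
    List<String> connectionsFor(String cluster) {
        return connectionsByCluster.getOrDefault(cluster, Collections.emptyList());
    }
}
```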
Summary
-------
For both options, we have to determine an appropriate load-balancing
strategy. The choice of direction will affect how our clustering and
transaction interceptors function. We also have to suss out the logic
around dealing with conflicting or wrongly-ordered topology updates;
hopefully our existing policies will continue to apply.
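As a strawman for the load-balancing strategy (applicable to either option), the simplest thing that could work is round-robin selection over the current topology:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: thread-safe round-robin selection over whatever node list
// the current topology view provides.
public class RoundRobinBalancer {
    private final AtomicInteger counter = new AtomicInteger();

    String pick(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalStateException("no nodes in current topology");
        }
        int i = Math.floorMod(counter.getAndIncrement(), nodes.size());
        return nodes.get(i);
    }
}
```

A sticky (session-affine) strategy would instead key the choice off the session identifier, but the interceptor plumbing would look the same.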
Do topology changes really need to be asynchronous notifications? Can
we simply update cluster topology per invocation?
Maintaining an accurate cluster topology via asynchronous notifications
has the following benefits:
1. Topology changes between invocations won't require failover in the
event of a load-balanced invocation (as opposed to a sticky one).
2. Load balancing will potentially be more effective following topology
changes by leveraging new cluster members.
3. It minimizes the invocation payload, since we don't need to tack
cluster topology onto every invocation response. We can optimize this
somewhat by sending a topology view ID with the invocation request and
only including the topology in the response if the topology has
changed (i.e. request view ID != current view ID).
The only disadvantage I can think of is implementation complexity.
Topology update ordering is not an issue if we take the simpler
approach. However, we can also assume that topology changes are
uncommon - so it becomes a matter of whether or not to optimize for
frequent topology changes.
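The view-ID optimization described above could be sketched on the server side roughly as follows. The type and field names (TopologyResponse, viewId, members) are illustrative, not an actual API:

```java
import java.util.List;

// Sketch: the server echoes the full topology in an invocation response
// only when the view ID carried by the request is stale.
public class TopologyPiggyback {
    static final class TopologyResponse {
        final long viewId;
        final List<String> members; // null when the client is up to date

        TopologyResponse(long viewId, List<String> members) {
            this.viewId = viewId;
            this.members = members;
        }
    }

    private volatile long currentViewId;
    private volatile List<String> currentMembers;

    TopologyPiggyback(long viewId, List<String> members) {
        this.currentViewId = viewId;
        this.currentMembers = members;
    }

    // Called when building the response for a request that carried the
    // client's last-seen view ID.
    TopologyResponse respond(long requestViewId) {
        if (requestViewId != currentViewId) {
            // Client's view is stale: piggyback the current topology.
            return new TopologyResponse(currentViewId, currentMembers);
        }
        // Client is current: omit the topology from the response.
        return new TopologyResponse(currentViewId, null);
    }
}
```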