[jboss-as7-dev] Clustered invocation design

Fri Oct 14 05:05:44 EDT 2011

On Oct 13, 2011, at 5:57 PM, David M. Lloyd wrote:

> On 10/13/2011 10:45 AM, Paul Ferraro wrote:
>> On Tue, 2011-10-11 at 10:59 -0500, David M. Lloyd wrote:
>>> There are at least two basic paths we can follow for clustered
>>> invocation based on the current architecture.  The right choice is going
>>> to depend primarily upon the expected use cases, which I am not in a
>>> position to properly judge.
>>> 
>>> Option 1: Clustered Invocation Transport
>>> ----------------------------------------
>>> 
>>> In this option, we introduce a new "LAN" transport type for invocation
>>> on the cluster.  The transport would use direct TCP connections or UDP
>>> messages (or both, depending on request size) to convey the invocation.
>>>   The characteristics of this option are as follows:
>>> 
>>> - Security: reliance on physical network security only (no TLS or
>>> authentication)
>>> - Latency is very low, even to new nodes
>>> - Topology changes can be conveyed as separate asynchronous messages
>>> - Invocations from external networks would happen through a proxy node,
>>> with Remoting being bridged to the LAN, to perform security functions
>>> 
>>> Option 2: Load-balanced Remoting Connections
>>> --------------------------------------------
>>> 
>>> In this option, we rely on the client to establish one or more Remoting
>>> connection(s) to one or more of the nodes of the cluster.  Logic in the
>>> client will be used to determine what connection(s) to use for what
>>> clusters.  We have the option of automatically connecting as topology
>>> changes or requiring the user to set up the connections in advance.
>>> Note that automatic connection cannot work in the case of
>>> user-interactive authentication.  Characteristics:
>>> 
>>> - Security: full authentication and TLS supported
>>> - Latency is low once the connection is established, however there is
>>> some overhead involved in authentication and security negotiation
>>> - Topology changes should be asynchronous notifications
>>> - Each connection has to be separately authenticated
>>> - Automatically establishing connections is not presently supported, so
>>> we'd need a bit of infrastructure for that.  Deal with user-interactive
>>> authentication.  Deal with connection lifecycle management.  Deal with
>>> configuration.  This will be a point of fragility
>>> 
>>> Summary
>>> -------
>>> 
>>> For both options, we have to determine an appropriate load-balancing
>>> strategy.  The choice of direction will affect how our clustering and
>>> transaction interceptors function.  We also have to suss out the logic
>>> around dealing with conflicting or wrongly-ordered topology updates;
>>> hopefully our existing policies will continue to apply.
>> 
>> Do topology changes really need to be asynchronous notifications?  Can
>> we simply update cluster topology per invocation?
>> 
>> Maintaining an accurate cluster topology via asynchronous notifications
>> has the following benefits:
>> 1. Topology changes between invocations won't require failover in the
>> event of a load-balanced invocation (as opposed to a sticky one).
>> 2. Load balancing will potentially be more effective following topology
>> changes by leveraging new cluster members.
>> 3. Minimizes invocation payload (since we don't need to tack on cluster
>> topology to every invocation response).  We can optimize this somewhat
>> by sending a topology view ID with the invocation request, and only
>> including the topology in the response if the topology changed (i.e.
>> request view ID != current view ID).

As a side note, this is what Hot Rod does in Infinispan in order to detect stale views. 

We might optimise this further in the future as indicated in https://issues.jboss.org/browse/ISPN-1403, but this optimisation is particular to the Infinispan use case. 

Alternative optimisation avenues could be investigated for clustered invocations. 
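
To make the view ID check quoted above concrete, here is a minimal sketch of the server side. All class and method names (Topology, InvocationResponse, ClusterNode, respond) are made up for illustration; this is not the Hot Rod wire format nor any existing Remoting API, only the "attach the topology if request view ID != current view ID" rule.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical snapshot of the cluster membership, tagged with a view ID.
final class Topology {
    final long viewId;
    final List<String> members;

    Topology(long viewId, List<String> members) {
        this.viewId = viewId;
        this.members = Collections.unmodifiableList(new ArrayList<>(members));
    }
}

// Hypothetical reply: the topology is attached only when the caller's view was stale.
final class InvocationResponse {
    final Object result;
    final Optional<Topology> topology;

    InvocationResponse(Object result, Optional<Topology> topology) {
        this.result = result;
        this.topology = topology;
    }
}

// Hypothetical server-side node that knows the current view.
final class ClusterNode {
    private volatile Topology current;

    ClusterNode(Topology initial) {
        this.current = initial;
    }

    // Called by whatever membership layer is in use when the view changes.
    void onViewChange(Topology updated) {
        this.current = updated;
    }

    // Builds the reply for an invocation that carried the client's view ID.
    InvocationResponse respond(long requestViewId, Object result) {
        Topology snapshot = current; // read once so the ID and members stay consistent
        Optional<Topology> attached = (requestViewId != snapshot.viewId)
                ? Optional.of(snapshot)
                : Optional.<Topology>empty();
        return new InvocationResponse(result, attached);
    }
}
```

With this shape, a client whose view is current pays nothing beyond the view ID it sends in each request, and the full member list only travels after a topology change.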

>> 
>> The only disadvantage of which I can think is implementation complexity.
>> Topology update ordering is not an issue if we take the simpler
>> approach.  However, we can also make an assumption that topology changes
>> are not common - so it becomes a matter of whether or not to optimize
>> for frequent topology changes.
> 
> The problem is that (with R3 anyway) many threads may concurrently use a 
> single connection, and invocation replies can come in any order with 
> respect to the original invocation, so in effect even if we attach 
> topology information to the reply they're still essentially asynchronous 
> with the disadvantage that topology changes also bog down invocation 
> response times.
> 
> And if we did a non-persistent-connection-based transport, there's even 
> less of a guarantee because each reply could come in separate packets or 
> connections which can be arbitrarily reordered at the network level.
> 
> In other words, topology update ordering is always an issue, even more 
> so when multiple nodes come into play.
> 
> Using a view ID is fine as long as all nodes in the cluster always agree 
> on what view ID is the latest (which afaik is essentially impossible to 
> guarantee).

Well, if they're running in a cluster, JGroups provides a guarantee via GMS that a common viewId is maintained, at least until a cluster partition occurs.

When a partition occurs, several cluster islands can evolve their viewId independently.
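
A rough sketch of how a client could cope with both problems, replies arriving out of order and islands diverging after a partition. The names here (ViewStamp, ClientTopologyTracker, offer) are hypothetical, and I'm assuming each view is identified by the coordinator that installed it plus a numeric counter. The client keeps only the newest view it has seen, drops anything older, and flags the case where two different coordinators report the same counter.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical view identifier: the coordinator that installed the view plus a counter.
final class ViewStamp {
    final String coordinator;
    final long id;

    ViewStamp(String coordinator, long id) {
        this.coordinator = coordinator;
        this.id = id;
    }
}

// Hypothetical client-side holder, shared by all threads using one connection.
final class ClientTopologyTracker {
    enum Outcome { APPLIED, STALE, CONFLICT }

    private final AtomicReference<ViewStamp> latest = new AtomicReference<>();

    ViewStamp current() {
        return latest.get();
    }

    // Offers a view stamp taken from a reply; replies may arrive in any order.
    Outcome offer(ViewStamp incoming) {
        while (true) {
            ViewStamp seen = latest.get();
            if (seen != null) {
                if (incoming.id < seen.id) {
                    return Outcome.STALE; // an older reply overtaken by a newer one
                }
                if (incoming.id == seen.id) {
                    // Same counter from a different coordinator: the islands have
                    // probably evolved their views independently.
                    return incoming.coordinator.equals(seen.coordinator)
                            ? Outcome.STALE
                            : Outcome.CONFLICT;
                }
            }
            // A higher counter is accepted even from a different coordinator; a plain
            // coordinator change cannot be told apart from a partition at this level.
            if (latest.compareAndSet(seen, incoming)) {
                return Outcome.APPLIED;
            }
        }
    }
}
```

What the client should do on CONFLICT (stick with the island it is connected to, or refresh the topology from another node) is a policy decision that this sketch leaves open.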

> 
> But again all this ties back to the transport implementation.  R3 
> transport means persistent connections but we likely cannot 
> automatically bring up new connections to new nodes; custom transport 
> would mean no persistent connections but new nodes can be accessed 
> instantly.
> -- 
> - DML
> _______________________________________________
> jboss-as7-dev mailing list
> jboss-as7-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/jboss-as7-dev

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache