[wildfly-dev] Design Proposal: Server suspend/resume (AKA Graceful Shutdown)

Stuart Douglas stuart.w.douglas at gmail.com
Tue Jun 10 08:20:42 EDT 2014


They still have an entry point: the subsystem tracks outstanding timers and will not execute new ones while suspended.
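
As a rough illustration of that idea (not the actual EJB timer code), a timer service could consult a small gate before firing each timeout. The class and method names below (TimerGate, fire, isQuiescent) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical gate that a timer subsystem could consult before firing a timeout. */
final class TimerGate {
    private final AtomicBoolean suspended = new AtomicBoolean(false);
    private final AtomicInteger running = new AtomicInteger();

    void suspend() { suspended.set(true); }

    void resume() { suspended.set(false); }

    /** Runs the timer task unless the gate is suspended; returns false if it was skipped. */
    boolean fire(Runnable timerTask) {
        // Count first, so a concurrent suspend() never observes zero while we are about to run.
        running.incrementAndGet();
        try {
            if (suspended.get()) {
                return false; // suspended: do not start new timer executions
            }
            timerTask.run();
            return true;
        } finally {
            running.decrementAndGet();
        }
    }

    /** True once the gate is suspended and no timer execution is still in flight. */
    boolean isQuiescent() {
        return suspended.get() && running.get() == 0;
    }
}
```

Whether a skipped timeout is dropped or re-queued for after resume() is a policy decision this sketch leaves open.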

Stuart

> On 9 Jun 2014, at 23:46, Jaikiran Pai <jai.forums2013 at gmail.com> wrote:
> 
> This is more of a subsystem-specific question: how are internal operations (i.e. those that don't exactly have an entry point into the server) handled when the server is suspended or suspending? One example is an EJB application that has scheduled timer tasks associated with it. Are such timer tasks supposed to continue to run while the server is suspended or suspending, or is the subsystem expected to shut those down too?
> 
> -Jaikiran
>> On Tuesday 10 June 2014 04:08 AM, Stuart Douglas wrote:
>> Server suspend and resume is a feature that allows a running server to
>> gracefully finish all running requests. The most common use case for
>> this is graceful shutdown, where you would like a server to complete all
>> running requests, reject any new ones, and then shut down. However, there
>> are also plenty of other valid use cases (e.g. suspend the server,
>> modify a data source or some other config, then resume).
>> 
>> User View:
>> 
>> From the user's point of view, two new operations will be added to the server:
>> 
>> suspend(timeout)
>> resume()
>> 
>> A runtime-only attribute suspend-state (is this a good name?) will also
>> be added, which can take one of three possible values: RUNNING,
>> SUSPENDING, or SUSPENDED.
>> 
>> A timeout parameter will also be added to the shutdown operation. If
>> this is present then the server will first be suspended, and the server
>> will not shut down until either the suspend is successful or the timeout
>> occurs. If no timeout parameter is passed to the operation then a normal
>> non-graceful shutdown will take place.
>> 
>> In domain mode these operations will be added to both individual servers
>> and complete server groups.
>> 
>> Implementation Details:
>> 
>> Suspend/resume operates on entry points to the server. Any request that
>> is currently running must not be affected by the suspend state; however,
>> any new request should be rejected. In general, subsystems will track the
>> number of outstanding requests, and when this hits zero they are
>> considered suspended.
>> 
>> We will introduce the notion of a global SuspendController that manages
>> the server's suspend state. All subsystems that wish to do a graceful
>> shutdown register callback handlers with this controller.
>> 
>> When the suspend() operation is invoked the controller will invoke all
>> these callbacks, letting the subsystem know that the server is suspending,
>> and providing the subsystem with a SuspendContext object that the
>> subsystem can then use to notify the controller that the suspend is
>> complete.
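
To make the controller/callback handshake easier to discuss, here is a small sketch under stated assumptions. SuspendController and SuspendContext are names from the proposal, but the method names, the SuspendCallback interface and the nested State enum are guesses for illustration only. Timeout handling is left out, and each subsystem is assumed to call suspendComplete() exactly once per suspend.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/** Handed to each subsystem so it can report back (name from the proposal, shape assumed). */
interface SuspendContext {
    void suspendComplete();
}

/** Hypothetical callback that a subsystem registers with the controller. */
interface SuspendCallback {
    void suspendStarted(SuspendContext context);
    void resumed();
}

final class SuspendController {
    enum State { RUNNING, SUSPENDING, SUSPENDED }

    private final List<SuspendCallback> callbacks = new CopyOnWriteArrayList<>();
    private final AtomicInteger pending = new AtomicInteger();
    private volatile State state = State.RUNNING;

    void register(SuspendCallback callback) { callbacks.add(callback); }

    State getState() { return state; }

    void suspend() {
        SuspendCallback[] snapshot = callbacks.toArray(new SuspendCallback[0]);
        state = State.SUSPENDING;
        // The extra +1 is a guard: a callback that completes synchronously
        // cannot finish the whole suspend before every callback has been invoked.
        pending.set(snapshot.length + 1);
        SuspendContext context = this::oneDone;
        for (SuspendCallback callback : snapshot) {
            callback.suspendStarted(context);
        }
        oneDone(); // release the guard
    }

    void resume() {
        state = State.RUNNING;
        for (SuspendCallback callback : callbacks) {
            callback.resumed();
        }
    }

    private void oneDone() {
        // The last completion flips the state, unless resume() was called in the meantime.
        if (pending.decrementAndGet() == 0 && state == State.SUSPENDING) {
            state = State.SUSPENDED;
        }
    }
}
```

A subsystem with nothing outstanding can call suspendComplete() directly inside suspendStarted(); one with in-flight work keeps the context and calls it later.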
>> 
>> What the subsystem does when it receives a suspend command, and when it
>> considers itself suspended, will vary, but in the common case it will
>> immediately start rejecting external requests (e.g. Undertow will start
>> responding with a 503 to all new requests). The subsystem will also
>> track the number of outstanding requests, and when this hits zero
>> the subsystem will notify the controller that it has successfully
>> suspended.
>>
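
The common case described above (reject new requests, count outstanding ones, notify at zero) could look roughly like this. It is a sketch, not Undertow's API: EntryPointTracker and its methods are hypothetical, and the listener stands in for whatever the subsystem uses to tell the controller it has suspended.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-subsystem tracker placed at an entry point. */
final class EntryPointTracker {
    private final AtomicInteger active = new AtomicInteger();
    private final AtomicBoolean suspending = new AtomicBoolean();
    private final AtomicBoolean notified = new AtomicBoolean();
    private volatile Runnable onSuspended = () -> { };

    /** Returns true if the request may proceed; false means reject it (e.g. with a 503). */
    boolean begin() {
        // Increment before checking the flag so suspend() cannot miss this request.
        active.incrementAndGet();
        if (suspending.get()) {
            end(); // undo the count; the request never starts
            return false;
        }
        return true;
    }

    /** Must be called exactly once for every request that begin() accepted. */
    void end() {
        if (active.decrementAndGet() == 0 && suspending.get()) {
            notifyOnce();
        }
    }

    /** Starts rejecting new requests; the listener runs once no request is outstanding. */
    void suspend(Runnable listener) {
        onSuspended = listener;
        notified.set(false);
        suspending.set(true);
        if (active.get() == 0) {
            notifyOnce();
        }
    }

    void resume() { suspending.set(false); }

    private void notifyOnce() {
        if (notified.compareAndSet(false, true)) {
            onSuspended.run();
        }
    }
}
```

An entry point would wrap each request in begin()/end(), typically with end() in a finally block so that failed requests are still counted down.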
>> Some subsystems will obviously want to take other actions on suspend, e.g.
>> clustering will likely want to fail over, and mod_cluster will notify the
>> load balancer that the node is no longer available. In some cases we
>> may want to make this configurable to an extent (e.g. Undertow could be
>> configured to allow requests with an existing session, and not consider
>> itself suspended until all sessions have either timed out or been
>> invalidated, although this will obviously take a while).
>> 
>> If anyone has any feedback, let me know. In terms of implementation, my
>> basic plan is to get the core functionality and the Undertow
>> implementation into WildFly, and then work with subsystem authors to
>> implement subsystem-specific functionality once the core is in place.
>> 
>> Stuart
>> 
>> _______________________________________________
>> wildfly-dev mailing list
>> wildfly-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/wildfly-dev