[wildfly-dev] Design Proposal: Server suspend/resume (AKA Graceful Shutdown)

Jaikiran Pai jai.forums2013 at gmail.com
Tue Jun 10 01:02:02 EDT 2014


One other question - when a server is put into suspend mode, is it going 
to trigger undeployment of any of the current deployments? And would that 
be considered the expected behaviour? More specifically, when the admin 
triggers a suspend, are the subsystems allowed to trigger operations 
which might stop the services that back the current deployments, or are 
they expected to keep those services in a started/UP state?

-Jaikiran
On Tuesday 10 June 2014 10:16 AM, Jaikiran Pai wrote:
> This is more of a subsystem-specific question - how are internal 
> operations (i.e. something that doesn't exactly have an entry point 
> /into/ the server) handled when the server is in a suspended state or 
> when it is suspending? One such example is an EJB application which 
> might have scheduled timer tasks associated with it. Are such timer 
> tasks supposed to continue to run even when the server is suspended or 
> when it is suspending? Or is the subsystem expected to shut those down 
> too?
>
> -Jaikiran
> On Tuesday 10 June 2014 04:08 AM, Stuart Douglas wrote:
>> Server suspend and resume is a feature that allows a running server to
>> gracefully finish all running requests. The most common use case for
>> this is graceful shutdown, where you would like a server to complete all
>> running requests, reject any new ones, and then shut down. However, there
>> are also plenty of other valid use cases (e.g. suspend the server,
>> modify a data source or some other config, then resume).
>>
>> User View:
>>
>> From the user's point of view two new operations will be added to the server:
>>
>> suspend(timeout)
>> resume()
>>
>> A runtime-only attribute suspend-state (is this a good name?) will also
>> be added, that can take one of three possible values: RUNNING,
>> SUSPENDING or SUSPENDED.
>>
>> A timeout parameter will also be added to the shutdown operation. If
>> this is present then the server will first be suspended, and the server
>> will not shut down until either the suspend is successful or the timeout
>> occurs. If no timeout parameter is passed to the operation then a normal
>> non-graceful shutdown will take place.
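
A minimal sketch of the shutdown rule described above, written in Java. Everything here is an assumption made for illustration: the class name GracefulShutdown, the Outcome enum, and the idea of passing the suspend step in as a LongPredicate are hypothetical and not taken from the proposal or from the WildFly code base.

```java
import java.util.OptionalLong;
import java.util.function.LongPredicate;

final class GracefulShutdown {
    // Which path the shutdown took (hypothetical, for illustration only).
    enum Outcome { IMMEDIATE, AFTER_SUSPEND, AFTER_TIMEOUT }

    // timeoutMillis: empty means no timeout parameter was passed.
    // suspend: blocks for at most the given number of milliseconds and
    //          returns true if the server finished suspending in time.
    // stopServer: the ordinary, non-graceful stop.
    static Outcome shutdown(OptionalLong timeoutMillis,
                            LongPredicate suspend,
                            Runnable stopServer) {
        Outcome outcome;
        if (!timeoutMillis.isPresent()) {
            // No timeout: normal non-graceful shutdown, no suspend attempt.
            outcome = Outcome.IMMEDIATE;
        } else {
            // Timeout given: suspend first, then stop either way.
            boolean suspended = suspend.test(timeoutMillis.getAsLong());
            outcome = suspended ? Outcome.AFTER_SUSPEND : Outcome.AFTER_TIMEOUT;
        }
        stopServer.run();
        return outcome;
    }
}
```

In every branch the server is stopped exactly once; the timeout only decides whether a suspend is attempted first.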
>>
>> In domain mode these operations will be added to both individual servers
>> and complete server groups.
>>
>> Implementation Details
>>
>> Suspend/resume operates on entry points to the server. Any request that
>> is currently running must not be affected by the suspend state, but
>> any new request should be rejected. In general, subsystems will track the
>> number of outstanding requests, and when this hits zero they are
>> considered suspended.
>>
>> We will introduce the notion of a global SuspendController that manages
>> the server's suspend state. All subsystems that wish to do a graceful
>> shutdown register callback handlers with this controller.
>>
>> When the suspend() operation is invoked the controller will invoke all
>> these callbacks, letting each subsystem know that the server is
>> suspending, and providing the subsystem with a SuspendContext object that
>> the subsystem can then use to notify the controller that the suspend is
>> complete.
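
A minimal sketch of the controller/callback handshake described above, in Java. Only the names SuspendController, SuspendContext and the three state values come from the proposal. The SuspendCallback interface, the method names (register, suspendComplete, getState) and the millisecond timeout are assumptions made for illustration, not the actual WildFly API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// The three values proposed for the suspend-state attribute.
enum SuspendState { RUNNING, SUSPENDING, SUSPENDED }

// Handed to a subsystem so it can report that it has finished suspending.
// The method name is an assumption.
interface SuspendContext {
    void suspendComplete();
}

// Hypothetical shape of the callback a subsystem registers.
interface SuspendCallback {
    void suspend(SuspendContext context);
    void resume();
}

final class SuspendController {
    private final List<SuspendCallback> callbacks = new CopyOnWriteArrayList<>();
    private volatile SuspendState state = SuspendState.RUNNING;

    void register(SuspendCallback callback) { callbacks.add(callback); }

    SuspendState getState() { return state; }

    // Returns true if every callback reported completion within the timeout.
    // On timeout the state stays SUSPENDING (an assumption; the proposal
    // does not say what a timed-out suspend should do).
    boolean suspend(long timeoutMillis) {
        state = SuspendState.SUSPENDING;
        List<SuspendCallback> snapshot = new ArrayList<>(callbacks);
        CountDownLatch pending = new CountDownLatch(snapshot.size());
        for (SuspendCallback callback : snapshot) {
            // One-shot guard so a subsystem reporting twice is counted once.
            AtomicBoolean reported = new AtomicBoolean();
            callback.suspend(() -> {
                if (reported.compareAndSet(false, true)) pending.countDown();
            });
        }
        try {
            if (pending.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                state = SuspendState.SUSPENDED;
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    void resume() {
        for (SuspendCallback callback : callbacks) callback.resume();
        state = SuspendState.RUNNING;
    }
}
```

A subsystem with nothing in flight can call suspendComplete() straight from its callback; one that still has work outstanding keeps the context and calls it later from another thread. The sketch assumes a single caller drives suspend() and resume().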
>>
>> What the subsystem does when it receives a suspend command, and when it
>> considers itself suspended, will vary, but in the common case it will
>> immediately start rejecting external requests (e.g. Undertow will start
>> responding with a 503 to all new requests). The subsystem will also
>> track the number of outstanding requests, and when this hits zero
>> the subsystem will notify the controller that it has successfully
>> suspended.
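
A minimal sketch of the common case described above (reject new requests, count the outstanding ones, notify when the count hits zero), in Java. The class RequestGate and its method names are hypothetical; this is not how Undertow or any WildFly subsystem implements it. The Runnable passed to suspend() stands in for whatever the subsystem uses to tell the controller it is done.

```java
final class RequestGate {
    private int active;                // requests currently in flight
    private boolean accepting = true;  // false once a suspend has started
    private Runnable drainedListener;  // non-null only while a suspend waits

    // Called at the entry point. A false return means the request must be
    // rejected (for an HTTP entry point, e.g. with a 503).
    synchronized boolean tryBegin() {
        if (!accepting) return false;
        active++;
        return true;
    }

    // Called when an accepted request finishes, successfully or not.
    void end() {
        Runnable toNotify = null;
        synchronized (this) {
            active--;
            if (active == 0 && drainedListener != null) {
                toNotify = drainedListener;
                drainedListener = null;
            }
        }
        // Run the listener outside the lock to avoid holding it in a callback.
        if (toNotify != null) toNotify.run();
    }

    // Stop accepting requests; run onDrained once nothing is in flight.
    void suspend(Runnable onDrained) {
        boolean alreadyIdle;
        synchronized (this) {
            accepting = false;
            alreadyIdle = (active == 0);
            if (!alreadyIdle) drainedListener = onDrained;
        }
        if (alreadyIdle) onDrained.run();
    }

    synchronized void resume() {
        accepting = true;
        drainedListener = null;
    }
}
```

Requests that were accepted before suspend() keep running untouched; only tryBegin() changes behaviour, which matches the rule that running requests must not be affected by the suspend state.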
>>
>> Some subsystems will obviously want to do other actions on suspend, e.g.
>> clustering will likely want to fail over, mod_cluster will notify the
>> load balancer that the node is no longer available, etc. In some cases we
>> may want to make this configurable to an extent (e.g. Undertow could be
>> configured to allow requests with an existing session, and not consider
>> itself suspended until all sessions have either timed out or been
>> invalidated, although this will obviously take a while).
>>
>> If anyone has any feedback let me know. In terms of implementation my
>> basic plan is to get the core functionality and the Undertow
>> implementation into WildFly, and then work with subsystem authors to
>> implement subsystem-specific functionality once the core is in place.
>>
>> Stuart
>>
>> _______________________________________________
>> wildfly-dev mailing list
>> wildfly-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/wildfly-dev
>
