[wildfly-dev] Design Proposal: Server suspend/resume (AKA Graceful Shutdown)

Stuart Douglas stuart.w.douglas at gmail.com
Tue Jun 10 11:17:52 EDT 2014


They are actually orthogonal: a server can be in both RESTART_REQUIRED 
and any one of the suspend states.

RESTART_REQUIRED is very much tied to services and the management model, 
while suspend/resume is a runtime-only thing that should not touch the 
state of services.


Stuart

Dimitris Andreadis wrote:
> Why not extend the states of the existing 'server-state' attribute to:
>
> (STARTING, RUNNING, SUSPENDING, SUSPENDED, RESTART_REQUIRED RUNNING)
>
> http://wildscribe.github.io/Wildfly/8.0.0.Final/index.html
>
> On 10/06/2014 04:40, Stuart Douglas wrote:
>>
>> Scott Marlow wrote:
>>> On 06/09/2014 06:38 PM, Stuart Douglas wrote:
>>>> Server suspend and resume is a feature that allows a running server to
>>>> gracefully finish off all running requests. The most common use case for
>>>> this is graceful shutdown, where you would like a server to complete all
>>>> running requests, reject any new ones, and then shut down. However, there
>>>> are also plenty of other valid use cases (e.g. suspend the server,
>>>> modify a data source or some other config, then resume).
>>>>
>>>> User View:
>>>>
>>>> From the user's point of view, two new operations will be added to the server:
>>>>
>>>> suspend(timeout)
>>>> resume()
>>>>
>>>> A runtime-only attribute suspend-state (is this a good name?) will also
>>>> be added, which can take one of three possible values: RUNNING,
>>>> SUSPENDING, or SUSPENDED.
>>> The SuspendController "state" might be a shorter attribute name and just
>>> as meaningful.
>> This will be in the global server namespace (i.e. from the CLI,
>> :read-attribute(name="suspend-state")).
>>
>> I think the name 'state' is just too generic: which kind of state are we
>> talking about?
>>
>>> When are we in the RUNNING state?  Is that simply the pre-state for
>>> SUSPENDING?
>> 99.99% of the time. Basically servers are always running unless they
>> have been explicitly suspended, and then they go from suspending to
>> suspended. Note that if resume is called at any time the server goes to
>> RUNNING again immediately, as when subsystems are notified they should
>> be able to begin accepting requests again straight away.
>>
>> We also have admin only mode, which is a kinda similar concept, so we
>> need to make sure we document the differences.
>>
>>>> A timeout attribute will also be added to the shutdown operation. If
>>>> this is present then the server will first be suspended, and the server
>>>> will not shut down until either the suspend is successful or the timeout
>>>> occurs. If no timeout parameter is passed to the operation then a normal
>>>> non-graceful shutdown will take place.
>>> Will non-graceful shutdown wait for non-daemon threads or terminate
>>> immediately (call System.exit())?
>> It will execute the same way it does today (all services will shut down
>> and then the server will exit).
>>
>> Stuart
>>
>>>> In domain mode these operations will be added to both individual servers
>>>> and complete server groups.
>>>>
>>>> Implementation Details
>>>>
>>>> Suspend/resume operates on entry points to the server. Any request that
>>>> is currently running must not be affected by the suspend state, however
>>>> any new request should be rejected. In general subsystems will track the
>>>> number of outstanding requests, and when this hits zero they are
>>>> considered suspended.
>>>>
>>>> We will introduce the notion of a global SuspendController, that manages
>>>> the server's suspend state. All subsystems that wish to do a graceful
>>>> shutdown register callback handlers with this controller.
>>>>
>>>> When the suspend() operation is invoked the controller will invoke all
>>>> these callbacks, letting the subsystem know that the server is suspending,
>>>> and providing the subsystem with a SuspendContext object that the
>>>> subsystem can then use to notify the controller that the suspend is
>>>> complete.
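
The controller/callback interaction described above can be sketched roughly as follows. This is only an illustration under assumptions: SuspendController and SuspendContext are named in the proposal, but SuspendCallback, every method signature, and the "generation" counter used to ignore stale completions are hypothetical. The timeout handling is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** The three values proposed for the suspend-state attribute. */
enum SuspendState { RUNNING, SUSPENDING, SUSPENDED }

/** Hypothetical handle a subsystem uses to report that its suspend is complete. */
interface SuspendContext {
    void complete();
}

/** Hypothetical callback that a subsystem registers with the controller. */
interface SuspendCallback {
    void suspended(SuspendContext context);
    void resume();
}

/** Sketch only: tracks how many registered callbacks still have to report completion. */
final class SuspendController {
    private final List<SuspendCallback> callbacks = new ArrayList<>();
    private SuspendState state = SuspendState.RUNNING;
    private int outstanding; // callbacks that have not yet reported completion
    private int generation;  // bumped on every suspend/resume so stale contexts are ignored

    synchronized void register(SuspendCallback callback) { callbacks.add(callback); }

    synchronized SuspendState getState() { return state; }

    void suspend() {
        final List<SuspendCallback> snapshot;
        final int gen;
        synchronized (this) {
            if (state != SuspendState.RUNNING) return;
            snapshot = new ArrayList<>(callbacks);
            gen = ++generation;
            outstanding = snapshot.size();
            state = outstanding == 0 ? SuspendState.SUSPENDED : SuspendState.SUSPENDING;
        }
        // Callbacks run outside the lock so a subsystem may call complete() synchronously.
        for (SuspendCallback callback : snapshot) {
            AtomicBoolean done = new AtomicBoolean();
            callback.suspended(() -> {
                // Each context counts at most once, even if complete() is called twice.
                if (done.compareAndSet(false, true)) completed(gen);
            });
        }
    }

    void resume() {
        final List<SuspendCallback> snapshot;
        synchronized (this) {
            if (state == SuspendState.RUNNING) return;
            state = SuspendState.RUNNING; // back to RUNNING immediately
            generation++;
            snapshot = new ArrayList<>(callbacks);
        }
        for (SuspendCallback callback : snapshot) callback.resume();
    }

    private synchronized void completed(int gen) {
        if (gen != generation || state != SuspendState.SUSPENDING) return;
        if (--outstanding == 0) state = SuspendState.SUSPENDED;
    }
}
```

In this sketch the server only reaches SUSPENDED once every registered callback has called complete() on its own context, and a resume() in the middle of a suspend simply discards the pending completions.
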
>>>>
>>>> What the subsystem does when it receives a suspend command, and when it
>>>> considers itself suspended will vary, but in the common case it will
>>>> immediately start rejecting external requests (e.g. Undertow will start
>>>> responding with a 503 to all new requests). The subsystem will also
>>>> track the number of outstanding requests, and when this hits zero then
>>>> the subsystem will notify the controller that it has successfully
>>>> suspended.
>>>> Some subsystems will obviously want to do other actions on suspend, e.g.
>>>> clustering will likely want to fail over, mod_cluster will notify the
>>>> load balancer that the node is no longer available etc. In some cases we
>>>> may want to make this configurable to an extent (e.g. Undertow could be
>>>> configured to allow requests with an existing session, and not consider
>>>> itself suspended until all sessions have either timed out or been
>>>> invalidated, although this will obviously take a while).
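
As a rough illustration of the common case above (reject new requests, count the outstanding ones, report once the count hits zero), an entry point could be guarded along these lines. RequestGate and all of its method names are hypothetical and are not part of the proposal or of Undertow; the drained callback stands in for whatever the subsystem uses to notify the controller.

```java
/** Hypothetical per-entry-point gate: rejects new requests while suspended and counts active ones. */
final class RequestGate {
    private int active;          // requests currently in flight
    private boolean suspended;   // true between suspend() and resume()
    private Runnable onDrained;  // non-null while a suspend is waiting for requests to finish

    /** Returns false if the request must be rejected (e.g. with an HTTP 503). */
    synchronized boolean tryBegin() {
        if (suspended) return false;
        active++;
        return true;
    }

    /** Must be called exactly once for every request that tryBegin() accepted. */
    void end() {
        Runnable notify = null;
        synchronized (this) {
            active--;
            if (suspended && active == 0) {
                notify = onDrained;
                onDrained = null;
            }
        }
        // Notify outside the lock to avoid calling foreign code while holding it.
        if (notify != null) notify.run();
    }

    /** Starts rejecting new requests; runs drained once no request is outstanding. */
    void suspend(Runnable drained) {
        boolean alreadyIdle;
        synchronized (this) {
            suspended = true;
            alreadyIdle = (active == 0);
            onDrained = alreadyIdle ? null : drained;
        }
        if (alreadyIdle) drained.run();
    }

    /** Accepts requests again straight away and forgets any pending notification. */
    synchronized void resume() {
        suspended = false;
        onDrained = null;
    }
}
```

Requests that were already running when suspend() was called are not touched; they finish normally and the last one to call end() triggers the notification.
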
>>>>
>>>> If anyone has any feedback let me know. In terms of implementation my
>>>> basic plan is to get the core functionality and the Undertow
>>>> implementation into WildFly, and then work with subsystem authors to
>>>> implement subsystem specific functionality once the core is in place.
>>>>
>>>> Stuart
>>>>
>>>> _______________________________________________
>>>> wildfly-dev mailing list
>>>> wildfly-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/wildfly-dev

