[wildfly-dev] Design Proposal: Server suspend/resume (AKA Graceful Shutdown)

Dimitris Andreadis dandread at redhat.com
Tue Jun 10 11:32:46 EDT 2014


Isn't RESTART_REQUIRED also orthogonal to RUNNING?

On 10/06/2014 17:17, Stuart Douglas wrote:
> They are actually orthogonal: a server can be in both RESTART_REQUIRED and any one of the
> suspend states.
>
> RESTART_REQUIRED is very much tied to services and the management model, while
> suspend/resume is a runtime-only thing that should not touch the state of services.
>
>
> Stuart
>
> Dimitris Andreadis wrote:
>> Why not extend the states of the existing 'server-state' attribute to:
>>
>> (STARTING, RUNNING, SUSPENDING, SUSPENDED, RESTART_REQUIRED RUNNING)
>>
>> http://wildscribe.github.io/Wildfly/8.0.0.Final/index.html
>>
>> On 10/06/2014 04:40, Stuart Douglas wrote:
>>>
>>> Scott Marlow wrote:
>>>> On 06/09/2014 06:38 PM, Stuart Douglas wrote:
>>>>> Server suspend and resume is a feature that allows a running server to
>>>>> gracefully finish off all running requests. The most common use case for
>>>>> this is graceful shutdown, where you would like a server to complete all
>>>>> running requests, reject any new ones, and then shut down; however, there
>>>>> are also plenty of other valid use cases (e.g. suspend the server,
>>>>> modify a data source or some other config, then resume).
>>>>>
>>>>> User View:
>>>>>
>>>>> From the user's point of view, two new operations will be added to the server:
>>>>>
>>>>> suspend(timeout)
>>>>> resume()
>>>>>
>>>>> A runtime-only attribute suspend-state (is this a good name?) will also
>>>>> be added, which can take one of three possible values: RUNNING,
>>>>> SUSPENDING, or SUSPENDED.
>>>> The SuspendController "state" might be a shorter attribute name and just
>>>> as meaningful.
>>> This will be in the global server namespace (i.e. from the CLI,
>>> :read-attribute(name="suspend-state")).
>>>
>>> I think the name 'state' is just too generic: which kind of state are we
>>> talking about?
>>>
>>>> When are we in the RUNNING state?  Is that simply the pre-state for
>>>> SUSPENDING?
>>> 99.99% of the time. Basically servers are always running unless they
>>> have been explicitly suspended, and then they go from suspending to
>>> suspended. Note that if resume is called at any time the server goes to
>>> RUNNING again immediately, as when subsystems are notified they should
>>> be able to begin accepting requests again straight away.
>>>
>>> We also have admin only mode, which is a kinda similar concept, so we
>>> need to make sure we document the differences.
>>>
>>>>> A timeout parameter will also be added to the shutdown operation. If
>>>>> this is present then the server will first be suspended, and the server
>>>>> will not shut down until either the suspend is successful or the timeout
>>>>> occurs. If no timeout parameter is passed to the operation then a normal
>>>>> non-graceful shutdown will take place.
>>>> Will non-graceful shutdown wait for non-daemon threads or terminate
>>>> immediately (call System.exit())?
>>> It will execute the same way it does today (all services will shut down
>>> and then the server will exit).
>>>
>>> Stuart
>>>
>>>>> In domain mode these operations will be added to both individual servers
>>>>> and complete server groups.
>>>>>
>>>>> Implementation Details
>>>>>
>>>>> Suspend/resume operates on entry points to the server. Any request that
>>>>> is currently running must not be affected by the suspend state; however,
>>>>> any new request should be rejected. In general subsystems will track the
>>>>> number of outstanding requests, and when this hits zero they are
>>>>> considered suspended.
>>>>>
>>>>> We will introduce the notion of a global SuspendController, that manages
>>>>> the server's suspend state. All subsystems that wish to do a graceful
>>>>> shutdown register callback handlers with this controller.
>>>>>
>>>>> When the suspend() operation is invoked the controller will invoke all
>>>>> these callbacks, letting the subsystem know that the server is suspending,
>>>>> and providing the subsystem with a SuspendContext object that the
>>>>> subsystem can then use to notify the controller that the suspend is
>>>>> complete.
>>>>>
>>>>> What the subsystem does when it receives a suspend command, and when it
>>>>> considers itself suspended, will vary, but in the common case it will
>>>>> immediately start rejecting external requests (e.g. Undertow will start
>>>>> responding with a 503 to all new requests). The subsystem will also
>>>>> track the number of outstanding requests, and when this hits zero
>>>>> the subsystem will notify the controller that it has successfully
>>>>> suspended.
>>>>> Some subsystems will obviously want to do other actions on suspend, e.g.
>>>>> clustering will likely want to fail over, mod_cluster will notify the
>>>>> load balancer that the node is no longer available etc. In some cases we
>>>>> may want to make this configurable to an extent (e.g. Undertow could be
>>>>> configured to allow requests with an existing session, and not consider
>>>>> itself suspended until all sessions have either timed out or been
>>>>> invalidated, although this will obviously take a while).
>>>>>
>>>>> If anyone has any feedback let me know. In terms of implementation my
>>>>> basic plan is to get the core functionality and the Undertow
>>>>> implementation into WildFly, and then work with subsystem authors to
>>>>> implement subsystem specific functionality once the core is in place.
>>>>>
>>>>> Stuart
>>>>>
>>>>> _______________________________________________
>>>>> wildfly-dev mailing list
>>>>> wildfly-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/wildfly-dev
>>>>>
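
For readers following the thread, the controller/callback design described in the
quoted proposal can be sketched roughly as below. This is only a minimal
illustration under stated assumptions, not the actual WildFly implementation:
SuspendController and SuspendContext are names from the proposal, but the
SuspendListener interface, the EntryPoint class, the complete(), beginRequest() and
endRequest() methods, and the generation counter are all hypothetical. The
suspend(timeout) parameter is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class SuspendSketch {

    enum State { RUNNING, SUSPENDING, SUSPENDED }

    /** Handed to a subsystem so it can report that it has drained (hypothetical API). */
    interface SuspendContext { void complete(); }

    /** Callback a subsystem registers with the controller (hypothetical API). */
    interface SuspendListener {
        void suspend(SuspendContext context);
        void resume();
    }

    static final class SuspendController {
        private final List<SuspendListener> listeners = new ArrayList<>();
        private State state = State.RUNNING;
        private int pending;     // listeners that have not reported completion yet
        private int generation;  // lets us ignore completions from an earlier suspend

        synchronized void register(SuspendListener l) { listeners.add(l); }

        synchronized State getState() { return state; }

        void suspend() {
            final int gen;
            final List<SuspendListener> snapshot;
            synchronized (this) {
                if (state != State.RUNNING) return;
                gen = ++generation;
                snapshot = new ArrayList<>(listeners);
                pending = snapshot.size();
                state = pending == 0 ? State.SUSPENDED : State.SUSPENDING;
            }
            // Notify outside the lock so a listener may complete synchronously.
            for (SuspendListener l : snapshot) {
                AtomicBoolean done = new AtomicBoolean();
                l.suspend(() -> { if (done.compareAndSet(false, true)) completed(gen); });
            }
        }

        void resume() {
            final List<SuspendListener> snapshot;
            synchronized (this) {
                generation++;            // stale contexts become no-ops
                state = State.RUNNING;   // resume takes effect immediately
                snapshot = new ArrayList<>(listeners);
            }
            for (SuspendListener l : snapshot) l.resume();
        }

        private synchronized void completed(int gen) {
            if (gen != generation || state != State.SUSPENDING) return;
            if (--pending == 0) state = State.SUSPENDED;
        }
    }

    /** A subsystem entry point that counts outstanding requests (hypothetical). */
    static final class EntryPoint implements SuspendListener {
        private int active;
        private SuspendContext context;  // non-null while suspending or suspended

        synchronized boolean beginRequest() {
            if (context != null) return false;  // caller would reject, e.g. with a 503
            active++;
            return true;
        }

        synchronized void endRequest() {
            if (--active == 0 && context != null) context.complete();
        }

        @Override public synchronized void suspend(SuspendContext ctx) {
            context = ctx;
            if (active == 0) ctx.complete();    // nothing in flight: suspended at once
        }

        @Override public synchronized void resume() { context = null; }
    }

    public static void main(String[] args) {
        SuspendController controller = new SuspendController();
        EntryPoint web = new EntryPoint();
        controller.register(web);

        web.beginRequest();                           // one request in flight
        controller.suspend();
        System.out.println(controller.getState());    // SUSPENDING
        System.out.println(web.beginRequest());       // false
        web.endRequest();                             // last request drains
        System.out.println(controller.getState());    // SUSPENDED
        controller.resume();
        System.out.println(controller.getState());    // RUNNING
    }
}
```

In this sketch the in-flight request is allowed to finish while new requests are
refused, and the controller only reports SUSPENDED once every registered listener
has called complete(). A real implementation would also need the timeout handling
and the domain-mode operations discussed in the thread.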

