They are actually orthogonal: a server can be in both RESTART_REQUIRED
and any one of the suspend states.
RESTART_REQUIRED is very much tied to services and the management model,
while suspend/resume is a runtime-only thing that should not touch the
state of services.
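
To make the orthogonality concrete, here is a minimal sketch that keeps the two states as independent fields. The class and method names (ServerStatus, markRestartRequired and so on) are made up for illustration and are not existing WildFly API; the enum constants are the ones mentioned in this thread.

```java
/** Hypothetical model, not WildFly API: two independent state dimensions. */
final class ServerStatus {
    /** Management-model state, driven by services and configuration changes. */
    enum ProcessState { STARTING, RUNNING, RESTART_REQUIRED }

    /** Runtime-only suspend state, driven by suspend/resume. */
    enum SuspendState { RUNNING, SUSPENDING, SUSPENDED }

    private ProcessState processState = ProcessState.RUNNING;
    private SuspendState suspendState = SuspendState.RUNNING;

    ProcessState processState() { return processState; }
    SuspendState suspendState() { return suspendState; }

    /** A management change that needs a restart; the suspend state is untouched. */
    void markRestartRequired() { processState = ProcessState.RESTART_REQUIRED; }

    /** Suspend and resume only ever move the suspend dimension. */
    void suspendStarted() { suspendState = SuspendState.SUSPENDING; }
    void suspendCompleted() { suspendState = SuspendState.SUSPENDED; }
    void resume() { suspendState = SuspendState.RUNNING; }

    public static void main(String[] args) {
        ServerStatus status = new ServerStatus();
        status.markRestartRequired();
        status.suspendStarted();
        // Both conditions hold at once, with no combined value such as
        // RESTART_REQUIRED_SUSPENDING needed in a single enum.
        System.out.println(status.processState() + " / " + status.suspendState());
    }
}
```

Folding both into 'server-state' would need one value per combination, which is the main reason I would rather keep them as separate attributes.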
Dimitris Andreadis wrote:
Why not extend the states of the existing 'server-state'
(STARTING, RUNNING, SUSPENDING, SUSPENDED, RESTART_REQUIRED RUNNING)?
On 10/06/2014 04:40, Stuart Douglas wrote:
> Scott Marlow wrote:
>> On 06/09/2014 06:38 PM, Stuart Douglas wrote:
>>> Server suspend and resume is a feature that allows a running server to
>>> gracefully finish off all running requests. The most common use case for
>>> this is graceful shutdown, where you would like a server to complete all
>>> running requests, reject any new ones, and then shut down. However, there
>>> are also plenty of other valid use cases (e.g. suspend the server,
>>> modify a data source or some other config, then resume).
>>> User View:
>>> From the user's point of view, two new operations (suspend and resume)
>>> will be added to the server.
>>> A runtime-only attribute suspend-state (is this a good name?) will also
>>> be added, which can take one of three possible values: RUNNING,
>>> SUSPENDING, SUSPENDED.
>> The SuspendController "state" might be a shorter attribute name and
>> just as meaningful.
> This will be in the global server namespace (i.e. from the CLI
> I think the name 'state' is just too generic; which kind of state are we
> talking about?
>> When are we in the RUNNING state? Is that simply the pre-state for
> 99.99% of the time. Basically servers are always running unless they
> have been explicitly suspended, and then they go from suspending to
> suspended. Note that if resume is called at any time the server goes to
> RUNNING again immediately, as when subsystems are notified they should
> be able to begin accepting requests again straight away.
> We also have admin only mode, which is a kinda similar concept, so we
> need to make sure we document the differences.
>>> A timeout attribute will also be added to the shutdown operation. If
>>> this is present then the server will first be suspended, and the server
>>> will not shut down until either the suspend is successful or the timeout
>>> occurs. If no timeout parameter is passed to the operation then a normal
>>> non-graceful shutdown will take place.
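
The shutdown-with-timeout behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions: ShutdownSketch, the callback parameters and the use of a CountDownLatch to signal "suspend finished" are all hypothetical, and a null timeout stands in for "no timeout parameter passed".

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of shutdown with an optional suspend timeout. */
final class ShutdownSketch {
    /**
     * @param timeoutMillis null means "no timeout given": skip suspend entirely
     * @param beginSuspend  starts the suspend (assumed hook, not WildFly API)
     * @param suspended     counted down once the suspend has completed
     * @param stopServices  the existing, non-graceful shutdown path
     * @return true if the suspend completed before the timeout expired
     */
    static boolean shutdown(Long timeoutMillis, Runnable beginSuspend,
                            CountDownLatch suspended, Runnable stopServices) {
        boolean graceful = false;
        if (timeoutMillis != null) {
            beginSuspend.run();
            try {
                // Wait until the suspend succeeds or the timeout occurs.
                graceful = suspended.await(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // fall through to shutdown
            }
        }
        // Either way, the server then shuts down exactly as it does today.
        stopServices.run();
        return graceful;
    }

    public static void main(String[] args) {
        CountDownLatch suspended = new CountDownLatch(1);
        // Here the "suspend" finishes immediately by counting the latch down.
        boolean graceful = shutdown(50L, suspended::countDown, suspended,
                () -> System.out.println("stopping services"));
        System.out.println("graceful = " + graceful);
    }
}
```

Note that the timeout only bounds the wait for suspension; the services are stopped in both outcomes.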
>> Will non-graceful shutdown wait for non-daemon threads or terminate
>> immediately (call System.exit())?
> It will execute the same way it does today (all services will shut down
> and then the server will exit).
>>> In domain mode these operations will be added to both individual servers
>>> and complete server groups.
>>> Implementation Details
>>> Suspend/resume operates on entry points to the server. Any request that
>>> is currently running must not be affected by the suspend state, however
>>> any new request should be rejected. In general subsystems will track the
>>> number of outstanding requests, and when this hits zero they are
>>> considered suspended.
>>> We will introduce the notion of a global SuspendController, that manages
>>> the server's suspend state. All subsystems that wish to do a graceful
>>> shutdown register callback handlers with this controller.
>>> When the suspend() operation is invoked the controller will invoke all
>>> these callbacks, letting the subsystem know that the server is suspending,
>>> and providing the subsystem with a SuspendContext object that the
>>> subsystem can then use to notify the controller that the suspend is
>>> complete.
>>> What the subsystem does when it receives a suspend command, and when it
>>> considers itself suspended will vary, but in the common case it will
>>> immediately start rejecting external requests (e.g. Undertow will start
>>> responding with a 503 to all new requests). The subsystem will also
>>> track the number of outstanding requests, and when this hits zero then
>>> the subsystem will notify the controller that it has successfully
>>> suspended.
>>> Some subsystems will obviously want to do other actions on suspend, e.g.
>>> clustering will likely want to fail over, mod_cluster will notify the
>>> load balancer that the node is no longer available etc. In some cases we
>>> may want to make this configurable to an extent (e.g. Undertow could be
>>> configured to allow requests with an existing session, and not consider
>>> itself suspended until all sessions have either timed out or been
>>> invalidated, although this will obviously take a while).
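
To make the controller/callback arrangement above easier to discuss, here is a minimal sketch of how it could fit together. SuspendController and SuspendContext are the names from the proposal; everything else (SuspendListener, RequestEntryPoint, the method names, the locking) is an assumption for illustration and not the eventual WildFly API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of the proposed suspend mechanism; not WildFly API. */
final class SuspendSketch {
    enum State { RUNNING, SUSPENDING, SUSPENDED }

    /** Handed to each subsystem so it can report that it has drained. */
    interface SuspendContext { void complete(); }

    /** Callback handler a subsystem registers with the controller. */
    interface SuspendListener {
        void suspend(SuspendContext context);
        void resume();
    }

    static final class SuspendController {
        private final List<SuspendListener> listeners = new ArrayList<>();
        private volatile State state = State.RUNNING;
        private int pending;     // listeners that have not completed yet
        private int generation;  // lets us ignore completions from an old suspend

        synchronized void register(SuspendListener listener) { listeners.add(listener); }
        State state() { return state; }

        synchronized void suspend() {
            if (state != State.RUNNING) return;
            state = State.SUSPENDING;
            final int round = ++generation;
            pending = listeners.size();
            if (pending == 0) { state = State.SUSPENDED; return; }
            for (SuspendListener listener : listeners) {
                AtomicBoolean done = new AtomicBoolean(); // count each listener once
                listener.suspend(() -> {
                    if (done.compareAndSet(false, true)) completed(round);
                });
            }
        }

        /** Resume goes straight back to RUNNING, whatever the current state. */
        synchronized void resume() {
            if (state == State.RUNNING) return;
            state = State.RUNNING;
            generation++;
            for (SuspendListener listener : listeners) listener.resume();
        }

        private synchronized void completed(int round) {
            if (round != generation || state != State.SUSPENDING) return;
            if (--pending == 0) state = State.SUSPENDED;
        }
    }

    /** An entry point that rejects new requests and waits for running ones. */
    static final class RequestEntryPoint implements SuspendListener {
        private int active;
        private boolean accepting = true;
        private SuspendContext waiting;

        /** Returns false if rejected (an HTTP entry point would answer 503). */
        synchronized boolean beginRequest() {
            if (!accepting) return false;
            active++;
            return true;
        }

        void endRequest() {
            SuspendContext toNotify = null;
            synchronized (this) {
                if (--active == 0 && waiting != null) { toNotify = waiting; waiting = null; }
            }
            if (toNotify != null) toNotify.complete(); // notify outside our own lock
        }

        @Override public void suspend(SuspendContext context) {
            boolean drained;
            synchronized (this) {
                accepting = false;
                drained = (active == 0);
                if (!drained) waiting = context;
            }
            if (drained) context.complete();
        }

        @Override public synchronized void resume() { accepting = true; waiting = null; }
    }

    public static void main(String[] args) {
        SuspendController controller = new SuspendController();
        RequestEntryPoint http = new RequestEntryPoint();
        controller.register(http);
        http.beginRequest();                    // one request in flight
        controller.suspend();
        System.out.println(controller.state()); // still draining
        http.endRequest();                      // last request finishes
        System.out.println(controller.state());
    }
}
```

The entry point never calls back into the controller while holding its own lock, which keeps the lock ordering one-directional in this sketch. Subsystems with extra work to do on suspend (fail over, notifying a load balancer) would do it inside their suspend callback before calling complete().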
>>> If anyone has any feedback let me know. In terms of implementation my
>>> basic plan is to get the core functionality and the Undertow
>>> implementation into WildFly, and then work with subsystem authors to
>>> implement subsystem specific functionality once the core is in place.
>>> wildfly-dev mailing list