On 04/20/2016 09:05 PM, Stuart Douglas wrote:
Hi,
I have come across a few bug reports [1][2] (and a feature request from
the Swarm team) recently that are essentially caused by an application
being accessed before it is fully deployed. Even though we have service
dependencies that make sure an individual component's dependencies are
up, once a request has been accepted it can programmatically access
other parts of the deployment that are not up yet (the same problem we
have with graceful shutdown, but in reverse).
I propose we solve this using a 'graceful startup' mechanism, that holds
or rejects new requests until a server or deployment is fully started.
The specifics of how this would work are:
- If the server is booting, all external requests will be held or
rejected until the boot process is complete
- When deploying a new deployment, all requests for that deployment will
be held or rejected until MSC has attained stability
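To make the two rules above concrete, here is a rough sketch of the kind of gate an endpoint could consult before letting a request in. It is only an illustration: StartupGate, markStarted and awaitStarted are made-up names, not existing WildFly or MSC APIs, and the sketch assumes something else (e.g. a listener for boot completion or MSC stability) calls markStarted() at the right moment.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical one-shot gate; not an existing WildFly or MSC class. */
final class StartupGate {
    // Opened exactly once, when boot or deployment has finished.
    private final CountDownLatch started = new CountDownLatch(1);

    /** Assumed to be called by whatever detects "boot complete" or "MSC stable". */
    void markStarted() {
        started.countDown();
    }

    /** Non-blocking check, suitable for a "reject" style endpoint. */
    boolean isStarted() {
        return started.getCount() == 0;
    }

    /**
     * Blocking check, suitable for a "hold" style endpoint.
     * Returns true if startup finished within the timeout, false otherwise.
     */
    boolean awaitStarted(long timeoutMillis) {
        try {
            return started.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and treat it as "not started".
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A server-wide gate and a per-deployment gate could be two instances of the same class; a request for a deployment would have to pass both.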
This would be implemented for the following endpoints/subsystems:
- Undertow will hold requests until the deployment is done (so if you
try and load a page while deployment is happening it could be a bit of a
wait)
- Remote EJB will hold requests until deployment is done
- mod_cluster will not send availability messages until deployment is done
- JMS will delay message delivery until deployment is done
- EJB persistent timers will not fire until deployment is done
- Possibly some other cases I can't think of right now
One thing I am not really sure about is whether we need a configuration
switch for hold/reject behavior. For Undertow, for example, the request
holding behavior is very developer friendly: they can just hit refresh
in their browser, and as soon as the redeployment is done the page will
display. However, I am worried that it might not be ideal for load
balancers, which may prefer a quick error response that could then be
retried on another node (although if mod_cluster is not sending out
availability messages until the deployment is 100% complete, this may
not be a big deal).
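For illustration, a hold/reject switch might look roughly like the sketch below. Everything here is an assumption made for the example: the StartupAdmission class, the Mode enum, the maximum hold time, and the use of HTTP 503 as the "quick error response" are not part of any existing Undertow or WildFly configuration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical admission check with a configurable hold/reject mode. */
final class StartupAdmission {
    enum Mode { HOLD, REJECT }

    static final int PROCEED = 200;      // let the request through
    static final int UNAVAILABLE = 503;  // assumed status for a fast rejection

    private final CountDownLatch deployed = new CountDownLatch(1);
    private final Mode mode;
    private final long maxHoldMillis;

    StartupAdmission(Mode mode, long maxHoldMillis) {
        this.mode = mode;
        this.maxHoldMillis = maxHoldMillis;
    }

    /** Assumed to be called once the deployment is fully up. */
    void deploymentComplete() {
        deployed.countDown();
    }

    /** Decides what to do with an incoming request. */
    int admit() {
        if (deployed.getCount() == 0) {
            return PROCEED;
        }
        if (mode == Mode.REJECT) {
            // Fail fast so a load balancer can retry on another node.
            return UNAVAILABLE;
        }
        try {
            // HOLD: park the request until deployment finishes or the hold time runs out.
            return deployed.await(maxHoldMillis, TimeUnit.MILLISECONDS) ? PROCEED : UNAVAILABLE;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return UNAVAILABLE;
        }
    }
}
```

In this sketch HOLD still falls back to a rejection after maxHoldMillis, so a stuck deployment would not park request threads forever; whether such a cap is wanted is a separate question.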
I think load balancers should not have a server that is being started up
in the rotation anyway; this already probably doesn't work great today.
I like the idea overall.
--
- DML