Hi,

I have recently come across a few bug reports [1][2] (and a feature request from the Swarm team) that are essentially caused by an application being accessed before it is fully deployed. Even though we have service dependencies that make sure an individual component's dependencies are up, once a request has been accepted it can programmatically access other parts of the deployment that are not up yet (the same problem we have with graceful shutdown, but in reverse).

I propose we solve this using a 'graceful startup' mechanism that holds or rejects new requests until a server or deployment is fully started. The specifics of how this would work are:

- If the server is booting, all external requests will be held or rejected until the boot process is complete
- When deploying a new deployment, all requests for that deployment will be held or rejected until MSC has attained stability
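
To make the hold semantics concrete, here is a minimal sketch in plain Java. It is not the code in the PR and not an existing WildFly or Undertow API: StartupGate, markStarted and awaitStarted are hypothetical names, and I am assuming some startup code calls markStarted once boot completes or MSC reports stability.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical gate for illustration only; not a WildFly or Undertow class. */
final class StartupGate {
    // Opened exactly once, when boot or deployment has finished.
    private final CountDownLatch started = new CountDownLatch(1);

    /** Assumed to be called by startup code once the container is stable. */
    void markStarted() {
        started.countDown();
    }

    boolean isStarted() {
        return started.getCount() == 0;
    }

    /**
     * Holds the calling request thread until startup completes or the
     * timeout elapses. Returns true if the request may proceed, false if
     * the caller should reject it instead.
     */
    boolean awaitStarted(long timeoutMillis) {
        try {
            return started.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt flag and treat the request as not admitted.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

An endpoint would call awaitStarted before dispatching a request and only continue if it returns true. The timeout is my own addition so that a held request cannot wait forever if startup never completes.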

This would be implemented for the following endpoints/subsystems:

- Undertow will hold requests until the deployment is done (so if you try and load a page while deployment is happening it could be a bit of a wait)
- Remote EJB will hold requests until deployment is done
- mod_cluster will not send availability messages until deployment is done
- JMS will delay message delivery until deployment is done
- EJB persistent timers will not fire until deployment is done
- Possibly some other cases I can't think of right now

One thing I am not sure about is whether we need a configuration switch for the hold/reject behavior. For Undertow, holding requests is very developer friendly: you can just hit refresh in your browser, and as soon as the redeployment is done the page will display. However, I am worried that it might not be ideal for load balancers, which may prefer a quick error response so the request can be retried on another node (although if mod_cluster does not send availability messages until the deployment is 100% complete, this may not be a big deal).
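
If we did add such a switch, it could look roughly like the sketch below. This is only an illustration under my own assumptions: StartupPolicyGate, Mode, Decision and the maximum hold time are hypothetical and do not correspond to any existing configuration attribute. An HTTP endpoint would presumably map UNAVAILABLE to a 503 response so a load balancer can retry on another node.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical hold/reject switch; names and defaults are assumptions. */
final class StartupPolicyGate {
    enum Mode { HOLD, REJECT }
    enum Decision { PROCEED, UNAVAILABLE }

    private final CountDownLatch started = new CountDownLatch(1);
    private final Mode mode;
    private final long maxHoldMillis;

    StartupPolicyGate(Mode mode, long maxHoldMillis) {
        this.mode = mode;
        this.maxHoldMillis = maxHoldMillis;
    }

    /** Assumed to be called once the server or deployment is fully started. */
    void markStarted() {
        started.countDown();
    }

    /** Decides what to do with a request that arrives right now. */
    Decision admit() {
        if (started.getCount() == 0) {
            return Decision.PROCEED;          // already started: fast path
        }
        if (mode == Mode.REJECT) {
            return Decision.UNAVAILABLE;      // fail fast so the caller can retry elsewhere
        }
        try {
            // HOLD: block the request thread, but never longer than maxHoldMillis.
            return started.await(maxHoldMillis, TimeUnit.MILLISECONDS)
                    ? Decision.PROCEED
                    : Decision.UNAVAILABLE;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Decision.UNAVAILABLE;
        }
    }
}
```

With HOLD a developer hitting refresh simply waits for the redeploy, while REJECT returns immediately, which is the behavior a load balancer would likely prefer.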

If you want to see this in action, I have a very simple PR at [3] that enables this for Undertow at server boot.

Thoughts?

Stuart