[wildfly-dev] Graceful startup

Thu Apr 21 16:39:56 EDT 2016

On 4/21/16 2:07 PM, John Doyle wrote:
> On Thu, Apr 21, 2016 at 2:44 PM, Brian Stansberry
> <brian.stansberry at redhat.com> wrote:
>> On 4/20/16 9:05 PM, Stuart Douglas wrote:
>>> Hi,
>>>
>>> I have come across a few bug reports [1][2] (and a feature request from
>>> the Swarm team) recently that are essentially caused by an application
>>> being accessed before it is fully deployed. Basically even though we
>>> have service dependencies that make sure individual components
>>> dependencies are up, once a request has been accepted it can potentially
>>> programmatically access other parts of the deployment that are not up
>>> yet (basically the same problem we have with graceful shutdown, but in
>>> reverse).
>>>
>>> I propose we solve this using a 'graceful startup' mechanism, that holds
>>> or rejects new requests until a server or deployment is fully started.
>>> The specifics of how this would work are:
>>>
>>> - If the server is booting all external requests will be held or
>>> rejected until the the boot process is complete
>>> - When deploying a new deployment all requests for that deployment will
>>> be held or rejected until MSC has attained stability
>>>
>>> This would be implemented for the following endpoints/subsystems:
>>>
>>> - Undertow will hold requests until the deployment is done (so if you
>>> try and load a page while deployment is happening it could be a bit of a
>>> wait)
>>> - Remote EJB will hold requests until deployment is done
>>> - mod_cluster will not send availability messages until deployment is done
>>
>> Isn't this what mod_cluster does on a per-deployment basis anyway? The
>> basic idea of mod_cluster is the LB isn't notified the context is
>> available on a backend server until it's....available.
>
> True, but not everyone is using mod_cluster.
>

Sure, but the statement I was commenting on was about proposed new 
behavior of our mod_cluster subsystem. But the proposed new behavior 
sounds like what the existing behavior should already be. So I wonder if 
I'm missing something.

>>
>>> - JMS will delay message delivery until deployment is done
>>> - EJB persistent timers will not fire until deployment is done
>>> - Possibly some other cases I can't think of right now
>>>
>>> One thing I am not really sure about is if we need a configuration
>>> switch for hold/reject behavior. e.g. for Undertow the request holding
>>> behavior is very developer friendly, as it means they can just hit
>>> refresh in their browser and as soon as the redeployment is done the
>>> page will display, however I am worried that it might not be ideal for
>>> load balancers that may prefer a quick error response that could then be
>>> attempted on another node (although if mod_cluster is not sending out
>>> availability till the deployment is 100% complete this may not be a big
>>> deal).
>>>
>>> If you want to see this in action I have a very simple PR at [3] that
>>> enables this for Undertow at server boot.
>>>
>>> Thoughts?
>>>
>>> Stuart
>>>
>>> [1] https://issues.jboss.org/browse/WFLY-6402
>>> [2] https://issues.jboss.org/browse/JBEAP-867
>>> [3] https://github.com/wildfly/wildfly/pull/8858
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> wildfly-dev mailing list
>>> wildfly-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/wildfly-dev
>>>
>>
>>
>> --
>> Brian Stansberry
>> Senior Principal Software Engineer
>> JBoss by Red Hat
>> _______________________________________________
>> wildfly-dev mailing list
>> wildfly-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/wildfly-dev

-- 
Brian Stansberry
Senior Principal Software Engineer
JBoss by Red Hat