Preserving for posterity a long discussion on this today on #jboss-dev on freenode:
| <bstansberry> got your note on graceful shutdown
| <bstansberry> i'm thinking a bit about what to work on to actually get something useful in M2
| <bstansberry> that's my priority -- get something actually useful, not just progress in the right direction
| <ALR> bstansberry: OK, let me look a bit at my notes from before to remember the issues involved.
| <bstansberry> k
| <ALR> bstansberry: So the requirements we discussed were that:
| <ALR> 1) Tied into the bootstrap
| <ALR> 2) 2-phase prepare/commit-like cycle.
| <ALR> Where in the first phase you continue to process requests but block incoming.
| <ALR> And then when done w/ processing signal that you're ready for the 2nd phase
| <bstansberry> yep
| <ALR> Also the mechanism impl has to be decoupled well so we don't leak anything.
| <bstansberry> yes
| <ALR> But I hadn't yet fleshed out how the impl of the shutdown registry works. I think it can be concurrent (ie. send requests to all subsystems to shut down, then get a Future back, and when all .isCompleted we can move to phase 2)
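A minimal sketch of the concurrent Future-based broadcast ALR describes: phase 1 asks every registered subsystem (in parallel) to stop accepting new work, waits for all of them to report ready, then phase 2 actually stops them. All names here (`ShutdownRegistry`, `GracefulSubsystem`) are illustrative assumptions, not real MC API.

```java
import java.util.*;
import java.util.concurrent.*;

// Hypothetical two-phase shutdown registry: subsystems opt in by registering.
interface GracefulSubsystem {
    void prepareShutdown(); // phase 1: block incoming, drain in-flight work
    void shutdown();        // phase 2: actually stop
}

class ShutdownRegistry {
    private final List<GracefulSubsystem> subsystems = new CopyOnWriteArrayList<>();

    void register(GracefulSubsystem s) {
        subsystems.add(s);
    }

    void shutdownAll() {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // Phase 1: broadcast concurrently, collect a Future per subsystem.
            List<Future<?>> phase1 = new ArrayList<>();
            for (GracefulSubsystem s : subsystems) {
                phase1.add(pool.submit(s::prepareShutdown));
            }
            // Move on only when every subsystem has completed phase 1.
            for (Future<?> f : phase1) {
                f.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException("phase 1 failed", e);
        } finally {
            pool.shutdown();
        }
        // Phase 2: the real stop.
        for (GracefulSubsystem s : subsystems) {
            s.shutdown();
        }
    }
}
```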
| <bstansberry> what i'm thinking about now is who needs to be involved in that vs clustered deployments to actually get things done
| <bstansberry> for sure it needs to be concurrent
| <ALR> bstansberry: I can assume everything to be part of the lifecycle is an MC bean?
| * balunasj has quit ()
| <ALR> bstansberry: Then they can autowire/inject a "ShutdownRegistry" type and use it. Then provide an event listener mechanism which each subsystem will implement
| <ALR> So on install, everyone gets a ShutdownRegistry if they want, and it's up to the component to handle the events.
| <bstansberry> being an MC bean is a fair requirement. might take some doing for all the services, but what you describe is what i was thinking as well
| <bstansberry> is AS trunk actually using Bootstrap now?
| <ALR> bstansberry: Not a recent enough version, no, I need to merge everything in.
| <bstansberry> is that part of your M2 plan?
| <ALR> bstansberry: Which, judging from Branch_5_x integration time, is a bit short of a day. (unless the IDE is playing nicer w/ AS modules and I can see compilation errors more readily)
| <ALR> bstansberry: It is now. Might as well start knocking off my action items.
| <bstansberry> ok
| <ALR> bstansberry: But unlikely I'll have it for you this week. You're chomping at the bit or just want to be sure it's in M2?
| <bstansberry> neither
| <bstansberry> i'm evaluating working on graceful vs working on clustered deployments
| <bstansberry> figuring out what needs to happen to have something useful for either, who needs to be involved
| <ALR> bstansberry: If you work on clustered deployments for now that'll give me some time to first clean up with EmbeddedAS release/6.0.0.M1, merge stuff to trunk, then incorporate a new API in bootstrap.
| <bstansberry> and will then pick one or the other to ensure something useful gets in M2
| <bstansberry> yeah. so for GS, the "who" is 1) you for bootstrap stuff 2) me for web containers 3) me or paul or someone for EJB3 containers 4) transactions ??? 5) JBM/HornetQ ???
| <bstansberry> JBM/HornetQ is IMO optional for "useful". actually just 1) and 2) is enough for "useful"
| <bstansberry> so now i'll think a bit about clustered deployments; same kind of analysis
| <ALR> Sounds about right, although I think for EJB3 there's more than just clustered...we need to also halt incoming regular requests.
| * maeste has quit (Remote closed the connection)
| <bstansberry> yes, all of 1-5 isn't just clustered
| * mazz (n=mazz@redhat/jboss/mazz) has left #jboss-dev
| <bstansberry> my earlier intent was to work on domain model stuff, but that's deferred, so i really want to get something actually done :-)
| <ALR> bstansberry: I also have a vested interest in domain model. :)
| <ALR> For now I'll draw up some notes on a design Wiki and we'll be better able to pick up after Thanksgiving.
| <bstansberry> ALR: sure, we all do. but we gotta keep doing stuff that actually gets released
| <bstansberry> RERO
| <bstansberry> sounds good
| * maeste (n=maeste4@194.185.94.10) has joined #jboss-dev
| <dmlloyd> don't forget that domain model is likely to be an AS7 thing
| <ALR> Yup.
| <bstansberry> yes, hence my shift to graceful shutdown or clustered deployments
| <ALR> Though I was hoping for some decoupled thing too. Imagine I want to use the same interface for starting any arbitrary service, but also configure it.
| <ALR> server.getConfiguration().addService(GrizzlyAdaptor.class);
| <ALR> server.getConfiguration().as(GrizzlyAdaptor.class).setBindPort(8081).setBindHost("localhost");
| <ALR> server.start() < Grizzly comes up.
| <dmlloyd> which still doesn't let you configure more than one GrizzlyAdaptor :)
| <ALR> Excuuuuuuuse my pseudocode
| <ALR> Though IMO more than one Grizzly gets dangerous; they're difficult to outrun.
| <dmlloyd> play dead
| <ALR> We already support that.
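A compilable rendering of ALR's pseudocode above; every type here is a hypothetical stand-in, not a real server API. To address dmlloyd's objection, services are keyed by name rather than by class, so more than one GrizzlyAdaptor can be configured.

```java
import java.util.*;

// Hypothetical fluent service config, sketched from the chat's pseudocode.
class GrizzlyAdaptor {
    int bindPort;
    String bindHost;
    GrizzlyAdaptor setBindPort(int p) { bindPort = p; return this; }
    GrizzlyAdaptor setBindHost(String h) { bindHost = h; return this; }
}

class ServerConfiguration {
    // Named registry, so two adaptors of the same type can coexist.
    private final Map<String, Object> services = new LinkedHashMap<>();

    ServerConfiguration addService(String name, Object service) {
        services.put(name, service);
        return this;
    }

    // Typed view of a named service, mirroring the .as(...) call in the chat.
    <T> T as(String name, Class<T> type) {
        return type.cast(services.get(name));
    }
}
```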
| <bstansberry> OT: why is Grizzly getting into the AS?
| <ALR> bstansberry: It's not necessarily into AS. I've been playing with the idea of just making some runtime into which you can install/start any service with some adaptor, and Grizzly is an easy impl target.
| <ALR> Because they were built for embeddability and have a unified config API, etc
| <ALR> May be a poor example. Pretend I said "JBossWeb". :)
| <bstansberry> ALR: LOL. ok; i'm just channeling remy a bit here
| <ALR> bstansberry: IMO JBossWeb should be as easy to setup.
| <ALR> Else we provide something that is. Or have any number of options for a servlet container.
| <ALR> And let the user choose.
| <ALR> Again, not necessarily for AS.
| <jpederse> ALR: well, I would go with Grizzly or Jetty as the PoC ;)
| <ALR> pgier: Ping.
| * maeste has quit (Remote closed the connection)
| <pgier> ALR: hi
| <ALR> pgier: Sorry, so how do I set the profile in a local build to build the dist?
| <pgier> you mean the zip?
| <ALR> (in tags/6.0.0.M1? )
| <ALR> Yeah.
| <pgier> -Pdist-zip
| <ALR> pgier: ./build.sh -Pdist-zip ?
| <ALR> Or the maven build from trunk works there now?
| <pgier> ah, right, I forgot that it's using ant
| <pgier> you have to do build.sh first
| <pgier> and then mvn -Pdist-zip package
| <ALR> pgier: Ah OK. Thanks.
| * rploski (n=rploski@redhat/jboss/rploski) has joined #jboss-dev
| * ChanServ gives voice to rploski
| <pgier> we could probably add a target to the ant build to include the zip
| <ALR> Looking forward to trunk and one build to rule them all.
| <ALR> pgier: It's tagged already; just so long as it gets released into the M2 repo eventually we'll be all good.
| <ALR> My EmbeddedAS testing for the past few weeks has been stale
| <pgier> ok, I'll check with Rajesh to make sure he uploads that
| <ALR> After renaming artifacts, I've been using the same ol' snapshots of the old locations
| * gerdogdu (n=gurkaner@85.104.134.232) has joined #jboss-dev
| * aslak (n=aslak@212-71-93-70.dsl.no.powertech.net) has joined #jboss-dev
| <ALR> pgier: Sorry.
| <pgier> ALR: ?
| <ALR> pgier: "mvn clean install -Pdist-zip" doesn't give me an org/jboss/jbossas/jboss-as-distribution in my local.
| <pgier> I think the problem is a hard coded path in the assembly descriptor
| <pgier> when the version changed, then it broke the assembly
| <pgier> take a look at src/assembly/jboss-dist.xml
| <ALR> pgier: Are we considering the tag immutable?
| <ALR> Ooh yeah.
| <pgier> probably up to Nihility whether it should be fixed
| <ALR> IMHO it should be this assembly which is also the AS distribution which goes to SF.net; so we don't have different things there and in M2 repo.
| * gerdogdu (n=gurkaner@85.104.134.232) has left #jboss-dev
| <pgier> ALR: yeah, I think that's a good idea
| <Nihility> i just patched the version
| <Nihility> do an update
| <ALR> pgier: :) Which opens a bad door. It'd have to be that artifact which also gets tested in QE.
| <ALR> Nihility: Thanks.
| * cbrock (n=cbrock@redhat/jboss/cbrock) has joined #jboss-dev
| * ChanServ gives voice to cbrock
| <ALR> Or we one-off this M1, and then in trunk it's all a Maven build anyway, and this problem goes away.
| * asoldano has quit ("I'm leaving")
| <ALR> But today I'll test EmbeddedAS with it, which should give some decent coverage of major moving parts.
| <ALR> bstansberry: If you feel voyeuristic: https://jira.jboss.org/jira/browse/JBBOOT-116
| <bstansberry> ALR: thanks
| <ALR> I suppose a design Wiki is in order, but I think the description in the JIRA should suffice for now.
| * rploski has quit ()
| <Jaikiran> Nihility: btw, would there be a merge of the tagged M1 with trunk or is it up to individuals to port their fixes from M1 to trunk?
| <Jaikiran> fixes within the AS code base
| <ALR> Jaikiran: A mass merge wouldn't work.
| <dmlloyd> yeah, it's too divergent
| <Nihility> it has to be individual updates
| <ALR> 1) Too much has changed 2) The source locations have moved
| <ALR> src/main/java in turnk
| <dmlloyd> I did a diff of just component matrix and it made me sad :)
| <ALR> *trunk
| <Jaikiran> hmm, yeah
| <Jaikiran> i'll scan through some of the jiras i fixed for M1 and see if i have some pending ports
| <ALR> I have enough ports to be a harbor.
| <Jaikiran> :)
| <jpederse> ALR: regarding JBBOOT-116, there has to be a layer in MC also, as services should be stopped in the reverse order they started
| <jpederse> ALR, bstansberry: so maybe talk with Ales about that
| <ALR> jpederse: They already are...(well the bootstraps are shut down in reverse order anyway)
| <ALR> jpederse: But bootstrap sits above MC. So they really shouldn't depend upon this at all.
| <jpederse> ALR: I'm more thinking along the lines of parallel deployments once MC trunk hits AS trunk
| <jpederse> ALR: but yeah, ideally you shouldn't worry
| <ALR> jpederse: Yeah, this is an opt-in mechanism for subsystems.
| <dmlloyd> if the thing which accepts requests depends on the thing that processes requests, then the request acceptor should naturally stop before the request executor
| <jpederse> ALR: yup, the real issue is really proper dependency definitions
| <ALR> Hmm, I wonder if I should use jboss-threads as the concurrent broadcast phase 1 executor. :) Naaaah. :D
| <bstansberry> jpederse, ALR: still, it's a good point. for a full undeploy, the MC handles the dependencies, but for the "phase 1" part there needs to be similar behavior
| <jpederse> ALR: no dependencies, thank you
| <ALR> jpederse: No API deps.
| <ALR> bstansberry: Then I'm not understanding what we'd ask of MC to do here.
| <bstansberry> ALR: well just imagine a weld-type app with an SFSB container fronted by a web container
| <ALR> k.
| <bstansberry> you don't want the SFSB container to start rejecting new sessions before
| <bstansberry> hmm, never mind :)
| <jpederse> ALR: you are f.ex. asking JCA to shutdown before the SLSB container
| <bstansberry> no, don't never mind ;-)
| <ALR> Not a bad point.
| <bstansberry> yeah, exactly
| <ALR> Right, they need to stop accepting new session in order.
| <ALR> *sessions
| <jpederse> ALR: but the SLSB already have the request
| <jpederse> ALR: but the request hasn't reached JCA yet
| <ALR> bstansberry: So for stuff with explicit deps, those can't be concurrent
| <jpederse> ALR: kaboom
| <ALR> Web > EJB3 > JCA, in that order.
| <bstansberry> if they are smart, the SFSB container knows it's only fronted by the web container, so it doesn't register at all
| <ALR> Hmm, I don't want to work on this anymore. :)
| <jpederse> ALR: so in a sense it is up to the kernel to notify in the correct order
| <bstansberry> ALR: they can be concurrent, but they need to be mediated by the MC dependency mechanism. analogous to parallel deployment
| <ALR> bstansberry: The thing is, anything that opts in needs to know its dependencies
| <jpederse> bstansberry: kernel level service IMHO
| <ALR> bstansberry, jpederse: Right, so now this becomes a feature of MC. Not an add-on.
| <jpederse> bstansberry: it's the kernel that's the single entity that knows the dependency chain
| <bstansberry> jpederse: yep. having the SFSB container try to understand that is a hack
| <jpederse> ALR, bstansberry: and then we have all the OSGi stuff to worry about too
| <dmlloyd> OSGi is just another name for "don't worry about deps, that's someone else's problem"
| <jpederse> ALR, bstansberry: so def. something for the kernel ;)
| <jpederse> ALR, bstansberry: well, at least some of it
| <jpederse> dmlloyd: until you start deploying multiple containers
| <ALR> Ah, I remember discussing this w/ Carlo
| <ALR> We'd also looked at a ThreadLocal mechanism.
| <jpederse> dmlloyd: f.ex. two different JCA containers - 1.5 and 1.6
| <ALR> Where you must know the entry/exit point of each request.
| <ALR> So in other words, EJB3 can gracefully shut down.
| <ALR> And block all incoming requests.
| <ALR> UNLESS there's a ThreadLocal saying "hey I came in from JBossWeb, serve me".
| <jpederse> yuck
| <ALR> Ugly with its advantages.
| <jpederse> ThreadLocal is not a good contract for inter container communication
| <jpederse> ALR: well, the whole thing needs a PoC, so it 'could' be a first implementation - just to expose all the problems
| <bstansberry> jpederse: yes. we need to think of ways to add useful functionality
| <ALR> Personally I dislike things that fly under the API like sysprops and ThreadLocal. But here we can say: "Whoever sets it is responsible for unsetting it".
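A hedged sketch of the ThreadLocal marker ALR and Carlo discussed: during graceful shutdown the EJB3 container rejects new external requests unless the thread carries a marker set by an upstream container (e.g. JBossWeb). All class and method names here are invented for illustration, and per ALR's rule, whoever sets the marker must unset it.

```java
// Hypothetical in-VM request marker; not actual JBoss code.
class InVmRequestMarker {
    private static final ThreadLocal<Boolean> IN_VM =
            ThreadLocal.withInitial(() -> false);

    static void enter() { IN_VM.set(true); }
    static void exit()  { IN_VM.remove(); } // setter is responsible for unsetting
    static boolean isInVm() { return IN_VM.get(); }
}

class Ejb3Container {
    private volatile boolean shuttingDown;

    void beginGracefulShutdown() { shuttingDown = true; }

    void invoke(Runnable request) {
        // During graceful shutdown, block incoming requests -- UNLESS the
        // ThreadLocal says the call entered through an upstream container.
        if (shuttingDown && !InVmRequestMarker.isInVm()) {
            throw new IllegalStateException("container is shutting down");
        }
        request.run();
    }
}
```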
| <ALR> Else what's the difference between this 2-phase graceful shutdown and a regular MC lifecycle phase?
| <ALR> stop, destroy
| <jpederse> ALR: yeah, but f.ex. all work done in JCA is done in its own thread - and they can be long running processes - so there has to be a well-defined contract
| <ALR> Or adding a new lifecycle phase? (pre-stop) to halt processing of new requests
| <ALR> Actually that seems more likely. New lifecycle phase.
| <ALR> Then it's all built into MC from the get-go.
| <jpederse> ALR: yeah, that could be a possible solution
| <ALR> Looks easiest too.
| <jpederse> ALR: but you still have the problem with my use-case
| * kconner (n=kevin@redhat/jboss/kconner) has joined #jboss-dev
| * ChanServ gives voice to kconner
| <bstansberry> jpederse: what problem is that again?
| <ALR> jpederse: I think when getting the lifecycle callback to @PreStop your subsystem would have to determine if it wanted to halt/interrupt any long-running Tasks.
| <jpederse> bstansberry: incoming request to SLSB container that hasn't reached f.ex. JCA yet
| <jpederse> bstansberry: and JCA is notified of the shutdown before the request hits
| <jpederse> bstansberry: so it'll block since there is currently no active work
| <ALR> bstansberry: That's handled.
| <ALR> SLSB depends on JCA. So JCA won't get the @PreStop event until EJB3 @PreStop is done.
| <ALR> I mean jpederse^
| <jpederse> ALR: ok, that would solve it
| <jpederse> ALR: now you will have to determine which beans to send the notification to first...
| <ALR> Which again means that all subsystems must have explicitly set their deps correctly. :)
| <bstansberry> ALR: are there a lot of unexpressed dependencies, e.g. within an EJB3 app
| <ALR> I'm SURE there are.
| <bstansberry> e.g. SFSB calls into SLSB
| <ALR> bstansberry: That should tie into EJB3 Containers becoming first-class MC beans.
| * bstansberry starts thinking in terms of a web-tier only initial version
| <jpederse> ALR, bstansberry: I think that the first use-case to solve is to look at the entire AS as one component
| <jpederse> ALR, bstansberry: then solve the problem with requests coming from the "outside" -- web, ...
| <ALR> Why draw any distinction?
| <jpederse> bstansberry: brain clustering again :)
| <dmlloyd> the problem with making random things be MC beans is that the MC lifecycle states don't make sense
| <ALR> A deployment may create any number of components.
| <ALR> The components should have well-defined dependencies anyway, otherwise they boot by luck alone.
| <dmlloyd> there should only be two states: "up" and "not up". If you depend on e.g. classloading "phase", then that should be a separate target for that "phase"
| <ALR> So when we bring them down, do so in reverse order. The only thing we're adding now is a new phase to keep servicing but stop listening on new requests.
| <dmlloyd> but that's just a pet peeve of mine I guess
| * whitingjr (n=whitingj(a)gondolin.ncl.ac.uk) has left #jboss-dev
| <jpederse> dmlloyd: I agree, the rest are internal kernel states only
| <ALR> "Phase" is a concern of the environment, not an individual bean.
| <dmlloyd> how much time do they spend struggling over the fact that "states" need to be extensible, or that the default ones do not suffice
| <ALR> Beans opt-in to take action on phase events triggered by the environment though.
| <jpederse> ALR: yeah, so start simple with the web container as a PoC
| <jpederse> ALR: or JNDI
| <ALR> Everything depends on JNDI.
| <ALR> :)
| <jpederse> ALR: yeah, I mean not in-vm requests ;)
| * cbrock_ (n=cbrock__@redhat/jboss/cbrock) has joined #jboss-dev
| * ChanServ gives voice to cbrock_
| <jpederse> ALR: I think it would be a lot simpler to alter the naming server to allow graceful shutdown
| <jpederse> bstansberry: another idea ^
| <bstansberry> jpederse: JNDI would help, but i don't think it would be reliable enough though
| <ALR> jpederse: And why wouldn't that just be another addition of @PreStop lifecycle event handler in the naming server?
| <bstansberry> who knows when the client is going to invoke on whatever they looked up?
| <jpederse> ALR: I said, first use-case
| <ALR> jpederse: MC dep mechanism would fire stuff off in order, JNDI would come near last as everything depends upon it...
| <ALR> jpederse: Right, I guess I'm still working the angle where we decompose into parts:
| <ALR> 1) A mechanism (probably new lifecycle phase)
| <ALR> 2) Every subsystem takes advantage of it
| <ALR> 3) Analyze the MC dep graphs to ensure that each component is declaring its deps fully
| <jpederse> ALR: correct, but what happens if a container returns false for the gracefulShutdown() method -- e.g. I can't stop now, ask later -- or do you want it blocking in the entire chain ?
| <ALR> jpederse: There's no return value on lifecycle callbacks; it's all blocking.
| <jpederse> ALR: ok, in that case - throws an exception
| * ggear (n=ggear@host-83-146-13-110.dslgb.com) has joined #jboss-dev
| <jpederse> ALR: but I'm getting too far into the use-cases currently....
| <ALR> jpederse: Same as if any lifecycle method throws an exception. MC is handling it.
| <ALR> jpederse: Probably pages of stack.
| <jpederse> ALR: yeah, but the feature is called "Graceful shutdown"...
| <bstansberry> jpederse: a container doesn't get to veto a shutdown. whoever requests it provides a max wait time and if that is exceeded, the shutdown occurs
| <jpederse> ALR: not shutdown-kaboom
| <jpederse> bstansberry: that would be a way
| <jpederse> bstansberry: but its blocking all the way down the chain
| <jpederse> bstansberry: f.ex. stopping long running JCA threads can take some time
| <jpederse> bstansberry: just something to think about...
| <bstansberry> jpederse: sessions as well. could easily be 30 minutes
| <jpederse> bstansberry: yup
| <bstansberry> jpederse: but the max timeout would need to be an overall time, i.e. not applied independently for each item in the chain
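A sketch of bstansberry's point: the requester supplies one overall max wait time, and every subsystem in the chain draws from the same shared deadline rather than each getting its own full budget. If the deadline is exhausted, the shutdown proceeds anyway. Names are illustrative assumptions.

```java
import java.util.List;

// Hypothetical deadline-bounded graceful shutdown driver.
class DeadlineShutdown {
    interface Stoppable {
        // Returns true if the subsystem quiesced within the given time.
        boolean awaitQuiesce(long millis) throws InterruptedException;
    }

    // Returns true if every subsystem quiesced before the shared deadline;
    // false means the shutdown still happens, just not gracefully.
    static boolean shutdown(List<Stoppable> chain, long maxWaitMillis) {
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        boolean graceful = true;
        for (Stoppable s : chain) {
            // Each subsystem only gets whatever budget is left.
            long remaining = deadline - System.currentTimeMillis();
            try {
                if (remaining <= 0 || !s.awaitQuiesce(remaining)) {
                    graceful = false;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                graceful = false;
            }
        }
        return graceful;
    }
}
```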
| <ALR> jpederse: So now we've got a discussion about the shutdown impl vs. clients.
| <ALR> Lifecycle states are an open-ended contract, anyone can abuse it.
| <ALR> Same thing with startup.
| <ALR> I can make AS start in 30 minutes by installing a bean w/ Thread.sleep in @Start
| <bstansberry> yeah, but a 30 minute session timeout is the default for most webapps
| <dmlloyd> I could make the container die by using System.exit()
| <jpederse> the KaboomMCService :)
| <ALR> So far as the impl mechanism as a lifecycle state, I don't see the problem. We notify the subsystem, and now it's their responsibility to cleanly shut down as quickly as possible.
| <dmlloyd> let's assume that start() will generally execute as quickly as possible (non-blocking) and stop() will block until everything is down.
| <ALR> bstansberry: We have to consider sessions, or just requests?
| <dmlloyd> nothing else really makes sense
| <ALR> I think graceful shutdown just means that requests in process come out OK.
| <ALR> We don't guarantee that new requests for a current session are gonna be processed...do we?
| <bstansberry> graceful shutdown == i don't lose my session.
| <jpederse> ALR: well, the request has to be migrated to another server in the cluster
| <bstansberry> the whole point is to not lose sessions
| <ALR> bstansberry: On reboot your session can still be there.
| <ALR> bstansberry: Or maybe that's a config option for JBossWeb. "Wait on all sessions". During @PreStop it can handle how it likes.
| <jpederse> bstansberry: yeah, I guess we also have the case where there is only one machine - good point
| <bstansberry> ALR: yes. a clustered web app can be a lot quicker since it can understand that it has replicated the session
| <bstansberry> ALR: problem with counting on session persistence is it can take a long time to restart the server
| <ALR> bstansberry, jpederse: I'll start a Thread on AS Development forum
| <bstansberry> cool. we haven't even gotten into the transaction manager issues :)
| <jpederse> haha
| <ALR> All ears.
| <bstansberry> client has no open SFSB sessions, no ongoing SLSB requests.
| <bstansberry> so, EJB can stop, right?
| <bstansberry> wrong
| <bstansberry> client has an ongoing transaction
| <bstansberry> and the TM is later in the chain vs the EJB, so counting on @PreStop in reverse order doesn't handle it either
| <dmlloyd> well the EJB shutdown process just takes open transactions into account, that's all
| <bstansberry> dmlloyd: yeah, i guess the container will have to track all open transactions
| <bstansberry> dmlloyd: whether or not it actually has to do anything when they commit/rollback
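A minimal sketch of dmlloyd's suggestion that the EJB container itself track open transactions, so its shutdown can wait for them to complete instead of relying on @PreStop ordering against the transaction manager. The class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-container transaction tracker: the container counts
// transactions as they begin and complete (commit or rollback), and
// shutdown waits until the count drains to zero.
class TxTracker {
    private final AtomicInteger open = new AtomicInteger();

    void txBegun()     { open.incrementAndGet(); }
    void txCompleted() { open.decrementAndGet(); } // commit or rollback alike

    boolean quiesced() { return open.get() == 0; }
}
```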
| <ALR> bstansberry, dmlloyd, jpederse: http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4267115
| <ALR> bstansberry: In that case an open Tx is the same as a session
| <dmlloyd> I still don't see any case where a separate, graceful stage is needed
| <ALR> bstansberry: So we have to ask the TxManager for all Txs for that component...I'm not familiar enough with those APIs to know how that works in practice though.
| <dmlloyd> just gonna make things more complex...
| <ALR> dmlloyd: What would you do? Put it all in @Stop?
| <dmlloyd> <dmlloyd> if the thing which accepts requests depends on the thing that processes requests, then the request acceptor should naturally stop before the request executor
| <ALR> Right.
| <dmlloyd> then stop just has to wait until everything's done, no matter how long it takes
| <ALR> This puts the request acceptor stuff stopping in @PreStop, Executors stop in @Stop
| <dmlloyd> ungraceful shutdown can be implemented via interrupt. If stop is interrupted with the "kill" flag set to true, burn down the house and get out of there as quick as possible.
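A sketch of dmlloyd's acceptor/doer split: the acceptor depends on the executor, so ordinary reverse-order stop already halts new requests before the executor drains; interrupting the blocked stop() is the "kill" path. Both classes are illustrative, not real components.

```java
// Hypothetical request executor: stop() blocks until in-flight work drains,
// unless interrupted (the ungraceful "kill" path dmlloyd describes).
class RequestExecutor {
    private int inFlight;

    synchronized void begin() { inFlight++; }
    synchronized void end()   { inFlight--; notifyAll(); }

    synchronized void stop() {
        try {
            while (inFlight > 0) {
                wait(); // graceful: wait however long it takes
            }
        } catch (InterruptedException kill) {
            Thread.currentThread().interrupt(); // burn down the house, get out
        }
    }
}

// Hypothetical acceptor; its dependency on the executor means a
// dependency-ordered shutdown stops it first, so no extra phase is needed.
class RequestAcceptor {
    private final RequestExecutor executor;
    private volatile boolean accepting = true;

    RequestAcceptor(RequestExecutor executor) { this.executor = executor; }

    void stop() { accepting = false; }
    boolean isAccepting() { return accepting; }
}
```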
| <ALR> But it's not like things are now separated out enough that "acceptors" are different components.
| <ALR> For example the entry point to EJB3 stuff can be remoting.
| <ALR> And we don't have one remoting acceptor component per Container/Deployment
| <dmlloyd> that seems like an easier solution than adding another phase.
| <dmlloyd> if you have one MC component per container, why not two?
| <jpederse> reminds me of http://www.youtube.com/watch?v=NXbhfwHorlI
| <dmlloyd> "acceptor" and "doer"
| <ALR> Because of unified entry points.
| <ALR> dmlloyd: Each webapp gets its own frontend acceptor?
| <ALR> dmlloyd: Or we break up JBossWeb to have a separate acceptor?
| <dmlloyd> nah, each container gets its own acceptor
| <dmlloyd> jbossweb (for example) talks to that, not the actual container
| <ALR> EJB3 has container per bean.
| <ALR> But one remoting connector
| <dmlloyd> then when the EJB is undeployed, the acceptor is stopped