[jboss-jira] [JBoss JIRA] (AS7-5946) jboss-ejb3.xml faults fail silently and result in a hanging thread
jaikiran pai (JIRA)
jira-events at lists.jboss.org
Thu Nov 29 08:56:21 EST 2012
[ https://issues.jboss.org/browse/AS7-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738082#comment-12738082 ]
jaikiran pai commented on AS7-5946:
-----------------------------------
Adding the discussion David and I had about this over IRC on the #jboss-as7 channel (complete log available here http://echelog.com/logs/browse/jboss-as7/1353020400 November 16th 2012)
{quote}
[15:41:05] <dmlloyd> what's up Jaikiran?
[15:41:12] <dmlloyd> (didn't see your ping there...)
[15:41:59] <Jaikiran> dmlloyd: if you have some time, i would like to understand a bit about why servicelisteners are considered evil, and whether they are evil in all cases or only in specific cases
[15:42:18] <dmlloyd> sure, it's really simple
[15:42:40] <dmlloyd> basically a service listener means that the state machine for every service it's attached to has to pause in between each state to call the listener
[15:42:52] <dmlloyd> without service listeners, in many cases states can be instantly bypassed
[15:42:57] <dmlloyd> that's the perf issue
[15:43:17] <dmlloyd> the conceptual issue is that performing actions - especially service-impacting actions - in a listener can result in unpredictable effects
[15:43:45] <dmlloyd> for example, imagine a service listener which toggles the service mode between ACTIVE and NEVER when the service becomes down and up
[15:43:59] <dmlloyd> it would cause an indefinite rippling oscillation in the service graph
[15:44:11] <dmlloyd> consume lots of CPU and never terminate
[15:44:34] <dmlloyd> that's a boiled-down case; of course nobody would ever do that, but they might do something similar by accident
[15:45:01] <Jaikiran> i see. so from what i understand the real concern is that badly written servicelisteners can impact MSC
[15:45:08] <Jaikiran> and MSC has no real control over those
[15:46:06] <Jaikiran> now that i understand that concern, i would like to explain the issue that i was thinking of solving via a servicelistener
[15:46:32] <Jaikiran> so i'm looking into this JIRA AS7-5946
[15:46:33] <jbossbot> jira [AS7-5946] jboss-ejb3.xml faults fail silently and result in a hanging thread [Open (Unresolved) Bug, Major, David Lloyd] https://issues.jboss.org/browse/AS7-5946
[15:46:39] <Jaikiran> there are 2 problems in that JIRA
[15:46:45] <Jaikiran> the first one is simple to fix
[15:47:04] <Jaikiran> the WebDeploymentService does some blocking tasks in start/stop from a MSC thread
[15:47:15] <Jaikiran> and that needs to be fixed to use a server executor
[15:47:19] <Jaikiran> that part is easy to fix
[15:47:28] <jfclere> about https://github.com/jbossas/jboss-as/pull/3326 I think it makes sense to remove the jar file logic, comments?
[15:47:45] <bstansberry> jfclere: +1
[15:48:02] <Jaikiran> now the real problem is that that service triggers web app context initialization which leads to a startup servlet to be invoked in the user app
[15:48:28] <jfclere> bstansberry: ok I will clean the code then ;-)
[15:48:30] <Jaikiran> that servlet invokes an EJB whose backing component has a dependency problem and is never expected to start
[15:49:01] <Jaikiran> the invocation on the view of that component leads to a call to component instance creation which literally waits for the component to start
[15:49:21] <Jaikiran> and since the component start service associated with that bean is DOWN, it never starts
[15:49:25] <Jaikiran> and leads to a deadlock
[15:50:07] <Jaikiran> IMO, what that part of code in BasicComponent#waitForComponent is missing is a way to convey the service startup failure
[15:50:36] <Jaikiran> so i was thinking of using a service listener on the component start service to listen to transition of the service going DOWN
[15:50:50] <Jaikiran> so that we can break that indefinite "wait"
[15:51:17] <ousmaneo> i'm having "org.datanucleus.api.jpa" is already registered error on jbossas-
[15:51:34] <Jaikiran> does that sound like a good use of the service listener? this is more EE so i'll be running it past stuart too, but i want to understand your concerns (if any) with this proposal
[15:53:45] <Jaikiran> btw here's the thread dump of the deadlock if it helps understand the issue http://pastebin.com/4QT5EVf1
[15:54:08] <dmlloyd> the difficulty is that a service can transition to DOWN for a lot of reasons
[15:54:46] <dmlloyd> I agree it's a problem though
[15:55:02] <dmlloyd> I think we have (had?) a solution for a similar issue in the naming subsystem
[15:55:22] <dmlloyd> it used the service substate to determine whether a lookup can ever succeed
[15:55:45] <dmlloyd> I would think that this should be used in this case as well, but I may be oversimplifying
[15:56:17] <Jaikiran> i'll take a look at the naming subsystem and see what was done there
[15:56:23] <dmlloyd> the problem is that even known dependency problems *might* be transient
[15:57:38] <jmesnil> dmlloyd, I was thinking about using service listeners to fix https://issues.jboss.org/browse/AS7-5929 and I see now why that'd be a *bad* idea :)
[15:57:39] <jbossbot> jira [AS7-5929] :reload operation clears up the restart-required process-state [Open (Unresolved) Bug, Major, Unassigned] https://issues.jboss.org/browse/AS7-5929
[15:58:27] <Jaikiran> yeah, that's another question i had - a state of DOWN probably doesn't really represent that the dep problem can't be fixed "later", isn't it
[15:58:30] <Jaikiran> ?
[15:58:58] <dmlloyd> no. In fact no single service's state can tell you definitively whether that service ever might successfully start in the future
[16:00:22] <Jaikiran> yeah, it's the "future" which i think is important. i mean a deployment completion (either failure/success) probably should be an indication that the component instance creation should no longer "wait"
[16:00:41] <Jaikiran> *should no longer "wait" for the future/later transitions
[16:00:52] <dmlloyd> yes, but if there are still services starting then the deployment cannot be "complete"
[16:00:57] <Jaikiran> and instead just report a failure to create the component instance
[16:01:00] <dmlloyd> in this case the startup servlet
[16:01:36] <Jaikiran> yeah, true.
[16:01:48] <dmlloyd> is this a hard problem? yes :)
[16:02:00] <Jaikiran> this one's going to be interesting and tricky :)
[16:02:03] <Jaikiran> yeah!
[16:05:06] <dmlloyd> there are a couple vectors from which we can approach the problem
[16:05:17] <dmlloyd> 1) approach the equality between service start and component start
[16:06:00] <dmlloyd> 2) approach the two-phase JNDI binding methodology that results in no direct dependencies between components which programmatically look up names and those that bind the names
[16:07:09] <dmlloyd> instinctively I think #2 is the more solid approach, but I have no ideas how to achieve any useful alternative
[16:07:57] <dmlloyd> we could simply cause the lookup to fail if the target service is in any kind of problem state (regardless of transitivity) and require the user to use @Resource (or similar) to establish the dependency for safety
[16:08:19] <dmlloyd> requiring code changes to user code is non-ideal, but maybe in this case it's OK
[16:08:39] <dmlloyd> (especially given that the changes are portable)
[16:08:50] <dmlloyd> not sure what the backwards compat impact would be though
[16:09:05] <dmlloyd> s/transitivity/transience/
[16:09:07] <Jaikiran> actually the dependency is already established by the user code
[16:09:13] <Jaikiran> it has a @EJB in the startup servlet
[16:09:19] <dmlloyd> ah, in that case it's a simple bug!
[16:09:21] <Jaikiran> but that @EJB adds a dependency on a view
[16:09:25] <Jaikiran> not the component
[16:09:35] <dmlloyd> the view should in turn have a dep on the component iirc
[16:09:49] * Jaikiran takes a look at the view service
[16:09:59] <dmlloyd> it's been a long time since I've been in that code :)
[16:12:20] <Jaikiran> ok so here's what's going on
[16:12:33] <Jaikiran> the view service has a dep on Create service of the component
[16:12:43] <Jaikiran> which is just responsible for creating the Component
[16:13:27] <Jaikiran> a "started" component is controlled via a separate ComponentStartService on which this view service has no dep
[16:13:50] <Jaikiran> so effectively the invocation uses the view service to get hold of the created component
[16:13:51] <dmlloyd> ah yes, the two phase problem
[16:14:08] <Jaikiran> and "waits" for the component to be "started" by that other service
[16:14:29] <Jaikiran> yep
[16:14:31] <dmlloyd> you know we have a better solution to that problem in the jboss modules linker - maybe it is applicable here as well
[16:15:19] <dmlloyd> the simplest form of the problem is that any component A that depends on component B needs all of B's dependencies as well, even though there's no way to directly express that without a graph cycle
[16:16:04] <dmlloyd> the current solution as you say is to have services A1 and A2, and B1 and B2, and have A2 depend on B1 so that when A starts, all of its immediate deps are satisfied
[16:16:19] <dmlloyd> of course if there is an unsatisfied transient dep, A might end up in the very blocking situation we see now
[16:17:12] <dmlloyd> the solution would be for A1 and B1 to include *information* about all of their immediate dependencies, and have the definitions of A2 and B2 be built from the traversal of their immediate dependencies into the transitive closure of all dependencies
[16:17:31] <dmlloyd> it might require three phases to carry off properly in MSC but the basic idea should work
[16:17:56] <dmlloyd> but then A2 would have a dependency on the complete set of components required to make it go
[16:18:06] <dmlloyd> instead of just the first level of the graph
[16:18:20] * dmlloyd complex solution first!
[16:18:34] <Jaikiran> i kind of understand the solution you are proposing but i'll have to spend some more time to fully understand what its implications are
[16:18:44] <dmlloyd> yeah I expect I'll have to draw some pictures to be sure
[16:18:57] <Jaikiran> actually it's kind of along the lines of what we did in AS6 for the same problem via "switchboard"
[16:19:29] <dmlloyd> yeah I think it'd require a "definition" phase, a "resolve" phase, and a "start" phase...
[16:19:36] <dmlloyd> instead of create/start
[16:20:15] <Jaikiran> yeah
{quote}
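To make the last idea in the log a bit more concrete (A2 depending on the complete set of components it needs, instead of just the first level of the graph), here is a minimal sketch of computing that transitive closure from each component's immediate dependencies. It is only an illustration in plain Java collections and does not use the MSC API; the class name DependencyClosure, the method closure and the component names A to D are all made up for this example, and nothing here is meant as the actual fix.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: plain collections, not MSC services.
final class DependencyClosure {

    // Walks the immediate-dependency map starting at 'root' and collects
    // every component reachable from it. 'root' itself is never added,
    // even if the graph contains a cycle that leads back to it.
    static Set<String> closure(Map<String, List<String>> immediateDeps, String root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            List<String> deps = immediateDeps.get(current);
            if (deps == null) {
                continue; // component declares no dependencies
            }
            for (String dep : deps) {
                // Only queue a dependency the first time it is seen,
                // so cycles cannot cause an endless walk.
                if (!dep.equals(root) && seen.add(dep)) {
                    pending.push(dep);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // "Definition" phase: each component only records its immediate deps.
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("A", Arrays.asList("B"));
        deps.put("B", Arrays.asList("C", "D"));
        deps.put("D", Arrays.asList("C"));

        // "Resolve" phase: the start of A would depend on the full closure,
        // not just on B.
        System.out.println("immediate deps of A:  " + deps.get("A"));
        System.out.println("transitive deps of A: " + closure(deps, "A"));
    }
}
```

With the immediate dependencies alone, A only knows about B, which is the situation where A can end up blocked on something further down the graph. With the closure, an unsatisfied dependency anywhere below A would be visible before A is started.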
> jboss-ejb3.xml faults fail silently and result in a hanging thread
> ------------------------------------------------------------------
>
> Key: AS7-5946
> URL: https://issues.jboss.org/browse/AS7-5946
> Project: Application Server 7
> Issue Type: Bug
> Components: EE, EJB
> Affects Versions: 7.1.3.Final (EAP), 7.2.0.Alpha1
> Environment: MacOS X 10.8.2
> java version "1.6.0_35"
> Java(TM) SE Runtime Environment (build 1.6.0_35-b10-428-11M3811)
> Java HotSpot(TM) 64-Bit Server VM (build 20.10-b01-428, mixed mode)
> Reporter: Stephen Coy
> Assignee: David Lloyd
> Attachments: migration-demo.tar.gz
>
>
> The attached maven application project has a subtle typo in a JNDI name in the jboss-ejb3.xml file.
> The deployment process results in a hanging thread shown below. There are no diagnostic log messages at all to indicate what the problem could be.
> A second consequence of this hung thread is that the server process can only be terminated with a "kill -9 <pid>".
> Note that it is deliberately a JEE5 compatible application.
> "management-handler-thread - 4" prio=5 tid=7f843c0bf000 nid=0x10a866000 in Object.wait() [10a864000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <7d42d05f8> (a org.jboss.as.controller.ContainerStateMonitor)
> at java.lang.Object.wait(Object.java:485)
> at org.jboss.as.controller.ContainerStateMonitor.awaitContainerStateChangeReport(ContainerStateMonitor.java:158)
> - locked <7d42d05f8> (a org.jboss.as.controller.ContainerStateMonitor)
> at org.jboss.as.controller.ModelControllerImpl.awaitContainerStateChangeReport(ModelControllerImpl.java:442)
> at org.jboss.as.controller.OperationContextImpl.awaitModelControllerContainerMonitor(OperationContextImpl.java:147)
> at org.jboss.as.controller.AbstractOperationContext.doCompleteStep(AbstractOperationContext.java:261)
> at org.jboss.as.controller.AbstractOperationContext.completeStep(AbstractOperationContext.java:211)
> at org.jboss.as.server.deployment.DeploymentHandlerUtil$1.execute(DeploymentHandlerUtil.java:123)
> at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:397)
> at org.jboss.as.controller.AbstractOperationContext.doCompleteStep(AbstractOperationContext.java:284)
> at org.jboss.as.controller.AbstractOperationContext.completeStep(AbstractOperationContext.java:211)
> at org.jboss.as.server.deployment.DeploymentDeployHandler.execute(DeploymentDeployHandler.java:75)
> at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:397)
> at org.jboss.as.controller.AbstractOperationContext.doCompleteStep(AbstractOperationContext.java:284)
> at org.jboss.as.controller.AbstractOperationContext.completeStep(AbstractOperationContext.java:211)
> at org.jboss.as.server.deployment.DeploymentAddHandler.execute(DeploymentAddHandler.java:168)
> at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:397)
> at org.jboss.as.controller.AbstractOperationContext.doCompleteStep(AbstractOperationContext.java:284)
> at org.jboss.as.controller.AbstractOperationContext.completeStep(AbstractOperationContext.java:211)
> at org.jboss.as.controller.CompositeOperationHandler.execute(CompositeOperationHandler.java:85)
> at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:397)
> at org.jboss.as.controller.AbstractOperationContext.doCompleteStep(AbstractOperationContext.java:284)
> at org.jboss.as.controller.AbstractOperationContext.completeStep(AbstractOperationContext.java:211)
> at org.jboss.as.controller.ModelControllerImpl$DefaultPrepareStepHandler.execute(ModelControllerImpl.java:473)
> at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:397)
> at org.jboss.as.controller.AbstractOperationContext.doCompleteStep(AbstractOperationContext.java:284)
> at org.jboss.as.controller.AbstractOperationContext.completeStep(AbstractOperationContext.java:211)
> at org.jboss.as.controller.ModelControllerImpl.internalExecute(ModelControllerImpl.java:126)
> at org.jboss.as.controller.ModelControllerImpl.execute(ModelControllerImpl.java:111)
> at org.jboss.as.controller.remote.ModelControllerClientOperationHandler$ExecuteRequestHandler.doExecute(ModelControllerClientOperationHandler.java:139)
> at org.jboss.as.controller.remote.ModelControllerClientOperationHandler$ExecuteRequestHandler$1.execute(ModelControllerClientOperationHandler.java:108)
> at org.jboss.as.protocol.mgmt.AbstractMessageHandler$2$1.doExecute(AbstractMessageHandler.java:296)
> at org.jboss.as.protocol.mgmt.AbstractMessageHandler$AsyncTaskRunner.run(AbstractMessageHandler.java:518)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:680)
> at org.jboss.threads.JBossThread.run(JBossThread.java:122)
> Locked ownable synchronizers:
> - <7d42a5150> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> - <7d4f3b598> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)