[JBoss JIRA] (WFCORE-297) HC should remember the 'run' state of the server instances after crash or shutdown
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-297?page=com.atlassian.jira.plugin... ]
Brian Stansberry reassigned WFCORE-297:
---------------------------------------
Assignee: Emmanuel Hugonnet (was: Brian Stansberry)
> HC should remember the 'run' state of the server instances after crash or shutdown
> ----------------------------------------------------------------------------------
>
> Key: WFCORE-297
> URL: https://issues.jboss.org/browse/WFCORE-297
> Project: WildFly Core
> Issue Type: Feature Request
> Components: Domain Management
> Reporter: Wolf-Dieter Fink
> Assignee: Emmanuel Hugonnet
> Labels: EAP, todo
>
> The host controller should record which servers are currently up and running. This would allow the host controller to bring up all previously running instances on a restart.
> The idea is to support the same behavior that other application servers (e.g. WebLogic) support.
> If a server is started or stopped during the lifetime of the DC/HC, it should be in the same state after a shutdown of the DC/HC or a system crash.
> This can be achieved with an optional flag 'set-auto-start-on-start-stop', whose default is false, which is the current behaviour.
> If set to true, starting a server instance will set auto-start=true and stopping it will set auto-start=false.
> If the server should not be started after a crash for any reason, this can simply be done by setting auto-start=false within the configuration; after starting the server the flag will be set because of the 'set-auto-start-on-start-stop' flag.
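The flag semantics requested above could be sketched roughly as follows. This is only an illustration of the proposal, not WildFly Core code: the class AutoStartTracker, its methods, and the in-memory map standing in for the persisted host configuration are all hypothetical, and treating an unconfigured server as "do not start" is a choice made for the sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these names exist in WildFly Core.
final class AutoStartTracker {
    // Mirrors the proposed 'set-auto-start-on-start-stop' flag; false keeps the current behaviour.
    private final boolean setAutoStartOnStartStop;
    // Server name -> auto-start value; stands in for the persisted host configuration.
    private final Map<String, Boolean> autoStart = new ConcurrentHashMap<>();

    AutoStartTracker(boolean setAutoStartOnStartStop) {
        this.setAutoStartOnStartStop = setAutoStartOnStartStop;
    }

    // An explicit configuration change, e.g. an admin setting auto-start=false.
    void configure(String server, boolean value) {
        autoStart.put(server, value);
    }

    // Called when a server instance has been started.
    void onServerStarted(String server) {
        if (setAutoStartOnStartStop) {
            autoStart.put(server, true);
        }
    }

    // Called when a server instance has been stopped on request.
    void onServerStopped(String server) {
        if (setAutoStartOnStartStop) {
            autoStart.put(server, false);
        }
    }

    // What the host controller would consult after a restart or crash.
    // Unconfigured servers are treated as "do not start" in this sketch only.
    boolean shouldStartOnBoot(String server) {
        return autoStart.getOrDefault(server, Boolean.FALSE);
    }
}
```

With the flag enabled, a start followed by a crash leaves auto-start=true, so the instance would come back up; with the flag disabled, only explicit configuration changes affect the stored value.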
--
This message was sent by Atlassian JIRA
(v6.3.8#6338)
10 years, 1 month
[JBoss JIRA] (WFCORE-232) Unclean shutdown of deployment scanner
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-232?page=com.atlassian.jira.plugin... ]
Brian Stansberry reassigned WFCORE-232:
---------------------------------------
Assignee: Emmanuel Hugonnet
> Unclean shutdown of deployment scanner
> --------------------------------------
>
> Key: WFCORE-232
> URL: https://issues.jboss.org/browse/WFCORE-232
> Project: WildFly Core
> Issue Type: Bug
> Components: Domain Management
> Affects Versions: 1.0.0.Alpha11
> Reporter: Brian Stansberry
> Assignee: Emmanuel Hugonnet
>
> I happened to see this in a testsuite log file:
> {code}
> 2014-11-07 02:59:41,808 ERROR [org.jboss.as.server.deployment.scanner] (DeploymentScanner-threads - 1) WFLYDS0012: Scan of /opt/buildAgent/work/8feb0abb503148fe/testsuite/manualmode/target/deployment-test-bc636b67-df54-462d-9a13-3618a1755d3f threw Exception: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@146a4b2 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@e6cbcd[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 3]
> at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
> at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325)
> at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530)
> at java.util.concurrent.ScheduledThreadPoolExecutor.execute(ScheduledThreadPoolExecutor.java:619)
> at org.jboss.as.controller.ModelControllerImpl$3.executeAsync(ModelControllerImpl.java:669)
> at org.jboss.as.controller.ModelControllerImpl$3.executeAsync(ModelControllerImpl.java:605)
> at org.jboss.as.server.deployment.scanner.DefaultDeploymentOperations.deploy(DefaultDeploymentOperations.java:61)
> at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService.scan(FileSystemDeploymentService.java:449)
> at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService$UndeployScanRunnable.run(FileSystemDeploymentService.java:538)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> at org.jboss.threads.JBossThread.run(JBossThread.java:320)
> {code}
> DeploymentScannerService calls stopScanner() on the scanner and then shuts down the executor. But stopScanner() does not wait for in-progress tasks to complete, nor do the in-progress tasks recognize that shutdown is occurring and handle any problems differently (e.g. don't toss the above in the log).
> Waiting for tasks to complete could be tricky, e.g. imagine this race:
> 1) Admin submits op to shut down/reload, which holds the controller lock and is now stopping services.
> 2) Scan job kicks off, finds changes and wants to modify the model, so the task is blocking waiting for the controller lock.
> Deadlock.
> Now, this may not be a problem, as server reload tells MSC to remove the root service on the way out (i.e. after the bit where it blocks for stability) and then MSC threads take over, allowing the op thread to continue and release the lock. The shutdown handler spawns a thread to call System.exit.
> But ^^^ is just a quick look, so be cautious!
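One possible shape for the two ideas in the report (in-progress tasks noticing that shutdown is under way, and the stop path waiting for them) is sketched below using only java.util.concurrent. The class QuietScanner and its methods are hypothetical and are not the actual DeploymentScannerService or FileSystemDeploymentService code. The wait is bounded by a timeout, which is an assumption made here so that a task stuck waiting for the controller lock cannot block the stop indefinitely.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; not the actual deployment scanner implementation.
final class QuietScanner {
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    private final AtomicBoolean stopping = new AtomicBoolean(false);

    // Hands follow-up work to the executor.
    // Returns false if the work was skipped or rejected because shutdown is in progress.
    boolean submitFollowUp(Runnable work) {
        if (stopping.get()) {
            return false; // shutdown already requested, do not even try
        }
        try {
            executor.execute(work);
            return true;
        } catch (RejectedExecutionException e) {
            if (stopping.get() || executor.isShutdown()) {
                return false; // expected while shutting down, so nothing is logged at ERROR
            }
            throw e; // an unexpected rejection is still a real problem
        }
    }

    // Stops scanning and waits a bounded time for in-progress work.
    // Returns true if all work finished before the timeout.
    boolean stop(long timeout, TimeUnit unit) {
        stopping.set(true);
        executor.shutdown(); // no new tasks are accepted; already submitted ones still run
        try {
            return executor.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Whether a bounded wait is acceptable depends on the locking behaviour discussed above; the sketch only shows that a RejectedExecutionException raised during shutdown can be told apart from one raised at any other time.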
[JBoss JIRA] (WFCORE-402) Get core resources and operations out of the controller module
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-402?page=com.atlassian.jira.plugin... ]
Brian Stansberry updated WFCORE-402:
------------------------------------
Fix Version/s: (was: 1.0.0.Beta1)
> Get core resources and operations out of the controller module
> --------------------------------------------------------------
>
> Key: WFCORE-402
> URL: https://issues.jboss.org/browse/WFCORE-402
> Project: WildFly Core
> Issue Type: Task
> Components: Domain Management
> Reporter: Brian Stansberry
>
> The controller module has a lot of classes related to core resources (resource defs, op handlers, description providers for things like interfaces) that are only in there so they can be commonly accessible to both server and host-controller. This is unnecessary, since host-controller depends on server.
> Either move these to server or create a separate module for this stuff; restrict the controller module to the true core ModelController stuff.
[JBoss JIRA] (WFCORE-398) Avoid running out of threads when connecting to the DC from a slave to pull down missing data
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-398?page=com.atlassian.jira.plugin... ]
Brian Stansberry resolved WFCORE-398.
-------------------------------------
Fix Version/s: (was: 1.0.0.Beta1)
Resolution: Out of Date
I don't believe this thread starvation problem still exists, as the pool used for these requests is not limited. Emmanuel, please re-open if I'm incorrect.
> Avoid running out of threads when connecting to the DC from a slave to pull down missing data
> ---------------------------------------------------------------------------------------------
>
> Key: WFCORE-398
> URL: https://issues.jboss.org/browse/WFCORE-398
> Project: WildFly Core
> Issue Type: Feature Request
> Components: Domain Management
> Reporter: Kabir Khan
> Assignee: Emanuel Muckenhuber
> Priority: Blocker
>
> For WFLY-259 when a slave connects to the DC to pull down missing data, it does this by either getting a lock for the DC, or by joining the permit of the existing DC lock if the request to update a slave's server-config was executed as part of a composite obtaining a lock on the DC.
> As it works at present, there is one thread per slave, which is blocked until the transaction completes. The DC threads are a finite resource, so a large number of slaves trying to pull down data will cause deadlock.
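The starvation pattern described in the issue can be reproduced in miniature with plain java.util.concurrent. The sketch below is illustrative only and has nothing to do with the actual DC locking or remoting code: the class StarvationDemo, the "slave" tasks, and the "commit" task are all hypothetical stand-ins for one blocked thread per slave waiting on work that needs a thread from the same pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative only; not the WildFly Core domain controller code.
final class StarvationDemo {
    // Submits `slaves` tasks that each block until a final "commit" task runs on the same pool.
    // Returns true if every slave task completed within the timeout.
    static boolean run(ExecutorService pool, int slaves, long timeoutMillis) {
        CountDownLatch committed = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(slaves);
        try {
            for (int i = 0; i < slaves; i++) {
                pool.execute(() -> {
                    try {
                        committed.await(); // one blocked thread per slave
                        done.countDown();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // The work that would unblock the slaves also needs a thread from the pool.
            pool.execute(committed::countDown);
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow(); // interrupt any tasks that are still blocked
        }
    }
}
```

Calling StarvationDemo.run(Executors.newFixedThreadPool(2), 2, 300) returns false, because both pool threads are blocked and the commit task never gets to run, whereas the same call with Executors.newCachedThreadPool() completes. That matches the resolution comment's point that an unlimited pool avoids the problem.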