[jboss-jira] [JBoss JIRA] (WFCORE-1418) Reloading host-controller via http-api puts the HC into unresponsive state

Wed Apr 20 17:14:00 EDT 2016

    [ https://issues.jboss.org/browse/WFCORE-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194815#comment-13194815 ] 

Roy Willemse commented on WFCORE-1418:
--------------------------------------

I've seen similar behaviour when reloading WildFly 10.0.0.Final inside a Docker container.

- Reload via JBoss-CLI works fine
- The problem could not be reproduced when running locally on OS X, or on CentOS 7 running on vSphere

{code}
docker run -it java # I also tried centos:7 with OpenJDK 8
# Inside the container:
WILDFLY_VERSION=10.0.0.Final
curl http://download.jboss.org/wildfly/$WILDFLY_VERSION/wildfly-$WILDFLY_VERSION.tar.gz | tar -C ~ -zx
cd ~/wildfly-$WILDFLY_VERSION
bin/add-user.sh admin admin --silent
bin/standalone.sh &
curl --digest -L -D - http://localhost:9990/management --header "Content-Type: application/json" -d '{"operation":"reload","name":"","json.pretty":1}' -u admin:admin
{code}
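
To put a number on the unresponsive window, something along these lines could be run in the same container. It is only a rough sketch: MGMT_URL, MGMT_USER and the helper function names are my own assumptions, and the request body drops the empty "name" field from the command above. It polls with the standard read-resource operation until the management API answers again.

```shell
#!/bin/sh
# Sketch: time how long the management API stays unresponsive after a reload.
# MGMT_URL, MGMT_USER and all helper names are assumptions for illustration.
MGMT_URL="${MGMT_URL:-http://localhost:9990/management}"
MGMT_USER="${MGMT_USER:-admin:admin}"

# Build the JSON body for a management operation on the root resource.
mgmt_payload() {
    printf '{"operation":"%s","json.pretty":1}' "$1"
}

# POST an operation; succeeds only if the server answers within 5 seconds.
mgmt_call() {
    curl --silent --fail --digest --max-time 5 -u "$MGMT_USER" \
        --header "Content-Type: application/json" \
        -d "$(mgmt_payload "$1")" "$MGMT_URL" >/dev/null
}

# Poll until the API answers again, then print the elapsed seconds.
# $1 = maximum number of seconds to wait (default 600).
wait_for_management() {
    limit="${1:-600}"
    start=$(date +%s)
    until mgmt_call read-resource; do
        now=$(date +%s)
        if [ $((now - start)) -ge "$limit" ]; then
            echo "still unresponsive after ${limit}s" >&2
            return 1
        fi
        sleep 5
    done
    echo $(( $(date +%s) - start ))
}
```

Usage would be roughly `mgmt_call reload; wait_for_management`. The reload call itself may hit the 5 second limit, which is fine here since only the polling result matters.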

After 5 minutes WildFly logs an ERROR and then starts normally:

{code}
19:21:50,837 INFO  [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-2) WFLYJCA0010: Unbound data source [java:jboss/datasources/ExampleDS]
19:21:50,847 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) WFLYUT0019: Host default-host stopping
19:21:50,857 INFO  [org.jboss.as.connector.deployers.jdbc] (MSC service thread 1-1) WFLYJCA0019: Stopped Driver service with driver-name = h2
19:21:50,874 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-1) WFLYUT0008: Undertow HTTP listener default suspending
19:21:50,875 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-1) WFLYUT0007: Undertow HTTP listener default stopped, was bound to 172.17.0.2:8080
19:21:50,877 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-1) WFLYUT0004: Undertow 1.3.15.Final stopping

# After 5 minutes:

19:26:50,529 ERROR [org.jboss.as.controller.management-operation] (management task-10) WFLYCTL0349: Timeout after [300] seconds waiting for service container stability while finalizing an operation. Process must be restarted. Step that first updated the service container was 'reload' at address '[]'
19:26:50,555 INFO  [org.jboss.as.mail.extension] (MSC service thread 1-1) WFLYMAIL0002: Unbound mail session [java:jboss/mail/Default]
19:26:50,568 INFO  [org.jboss.as] (MSC service thread 1-1) WFLYSRV0050: WildFly Full 10.0.0.Final (WildFly Core 2.0.10.Final) stopped in 300062ms
19:26:50,573 INFO  [org.jboss.as] (MSC service thread 1-1) WFLYSRV0049: WildFly Full 10.0.0.Final (WildFly Core 2.0.10.Final) starting
{code}
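
For anyone checking their own logs for the same symptom, a small filter like the one below might help. It is a sketch based only on the line format shown in the log excerpt in this comment; the function name is made up, and the pattern also tolerates a process prefix such as "[Host Controller]" in front of the timestamp.

```shell
#!/bin/sh
# Sketch: list WFLYCTL0349 stability timeouts found in a WildFly log.
# The function name is hypothetical; the pattern follows the log excerpt above.

# Reads log lines on stdin and prints "<timestamp> <timeout-seconds>"
# for every WFLYCTL0349 ERROR line; all other lines are ignored.
find_stability_timeouts() {
    sed -n 's/.*\([0-9][0-9]:[0-9][0-9]:[0-9][0-9],[0-9]*\) *ERROR .*WFLYCTL0349: Timeout after \[\([0-9]*\)\] seconds.*/\1 \2/p'
}
```

For a standalone server with the default layout this would typically be run as `find_stability_timeouts < standalone/log/server.log`.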

> Reloading host-controller via http-api puts the HC into unresponsive state
> --------------------------------------------------------------------------
>
>                 Key: WFCORE-1418
>                 URL: https://issues.jboss.org/browse/WFCORE-1418
>             Project: WildFly Core
>          Issue Type: Bug
>          Components: Domain Management
>    Affects Versions: 2.0.10.Final
>            Reporter: Tomaz Cerar
>            Assignee: Tomaz Cerar
>            Priority: Blocker
>             Fix For: 2.1.0.Final
>
>
> Reloading host-controller via http-api puts the HC into unresponsive state.
> *reproduce*
>  \- create an administrative user admin:asdasd@2
>  \- start a domain
>  \- reload the host controller via the HTTP API
> {noformat}
> curl --digest -L -D - http://localhost:9990/management --header "Content-Type: application/json" -d '{"operation":"reload","name":"", "address":{"host" : "master"},"json.pretty":1}' -u admin:asdasd@2
> {noformat}
> *actual*
> Default server instances are stopped and the HC is left in an unresponsive state.
> If the domain is kept alive, the following message appears after 5 minutes, and the domain becomes responsive again after that.
> {noformat}
> [Host Controller] 04:47:23,966 ERROR [org.jboss.as.controller.management-operation] (management task-7) WFLYCTL0349: Timeout after [300] seconds waiting for service container stability while finalizing an operation. Process must be restarted. Step that first updated the service container was 'reload' at address '[("host" => "master")]'
> {noformat}
> *expected*
> Domain is reloaded
> *additional info*
> The issue was introduced by the fix for JBEAP-2751 - https://github.com/jbossas/wildfly-core-eap/commit/4986773a51fbf43ad911aecc403f7eb90e72a8c2
> Thread dump of the unresponsive HC:
> http://pastebin.test.redhat.com/348732
> I am unable to reproduce locally, but the issue can be easily reproduced on slower servers in the MWQE lab. SSLMasterSlave*WayTestCase, which reloads via the http-api, is causing failures in the domain modules of the wf-core testsuite (e.g. [eap-7x-as-testsuite-test-core-rhel|https://url.corp.redhat.com/9f1f544] )
> This is a regression against 7.0.0.ER4. I was able to reproduce it with the latest wildfly-core bits as well (1be598e)
