[jboss-jira] [JBoss JIRA] (WFCORE-4519) Slave Host Controller deployment repository is cleaned after a full deployment replacement

Yeray Borges (Jira) issues at jboss.org
Wed Jun 12 05:04:01 EDT 2019


     [ https://issues.jboss.org/browse/WFCORE-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yeray Borges updated WFCORE-4519:
---------------------------------
    Workaround Description: 
To avoid hitting the issue, use the server-group replace-deployment operation instead of deploy --force to update content in the server groups. For example:


{noformat}
[domain@localhost:9990 /] deploy /applications/test-application.war --name=test-application-v2.war --disabled

[domain@localhost:9990 /] /server-group=main-server-group:replace-deployment(name=test-application-v2.war, runtime-name=test-application.war, to-replace=test-application.war)
{noformat}

                Workaround: Workaround Exists


> Slave Host Controller deployment repository is cleaned after a full deployment replacement
> ------------------------------------------------------------------------------------------
>
>                 Key: WFCORE-4519
>                 URL: https://issues.jboss.org/browse/WFCORE-4519
>             Project: WildFly Core
>          Issue Type: Bug
>          Components: Management
>    Affects Versions: 9.0.1.Final
>         Environment: Domain mode, slave HC with deployments in its server groups
>            Reporter: Yeray Borges
>            Assignee: Yeray Borges
>            Priority: Major
>
> In domain mode, there is a cleanup task that removes obsolete content from the deployment repository of each process (DC, slave HC, and servers). By default, this task is executed every five minutes. 
> The task checks whether any content should be marked as obsolete. If so, the content is marked and then deleted on the next task execution.
> Deployment content is considered obsolete in a slave HC if there are no references to it, that is, if no server group has this deployment configured.
> The issue is that the deployment handler that replaces the deployment content in a slave HC does not add a reference to the new content when there are affected server groups.
> As a consequence, the cleanup task can delete the slave HC content. If this occurs while the servers are starting, they can fail to start with the following error:
> {noformat}
> 2019-06-12 08:51:32,813 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0013: Operation ("add") failed - address: ([("deployment" => "test-application.war")]) - failure description: "WFLYSRV0137: No deployment content with hash b1fb3b872b3490bbdbd152bd082791b1f170397d is available in the deployment content repository for deployment 'test-application.war'. This is a fatal boot error. To correct the problem, either restart with the --admin-only switch set and use the CLI to install the missing content or remove it from the configuration, or remove the deployment from the xml configuration file and restart."
> 2019-06-12 08:51:32,817 FATAL [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0056: Server boot has failed in an unrecoverable manner; exiting. See previous messages for details.
> 2019-06-12 08:51:32,833 INFO  [org.jboss.as] (MSC service thread 1-4) WFLYSRV0050: WildFly Full 17.0.0.Final-SNAPSHOT (WildFly Core 9.0.1.Final-SNAPSHOT) stopped in 5ms
> {noformat}
> The issue is difficult to hit because it is the server that requests the required files from the slave HC. To reproduce it, the timing must coincide: the server has requested a deployment file from its HC, the HC still has this file in its deployment repository but marked as obsolete, and the cleanup task removes the file before the HC sends it to the server.
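
The mark-then-delete behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not WildFly Core code: the class ContentRepositorySketch and its methods addContent, addReference, hasContent and cleanObsoleteContent are hypothetical names chosen for illustration, and the real task runs on a timer rather than being called by hand.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of a two-pass (mark, then delete) content cleanup. Hypothetical, not WildFly code. */
class ContentRepositorySketch {
    // content hash -> references to it (here: names of server groups that use it)
    private final Map<String, Set<String>> references = new HashMap<>();
    // hashes that were found unreferenced on the previous cleanup pass
    private final Set<String> markedObsolete = new HashSet<>();

    /** Stores content without registering any reference to it. */
    void addContent(String hash) {
        references.putIfAbsent(hash, new HashSet<>());
    }

    /** Registers a server group as a user of the content. */
    void addReference(String hash, String serverGroup) {
        references.computeIfAbsent(hash, h -> new HashSet<>()).add(serverGroup);
    }

    boolean hasContent(String hash) {
        return references.containsKey(hash);
    }

    /** One execution of the cleanup task. */
    void cleanObsoleteContent() {
        // Delete pass: remove content marked on the previous run that is still unreferenced.
        for (String hash : markedObsolete) {
            Set<String> refs = references.get(hash);
            if (refs != null && refs.isEmpty()) {
                references.remove(hash);
            }
        }
        markedObsolete.clear();
        // Mark pass: remember currently unreferenced content for the next run.
        for (Map.Entry<String, Set<String>> entry : references.entrySet()) {
            if (entry.getValue().isEmpty()) {
                markedObsolete.add(entry.getKey());
            }
        }
    }
}
```

In this model, content stored with addContent but never passed to addReference survives the first call to cleanObsoleteContent and disappears on the second, which mirrors what happens to replaced content on the slave HC when the reference is not added.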



--
This message was sent by Atlassian Jira
(v7.12.1#712002)

