[jboss-jira] [JBoss JIRA] (WFCORE-4152) HC cannot connect to DC after lost connect with error "WFLYCTL0332: Permission denied\"

Yeray Borges (Jira) issues at jboss.org
Fri Oct 5 11:28:00 EDT 2018


     [ https://issues.jboss.org/browse/WFCORE-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yeray Borges updated WFCORE-4152:
---------------------------------
    Steps to Reproduce: 
To reproduce the issue we need a domain mode environment with a master and a slave, where the slave authenticates with an RBAC user. We then need a reconnection while the models are out of sync. This can be achieved by setting the DC in admin-only mode, executing a management operation that affects the HC or the servers, and bringing the DC back. The HC is unable to connect in that scenario.

# Create a management user which will be used for DC / HC authentication
./bin/add-user.sh -u admin -p admin -ds
# Edit host-slave.xml and:
#* Replace the existing secret for the ManagementRealm security realm with the one generated for the user admin
#* Add the attribute username="admin" in the domain-controller/remote endpoint
# Start the DC: bin/domain.sh --host-config=host-master.xml
# Start the HC: bin/domain.sh --host-config=host-slave.xml -Djboss.domain.master.address=127.0.0.1 -Djboss.management.native.port=19999 -Djboss.domain.base.dir=slave
# Enable RBAC for the user 'admin':
{noformat}
/core-service=management/access=authorization:write-attribute(name=provider,value=rbac)
/core-service=management/access=authorization/role-mapping=SuperUser/include=ManagementRealm:add(name=admin,type=USER)
{noformat}
# Remove the local authentication:
{noformat}
/host=master/core-service=management/security-realm=ManagementRealm/authentication=local:remove
/host=slave/core-service=management/security-realm=ManagementRealm/authentication=local:remove
{noformat}
# Restart HC and DC
# Reload the DC in admin-only mode
{noformat}
reload --host=master --admin-only
{noformat}
# Change the domain model, for example modifying the jvm configuration used in a server group
{noformat}
/server-group=main-server-group/jvm=default:write-attribute(name=heap-size, value=500m)
{noformat}
# Reload the DC again, this time in normal mode
{noformat}
reload --host=master --admin-only=false
{noformat}
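
The last three steps can also be driven non-interactively. The following is only a sketch under stated assumptions: {{JBOSS_HOME}}, the controller address {{127.0.0.1:9990}} and the {{admin}}/{{admin}} credentials are taken from or assumed around the steps above, {{run_cli}} and {{reproduce}} are hypothetical helper names, and the final reload is written with {{--admin-only=false}} on the assumption that the last step is meant to bring the DC back to normal mode. By default it only prints the commands; set {{DRY_RUN=0}} to execute them.

```shell
#!/bin/sh
# Sketch: print (or run) the admin-only / model change / reload sequence.
# Assumptions: JBOSS_HOME, CONTROLLER and the admin/admin credentials.
JBOSS_HOME="${JBOSS_HOME:-.}"
CONTROLLER="${CONTROLLER:-127.0.0.1:9990}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the CLI commands

# Hypothetical helper: run one CLI command against the DC.
run_cli() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'jboss-cli: %s\n' "$1"
    else
        # Local authentication was removed, so credentials are passed explicitly.
        "$JBOSS_HOME/bin/jboss-cli.sh" --connect --controller="$CONTROLLER" \
            --user=admin --password=admin --command="$1"
    fi
}

reproduce() {
    # 1. Put the DC in admin-only mode so the HC loses its connection.
    run_cli 'reload --host=master --admin-only'
    # 2. Change the domain model while the HC is disconnected.
    run_cli '/server-group=main-server-group/jvm=default:write-attribute(name=heap-size, value=500m)'
    # 3. Bring the DC back; the HC then tries to sync the changed model.
    run_cli 'reload --host=master --admin-only=false'
}

reproduce
```

Running each command over a separate CLI connection keeps the sketch simple; an interactive CLI session works just as well.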


These messages are shown in the DC:

{noformat}
[Host Controller] 13:07:17,931 INFO  [org.jboss.as.protocol] (management I/O-2) WFLYPRT0057:  cancelled task by interrupting thread Thread[Host Controller Service Threads - 13,5,Host Controller Service Threads]
{noformat}



These messages are shown in the HC:

{noformat}
13:21:05,009 ERROR [org.jboss.as.host.controller] (Host Controller Service Threads - 9) WFLYHC0143: Failed to apply domain-wide configuration from master host controller. Operation outcome: failed. Failure description "WFLYCTL0313: Unauthorized to execute operation 'server-set-restart-required' for resource '[]' -- \"WFLYCTL0332: Permission denied\""
13:21:05,012 WARN  [org.jboss.as.host.controller] (Host Controller Service Threads - 3) WFLYHC0146: Could not discover master using discovery option StaticDiscovery{protocol=remote,host=127.0.0.1,port=9999}. Error was: 1-$-
13:21:05,012 WARN  [org.jboss.as.host.controller] (Host Controller Service Threads - 3) WFLYHC0147: No domain controller discovery options remain.
13:21:06,015 INFO  [org.jboss.as.host.controller] (Host Controller Service Threads - 3) WFLYHC0150: Trying to reconnect to master host controller.
{noformat}


These messages are shown in server-one:
{noformat}
13:21:04,829 ERROR [org.jboss.as.controller.management-operation] (ServerService Thread Pool -- 67) WFLYCTL0013: Operation ("server-set-reload-required") failed - address: ([]) - failure description: "WFLYCTL0313: Unauthorized to execute operation 'server-set-reload-required' for resource '[]' -- \"WFLYCTL0332: Permission denied\""
13:21:05,006 ERROR [org.jboss.as.controller.management-operation] (ServerService Thread Pool -- 67) WFLYCTL0013: Operation ("server-set-restart-required") failed - address: ([]) - failure description: "WFLYCTL0313: Unauthorized to execute operation 'server-set-restart-required' for resource '[]' -- \"WFLYCTL0332: Permission denied\""
{noformat}
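
To check quickly whether a reproduction attempt hit the bug, the HC log can be searched for the failed sync. This is a minimal sketch: {{has_sync_denied}} is a hypothetical helper, and the default log path assumes the HC was started with {{-Djboss.domain.base.dir=slave}} as in the steps above and uses the default log location.

```shell
#!/bin/sh
# Sketch: report whether an HC log shows the domain sync rejected by RBAC.
# has_sync_denied is a hypothetical helper name.
has_sync_denied() {
    # WFLYHC0143 (failed to apply domain-wide configuration) and
    # WFLYCTL0332 (permission denied) appear on the same HC log line.
    grep -q 'WFLYHC0143.*WFLYCTL0332' "$1"
}

# Assumed default location of the HC log; pass another path as $1 if needed.
LOG="${1:-slave/log/host-controller.log}"
if [ -f "$LOG" ] && has_sync_denied "$LOG"; then
    echo "domain sync rejected with Permission denied: $LOG"
fi
```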

  was:
To reproduce the issue we need a domain mode environment with a master and a slave, where the slave authenticates with an RBAC user. We then need a reconnection while the models are out of sync, which can be achieved by putting the DC in 
# Create a management user which will be used for DC / HC authentication
./bin/add-user.sh -u admin -p admin -g management -ds
# Edit host-slave.xml and:
#* Replace the existing secret for the ManagementRealm security realm with the one generated for the user admin
#* Add the attribute username="admin" in the domain-controller/remote endpoint
# Start the DC: bin/domain.sh --host-config=host-master.xml
# Start the HC: bin/domain.sh --host-config=host-slave.xml -Djboss.domain.master.address=127.0.0.1 -Djboss.management.native.port=19999 -Djboss.domain.base.dir=slave
# Enable RBAC for the user 'admin':
{noformat}
/core-service=management/access=authorization:write-attribute(name=provider,value=rbac)
/core-service=management/access=authorization/role-mapping=SuperUser/include=ManagementRealm:add(name=admin,type=USER)
{noformat}
# Restart HC
# This step is not required if you are using different machines for the DC and HC. On a single machine, it allows us to force the use of the EXTERNAL authentication mechanism instead of JBOSS-LOCAL-AUTH. Configure SSL for the management interface.
# Restart DC and HC
# Force a disconnection of the HC stopping the process
{noformat}
 ps -fea | grep 'host-slave' | grep 'Host Controller' | awk '{print $2}' | xargs kill -STOP
{noformat}
# After some seconds this error is displayed in the DC log:
{noformat}
[Host Controller] 13:04:53,840 WARN  [org.jboss.as.domain.controller] (management task-6) WFLYHC0030: Connection to remote host "slave" closed unexpectedly
{noformat}
# Change the domain model, for example modifying the jvm configuration used in a server group
{noformat}
/server-group=main-server-group/jvm=default:write-attribute(name=heap-size, value=500m)
{noformat}
# Send the continue signal to the HC process
{noformat}
ps -fea | grep 'host-slave' | grep 'Host Controller' | awk '{print $2}' | xargs kill -CONT
{noformat}


These messages are shown in the DC:

{noformat}
[Host Controller] 13:07:17,931 INFO  [org.jboss.as.protocol] (management I/O-2) WFLYPRT0057:  cancelled task by interrupting thread Thread[Host Controller Service Threads - 13,5,Host Controller Service Threads]
{noformat}



These messages are shown in the HC:

{noformat}
13:21:05,009 ERROR [org.jboss.as.host.controller] (Host Controller Service Threads - 9) WFLYHC0143: Failed to apply domain-wide configuration from master host controller. Operation outcome: failed. Failure description "WFLYCTL0313: Unauthorized to execute operation 'server-set-restart-required' for resource '[]' -- \"WFLYCTL0332: Permission denied\""
13:21:05,012 WARN  [org.jboss.as.host.controller] (Host Controller Service Threads - 3) WFLYHC0146: Could not discover master using discovery option StaticDiscovery{protocol=remote,host=127.0.0.1,port=9999}. Error was: 1-$-
13:21:05,012 WARN  [org.jboss.as.host.controller] (Host Controller Service Threads - 3) WFLYHC0147: No domain controller discovery options remain.
13:21:06,015 INFO  [org.jboss.as.host.controller] (Host Controller Service Threads - 3) WFLYHC0150: Trying to reconnect to master host controller.
{noformat}


These messages are shown in server-one:
{noformat}
13:21:04,829 ERROR [org.jboss.as.controller.management-operation] (ServerService Thread Pool -- 67) WFLYCTL0013: Operation ("server-set-reload-required") failed - address: ([]) - failure description: "WFLYCTL0313: Unauthorized to execute operation 'server-set-reload-required' for resource '[]' -- \"WFLYCTL0332: Permission denied\""
13:21:05,006 ERROR [org.jboss.as.controller.management-operation] (ServerService Thread Pool -- 67) WFLYCTL0013: Operation ("server-set-restart-required") failed - address: ([]) - failure description: "WFLYCTL0313: Unauthorized to execute operation 'server-set-restart-required' for resource '[]' -- \"WFLYCTL0332: Permission denied\""
{noformat}

*Note about reproducing the steps in wildfly-core*
The previous steps reproduce the issue if the HC connects to the DC using the remote protocol. Newer versions use remote+http by default for this connection, which makes the bug harder to reproduce by halting the HC process. To reproduce it with the same steps, either use the remote protocol or simulate out-of-sync DC and HC models: for example, stop the DC, manually change the heap size of the default JVM in a server group, and start the DC again. When the HC tries to reconnect, it will try to sync the domain model, and the issue will be reproduced.
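
The manual heap size change on a stopped DC could be scripted roughly as follows. This is a sketch, not a robust XML edit: {{set_heap_size}} is a hypothetical helper, the {{domain.xml}} path in the usage comment is an assumption, and the {{sed}} expression rewrites the {{size}} attribute of every {{<heap>}} element it finds rather than only the one in a specific server group, which is still enough to put the DC and HC models out of sync.

```shell
#!/bin/sh
# Sketch: change the JVM heap size in the domain.xml of a stopped DC.
# set_heap_size is a hypothetical helper; it edits every <heap size="..."> it finds.
set_heap_size() {
    file="$1"
    size="$2"
    tmp="$(mktemp)" || return 1
    # Replace only the value of the size attribute; max-size is left untouched.
    if sed -E 's/(<heap[[:space:]]+size=")[^"]*/\1'"$size"'/' "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# Usage (path is an assumption; run only while the DC is stopped):
#   set_heap_size domain/configuration/domain.xml 500m
```

A textual edit is used here only because the DC is down and the CLI is unavailable; with the DC in admin-only mode, the {{write-attribute}} operation is the safer route.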



> HC cannot connect to DC after lost connect with error "WFLYCTL0332: Permission denied\"
> ---------------------------------------------------------------------------------------
>
>                 Key: WFCORE-4152
>                 URL: https://issues.jboss.org/browse/WFCORE-4152
>             Project: WildFly Core
>          Issue Type: Bug
>          Components: Security
>         Environment: -- EAP 7.1.2 Domain mode
>            Reporter: Yeray Borges
>            Assignee: Yeray Borges
>            Priority: Major
>
> Customer has domain mode, they have the following enabled
> - RBAC
> - Management realm with ssl and ldap
> When HC is disconnected from the DC due to bad GC performance, it then cannot connect to the DC with the following errors
> 2018-08-15 04:30:19,035 WARN  [org.jboss.as.host.controller] (management task-3) WFLYHC0015: Connection to remote host-controller closed. Trying to reconnect.
> 2018-08-15 04:30:19,036 INFO  [org.jboss.as.host.controller] (Host Controller Service Threads - 149) WFLYHC0150: Trying to reconnect to master host controller.
> 2018-08-15 04:30:21,006 ERROR [org.jboss.as.host.controller] (Host Controller Service Threads - 151) WFLYHC0143: Failed to apply domain-wide configuration from master host controller. Operation outcome: failed. Failure description "WFLYCTL0313: Unauthorized to execute operation 'server-set-reload-required' for resource '[]' -- \"WFLYCTL0332: Permission denied\""
> Due to this, we are not able to restart any JVMs in this domain. The only way we could recover was to restart all DC/HC and JVMs. I have collected the logs and config files for DC/HC/JVM and I am uploading them to the case. Please review and let us know what the root cause of this issue is and what can be done to prevent it. 
> There is a management operation that requires reload in the log.



--
This message was sent by Atlassian Jira
(v7.12.1#712002)

