[
https://issues.jboss.org/browse/WFCORE-4152?page=com.atlassian.jira.plugi...
]
Yeray Borges updated WFCORE-4152:
---------------------------------
Steps to Reproduce:
To reproduce the issue we need a domain mode environment with a master and a slave,
where the slave uses an RBAC user for its authentication. We have to force a
reconnection while the domain model is out of sync. That can be achieved by putting the DC
in admin-only mode, executing a management operation that affects the HC or the servers,
and then bringing the DC back. The HC is unable to connect in that scenario.
# Create a management user which will be used for DC / HC authentication
./bin/add-user.sh -u admin -p admin -ds
# Edit host-slave.xml and:
#* Replace the existing secret for the ManagementRealm security realm with the one generated for
the user admin
#* Add the attribute username="admin" in the domain-controller/remote endpoint
# Start the DC: bin/domain.sh --host-config=host-master.xml
# Start the HC: bin/domain.sh --host-config=host-slave.xml
-Djboss.domain.master.address=127.0.0.1 -Djboss.management.native.port=19999
-Djboss.domain.base.dir=slave
# Enable RBAC for the user 'admin':
{noformat}
/core-service=management/access=authorization:write-attribute(name=provider,value=rbac)
/core-service=management/access=authorization/role-mapping=SuperUser/include=ManagementRealm:add(name=admin,type=USER)
{noformat}
# Remove the local authentication:
{noformat}
/host=master/core-service=management/security-realm=ManagementRealm/authentication=local:remove
/host=slave/core-service=management/security-realm=ManagementRealm/authentication=local:remove
{noformat}
# Restart HC and DC
# Reload the DC in admin-only mode
{noformat}
reload --host=master --admin-only
{noformat}
# Change the domain model, for example by modifying the JVM configuration used in a server
group
{noformat}
/server-group=main-server-group/jvm=default:write-attribute(name=heap-size, value=500m)
{noformat}
# Reload the DC
{noformat}
reload --host=master
{noformat}
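As a cross-check for step 2, the secret that add-user.sh prints with -ds is, as far as we know, just the Base64 encoding of the password, so it can be recomputed from the shell. This is a minimal sketch under that assumption; encode_secret is a hypothetical helper name, not part of WildFly:

```shell
# Hypothetical helper (not part of WildFly): Base64-encode a password, which is
# assumed to be the form expected by <secret value="..."/> in host-slave.xml.
encode_secret() {
    # printf avoids the trailing newline that echo would add to the input
    printf '%s' "$1" | base64
}

# Password 'admin' from the add-user.sh call in step 1
encode_secret admin
```

For the password admin this prints YWRtaW4=, which should match the value reported by add-user.sh.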
These messages are shown in the DC:
{noformat}
[Host Controller] 13:07:17,931 INFO [org.jboss.as.protocol] (management I/O-2)
WFLYPRT0057: cancelled task by interrupting thread Thread[Host Controller Service Threads
- 13,5,Host Controller Service Threads]
{noformat}
These messages are shown in the HC:
{noformat}
13:21:05,009 ERROR [org.jboss.as.host.controller] (Host Controller Service Threads - 9)
WFLYHC0143: Failed to apply domain-wide configuration from master host controller.
Operation outcome: failed. Failure description "WFLYCTL0313: Unauthorized to execute
operation 'server-set-restart-required' for resource '[]' --
\"WFLYCTL0332: Permission denied\""
13:21:05,012 WARN [org.jboss.as.host.controller] (Host Controller Service Threads - 3)
WFLYHC0146: Could not discover master using discovery option
StaticDiscovery{protocol=remote,host=127.0.0.1,port=9999}. Error was: 1-$-
13:21:05,012 WARN [org.jboss.as.host.controller] (Host Controller Service Threads - 3)
WFLYHC0147: No domain controller discovery options remain.
13:21:06,015 INFO [org.jboss.as.host.controller] (Host Controller Service Threads - 3)
WFLYHC0150: Trying to reconnect to master host controller.
{noformat}
These messages are shown in server-one:
{noformat}
13:21:04,829 ERROR [org.jboss.as.controller.management-operation] (ServerService Thread
Pool -- 67) WFLYCTL0013: Operation ("server-set-reload-required") failed -
address: ([]) - failure description: "WFLYCTL0313: Unauthorized to execute operation
'server-set-reload-required' for resource '[]' -- \"WFLYCTL0332:
Permission denied\""
13:21:05,006 ERROR [org.jboss.as.controller.management-operation] (ServerService Thread
Pool -- 67) WFLYCTL0013: Operation ("server-set-restart-required") failed -
address: ([]) - failure description: "WFLYCTL0313: Unauthorized to execute operation
'server-set-restart-required' for resource '[]' -- \"WFLYCTL0332:
Permission denied\""
{noformat}
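To confirm a reproduction without reading the logs by hand, we can search the HC log for the two message IDs shown above. This is a minimal sketch: the function name and the log path are assumptions (the path follows from -Djboss.domain.base.dir=slave and the usual log file name), so adjust them to your layout:

```shell
# Hypothetical helper: succeeds (exit 0) when the given log file contains both
# the failed domain sync (WFLYHC0143) and the RBAC denial (WFLYCTL0332).
shows_rbac_sync_failure() {
    grep -q 'WFLYHC0143' "$1" && grep -q 'WFLYCTL0332' "$1"
}

# Assumed location of the slave HC log; override HC_LOG if your base dir differs.
HC_LOG="${HC_LOG:-slave/log/host-controller.log}"

if [ -f "$HC_LOG" ] && shows_rbac_sync_failure "$HC_LOG"; then
    echo "failure signature found in $HC_LOG"
else
    echo "failure signature not found in $HC_LOG"
fi
```

The check looks at the whole file rather than a single line, so it works whether or not the log message is wrapped.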
was:
To reproduce the issue we need a domain mode environment with a master and a slave,
where the slave uses an RBAC user for its authentication. We have to force a
reconnection while the domain model is out of sync; that can be achieved by putting the DC in
# Create a management user which will be used for DC / HC authentication
./bin/add-user.sh -u admin -p admin -g management -ds
# Edit host-slave.xml and:
#* Replace the existing secret for the ManagementRealm security realm with the one generated for
the user admin
#* Add the attribute username="admin" in the domain-controller/remote endpoint
# Start the DC: bin/domain.sh --host-config=host-master.xml
# Start the HC: bin/domain.sh --host-config=host-slave.xml
-Djboss.domain.master.address=127.0.0.1 -Djboss.management.native.port=19999
-Djboss.domain.base.dir=slave
# Enable RBAC for the user 'admin':
{noformat}
/core-service=management/access=authorization:write-attribute(name=provider,value=rbac)
/core-service=management/access=authorization/role-mapping=SuperUser/include=ManagementRealm:add(name=admin,type=USER)
{noformat}
# Restart HC
# Configure SSL for the management interface. This step is not required if you are using
different machines for the DC and the HC. On a single machine, it allows us to force the
use of the EXTERNAL authentication mechanism instead of JBOSS-LOCAL-AUTH.
# Restart DC and HC
# Force a disconnection of the HC by suspending its process
{noformat}
ps -fea | grep 'host-slave' | grep 'Host Controller' | awk '{print $2}' | xargs kill -STOP
{noformat}
# After a few seconds, this error is displayed in the DC log:
{noformat}
[Host Controller] 13:04:53,840 WARN [org.jboss.as.domain.controller] (management task-6)
WFLYHC0030: Connection to remote host "slave" closed unexpectedly
{noformat}
# Change the domain model, for example by modifying the JVM configuration used in a server
group
{noformat}
/server-group=main-server-group/jvm=default:write-attribute(name=heap-size, value=500m)
{noformat}
# Send the continue signal to the HC process
{noformat}
ps -fea | grep 'host-slave' | grep 'Host Controller' | awk '{print $2}' | xargs kill -CONT
{noformat}
These messages are shown in the DC:
{noformat}
[Host Controller] 13:07:17,931 INFO [org.jboss.as.protocol] (management I/O-2)
WFLYPRT0057: cancelled task by interrupting thread Thread[Host Controller Service Threads
- 13,5,Host Controller Service Threads]
{noformat}
These messages are shown in the HC:
{noformat}
13:21:05,009 ERROR [org.jboss.as.host.controller] (Host Controller Service Threads - 9)
WFLYHC0143: Failed to apply domain-wide configuration from master host controller.
Operation outcome: failed. Failure description "WFLYCTL0313: Unauthorized to execute
operation 'server-set-restart-required' for resource '[]' --
\"WFLYCTL0332: Permission denied\""
13:21:05,012 WARN [org.jboss.as.host.controller] (Host Controller Service Threads - 3)
WFLYHC0146: Could not discover master using discovery option
StaticDiscovery{protocol=remote,host=127.0.0.1,port=9999}. Error was: 1-$-
13:21:05,012 WARN [org.jboss.as.host.controller] (Host Controller Service Threads - 3)
WFLYHC0147: No domain controller discovery options remain.
13:21:06,015 INFO [org.jboss.as.host.controller] (Host Controller Service Threads - 3)
WFLYHC0150: Trying to reconnect to master host controller.
{noformat}
These messages are shown in server-one:
{noformat}
13:21:04,829 ERROR [org.jboss.as.controller.management-operation] (ServerService Thread
Pool -- 67) WFLYCTL0013: Operation ("server-set-reload-required") failed -
address: ([]) - failure description: "WFLYCTL0313: Unauthorized to execute operation
'server-set-reload-required' for resource '[]' -- \"WFLYCTL0332:
Permission denied\""
13:21:05,006 ERROR [org.jboss.as.controller.management-operation] (ServerService Thread
Pool -- 67) WFLYCTL0013: Operation ("server-set-restart-required") failed -
address: ([]) - failure description: "WFLYCTL0313: Unauthorized to execute operation
'server-set-restart-required' for resource '[]' -- \"WFLYCTL0332:
Permission denied\""
{noformat}
*Note about reproducing these steps in wildfly-core*
The previous steps reproduce the issue if the HC connects to the DC using the remote
protocol. Newer versions use remote+http by default for that connection, which makes the
bug harder to reproduce by halting the HC process. To reproduce it there, either use the
remote protocol with the same steps, or simulate DC and HC models that are out of sync:
for example, stop the DC, manually change the heap size of the default JVM in a
server group, and start the DC again. When the HC tries to reconnect, it will try to sync
the domain model, and the issue will be reproduced.
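The manual variant described in this note can be sketched as a small edit of the DC's domain.xml while the DC is stopped. This is only a sketch under assumptions: set_heap_size is a hypothetical helper, the file path is the default layout, and the pattern assumes the heap is declared as a `<heap size="..." .../>` element. It rewrites every such element in the file, not only the one in main-server-group:

```shell
# Hypothetical helper: set the initial heap size on every <heap size="..."> element.
# $1 = path to domain.xml, $2 = new size (for example 500m)
set_heap_size() {
    # -i.bak edits in place and keeps a backup copy next to the original
    sed -i.bak "s/<heap size=\"[^\"]*\"/<heap size=\"$2\"/" "$1"
}

# Assumed default location of the DC's domain model; run only while the DC is stopped.
DOMAIN_XML="${DOMAIN_XML:-domain/configuration/domain.xml}"
if [ -f "$DOMAIN_XML" ]; then
    set_heap_size "$DOMAIN_XML" 500m
fi
```

After the DC is started again, the HC should try to sync the changed model on reconnection, as described above.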
HC cannot connect to DC after losing the connection, with error
"WFLYCTL0332: Permission denied"
---------------------------------------------------------------------------------------
Key: WFCORE-4152
URL:
https://issues.jboss.org/browse/WFCORE-4152
Project: WildFly Core
Issue Type: Bug
Components: Security
Environment: -- EAP 7.1.2 Domain mode
Reporter: Yeray Borges
Assignee: Yeray Borges
Priority: Major
The customer runs in domain mode with the following enabled:
- RBAC
- Management realm with SSL and LDAP
When the HC is disconnected from the DC due to bad GC performance, it cannot reconnect to
the DC and logs the following errors:
2018-08-15 04:30:19,035 WARN [org.jboss.as.host.controller] (management task-3)
WFLYHC0015: Connection to remote host-controller closed. Trying to reconnect.
2018-08-15 04:30:19,036 INFO [org.jboss.as.host.controller] (Host Controller Service
Threads - 149) WFLYHC0150: Trying to reconnect to master host controller.
2018-08-15 04:30:21,006 ERROR [org.jboss.as.host.controller] (Host Controller Service
Threads - 151) WFLYHC0143: Failed to apply domain-wide configuration from master host
controller. Operation outcome: failed. Failure description "WFLYCTL0313: Unauthorized
to execute operation 'server-set-reload-required' for resource '[]' --
\"WFLYCTL0332: Permission denied\""
Because of this, we are not able to restart any JVMs in this domain. The only way we could
recover was to restart all DC/HC and JVMs. I have collected the logs and config files
for DC/HC/JVM and am uploading them to the case. Please review and let us know the
root cause of this issue and what can be done to prevent it.
The log shows a management operation that requires a reload.