[
https://issues.jboss.org/browse/WFCORE-1791?page=com.atlassian.jira.plugi...
]
Brian Stansberry commented on WFCORE-1791:
------------------------------------------
Thanks, Yeray!
The bit I wasn't seeing is that ServerStartTask, which is only run once per server
process and isn't re-run in a reload, passes the DomainServerCommunicationServices
into the Bootstrap.bootstrap method (via the List<ServiceActivator> param), which in
turn passes it to ApplicationServerService. That means that when the reload triggers a new
start() of ApplicationServerService, DomainServerCommunicationServices.activate gets run
again and the new initialOperationId value is used.
I mistakenly thought DomainServerCommunicationServices.activate was only called once in
the lifetime of a server process.
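To make the corrected understanding concrete, here is a minimal sketch. All class and method names in it (Activator, CommunicationServices, ApplicationServer, idUsedAfterReload) are hypothetical stand-ins, not the real WildFly Core types. It only assumes what is stated above: the activator instance is retained across reloads and its activate method runs again on every start().

```java
import java.util.Collections;
import java.util.List;

public class ReactivationOnReloadSketch {

    // Hypothetical stand-in for a ServiceActivator; returns the id it would hand on.
    interface Activator {
        int activate();
    }

    // Hypothetical stand-in for DomainServerCommunicationServices.
    static final class CommunicationServices implements Activator {
        private volatile int initialOperationId;

        void setInitialOperationId(int id) {
            this.initialOperationId = id;
        }

        @Override
        public int activate() {
            // Reads the current value each time it is activated.
            return initialOperationId;
        }
    }

    // Hypothetical stand-in for ApplicationServerService: keeps the activator list
    // it was given at boot and re-runs every activator on each start().
    static final class ApplicationServer {
        private final List<Activator> activators;

        ApplicationServer(List<Activator> activators) {
            this.activators = activators;
        }

        int start() {
            int lastId = -1;
            for (Activator activator : activators) {
                lastId = activator.activate();
            }
            return lastId;
        }
    }

    // Boots with bootId, then simulates a reload that first sets reloadId.
    static int idUsedAfterReload(int bootId, int reloadId) {
        CommunicationServices services = new CommunicationServices();
        services.setInitialOperationId(bootId);
        ApplicationServer server = new ApplicationServer(Collections.singletonList(services));
        server.start();                              // initial boot
        services.setInitialOperationId(reloadId);    // reload handler updates the id
        return server.start();                       // reload triggers a new start()
    }

    public static void main(String[] args) {
        System.out.println(idUsedAfterReload(1, 2));
    }
}
```

Because the same activator instance is reused, the value set just before the second start() is the one that gets picked up.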
Strange operation-id handling in domain server reload execution
---------------------------------------------------------------
Key: WFCORE-1791
URL: https://issues.jboss.org/browse/WFCORE-1791
Project: WildFly Core
Issue Type: Task
Components: Domain Management
Reporter: Brian Stansberry
Assignee: Yeray Santana Borges
Priority: Minor
When the HC sends a reload op to a managed server it includes an undocumented
"operation-id" parameter. However, I don’t see how it is used with a reload. When it
was added to the code the intent clearly was that it would be used, but at least now
it is not. ServerDomainProcessReloadHandler reads it from the op and sets
DomainServerCommunicationServices.initialOperationId, but that field is only read when
HostControllerConnectionService is instantiated. HostControllerConnectionService then
caches the value in a final field. A reload does not result in a new instantiation of
HostControllerConnectionService; that object is only instantiated during initial process
boot when ServerStartTask is unmarshaled from stdin and run. So changing the
DomainServerCommunicationServices.initialOperationId in a reload should do nothing.
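The concern as described in this paragraph can be sketched as follows. The classes below (CommunicationServices, ConnectionService) and the helper idSeenAfterReload are hypothetical stand-ins for illustration only. The sketch assumes the premise of the description, namely that the connection service is built once at boot and never re-created on reload.

```java
public class OperationIdCachingSketch {

    // Hypothetical stand-in for DomainServerCommunicationServices: a mutable id holder.
    static final class CommunicationServices {
        private volatile int initialOperationId;

        void setInitialOperationId(int id) {
            this.initialOperationId = id;
        }

        int getInitialOperationId() {
            return initialOperationId;
        }
    }

    // Hypothetical stand-in for HostControllerConnectionService: copies the id
    // into a final field at construction time and never looks at the source again.
    static final class ConnectionService {
        private final int initialOperationId;

        ConnectionService(CommunicationServices services) {
            this.initialOperationId = services.getInitialOperationId();
        }

        int operationIdUsedOnConnect() {
            return initialOperationId;
        }
    }

    // Builds the connection service once (boot), then updates the id (reload)
    // without re-creating the connection service.
    static int idSeenAfterReload(int bootId, int reloadId) {
        CommunicationServices services = new CommunicationServices();
        services.setInitialOperationId(bootId);
        ConnectionService connection = new ConnectionService(services);
        services.setInitialOperationId(reloadId); // what the reload handler does
        return connection.operationIdUsedOnConnect();
    }

    public static void main(String[] args) {
        System.out.println(idSeenAfterReload(1, 2));
    }
}
```

Under that premise the cached boot-time value is the one that survives. The comment at the top of this message explains why the premise does not hold in practice: DomainServerCommunicationServices.activate is run again on reload.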
https://github.com/wildfly/wildfly-core/commit/302949cf60823d8aa3989d74df...
is the commit in which this update of the id in reload was first added. The intent was
that by providing this id, when the reloading server connects to the HC to get the boot
ops, that read of boot ops would be able to "join" any active operation that
triggered the reload, and thus would not have to block waiting for that operation to
complete.
Afaict, if the "blocking" param on an op like /host=x/server[-config]=y:reload
is set to 'true' the op should deadlock. On the HC, ServerReloadHandler will
acquire the exclusive lock by calling context.getServiceRegistry(true). Then
ServerInventoryImpl.reloadServer will block waiting for the server to reach STARTED state.
But the server won't reach that state because its registration request will not be able
to acquire the HC lock.
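The suspected deadlock can be modelled roughly like this. It is a sketch under stated assumptions, not the real HC code: a ReentrantLock stands in for the HC exclusive lock, a CountDownLatch stands in for the server reaching STARTED, and timeouts are added so the demonstration terminates instead of hanging. The class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class BlockingReloadDeadlockSketch {

    // Returns true if the simulated server reached STARTED while the
    // simulated blocking reload was still waiting for it.
    static boolean blockingReloadCompletes(long timeoutMillis) {
        ReentrantLock exclusiveLock = new ReentrantLock();   // stand-in for the HC lock
        CountDownLatch started = new CountDownLatch(1);      // stand-in for STARTED state

        // Simulated server: registration needs the HC lock before it can report STARTED.
        // It gives up after half the reload timeout so it never outlives the waiter.
        Thread server = new Thread(() -> {
            try {
                if (exclusiveLock.tryLock(timeoutMillis / 2, TimeUnit.MILLISECONDS)) {
                    try {
                        started.countDown();
                    } finally {
                        exclusiveLock.unlock();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Simulated reload handler: take the exclusive lock, then block on STARTED.
        exclusiveLock.lock();
        try {
            server.start();
            return started.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            exclusiveLock.unlock();
            joinQuietly(server);
        }
    }

    private static void joinQuietly(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("reload completed: " + blockingReloadCompletes(200));
    }
}
```

In this model the waiter holds the lock for the whole wait, so the registration can never get in. Letting the registration "join" the active operation, which is what the operation-id was meant to enable, would correspond to the server not needing to acquire the lock separately.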
Task here is to
1) Confirm the above and then
2) Either
a) get the operation-id propagated
b) or rip the operation-id bit out of reload if investigation shows it is not
needed.