[JBoss JIRA] (WFCORE-1791) Strange operation-id handling in domain server reload execution

Tuesday, 25 October 2016

    [
https://issues.jboss.org/browse/WFCORE-1791?page=com.atlassian.jira.plugi...
] 

Brian Stansberry commented on WFCORE-1791:
------------------------------------------

Thanks, Yeray!

The bit I wasn't seeing is that ServerStartTask, which is only run once per server
process and isn't re-run in a reload, passes the DomainServerCommunicationServices
into the Bootstrap.bootstrap method (via the List<ServiceActivator> param), which in
turn passes it to ApplicationServerService. That means that when the reload triggers a new
start() of ApplicationServerService, DomainServerCommunicationServices.activate gets run
again and the new initialOperationId value is used.
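
A minimal sketch of that mechanism, for illustration only. The classes below are hypothetical stand-ins named after the roles described above, not the WildFly Core classes, and the only assumption is that the service keeps the activator list for the life of the process and calls activate() on every start():

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; these are NOT the WildFly Core classes.
final class ReloadActivationSketch {

    /** Plays the role of a ServiceActivator. */
    interface Activator {
        void activate();
    }

    /** Plays the role of DomainServerCommunicationServices. */
    static final class CommunicationServices implements Activator {
        static volatile Integer initialOperationId; // updated by the reload handler
        Integer lastSeenOperationId;                // what the latest activate() observed

        @Override
        public void activate() {
            lastSeenOperationId = initialOperationId;
        }
    }

    /** Plays the role of ApplicationServerService: keeps the activators for the process lifetime. */
    static final class ServerService {
        private final List<Activator> activators;

        ServerService(List<Activator> activators) {
            this.activators = activators;
        }

        void start() {
            for (Activator activator : activators) {
                activator.activate();
            }
        }
    }

    public static void main(String[] args) {
        CommunicationServices services = new CommunicationServices();
        // Wired once, at initial boot (the ServerStartTask role).
        ServerService server = new ServerService(Arrays.<Activator>asList(services));

        CommunicationServices.initialOperationId = 1;
        server.start(); // initial boot

        CommunicationServices.initialOperationId = 2; // set by the reload handler
        server.start(); // reload: the same activator instance runs again

        System.out.println(services.lastSeenOperationId); // prints 2
    }
}
```

The point is that the activator object is created once, but its activate() is not a once-per-process call, so a static value changed before the second start() is picked up.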

I mistakenly thought DomainServerCommunicationServices.activate was only called once in
the lifetime of a server process.

...
 Strange operation-id handling in domain server reload execution
 ---------------------------------------------------------------

                 Key: WFCORE-1791
                 URL: https://issues.jboss.org/browse/WFCORE-1791
             Project: WildFly Core
          Issue Type: Task
          Components: Domain Management
            Reporter: Brian Stansberry
            Assignee: Yeray Santana Borges
            Priority: Minor

 When the HC sends a reload op to a managed server it includes an undocumented
"operation-id" parameter. But I don’t see how it is used with a reload. When it
was added to the code the intent clearly was that it would be used, but now, at least, it
is not. ServerDomainProcessReloadHandler reads it from the op and sets
DomainServerCommunicationServices.initialOperationId, but that field is only read when
HostControllerConnectionService is instantiated. HostControllerConnectionService then
caches the value in a final field. A reload does not result in a new instantiation of
HostControllerConnectionService; that object is only instantiated during initial process
boot when ServerStartTask is unmarshaled from stdin and run. So changing the
DomainServerCommunicationServices.initialOperationId in a reload should do nothing.
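
 The premise of that paragraph can be sketched as follows. This is an illustration only: the classes are hypothetical stand-ins, not the WildFly Core code, and the sketch shows nothing more than the general behavior of a value copied into a final field at construction. Whether the real service is re-created during a reload is a separate question, which the comment at the top of this message addresses.

```java
// Hypothetical stand-ins; these are NOT the WildFly Core classes.
final class CachedOperationIdSketch {

    /** Plays the role of DomainServerCommunicationServices.initialOperationId. */
    static volatile Integer initialOperationId;

    /** Plays the role of HostControllerConnectionService. */
    static final class ConnectionService {
        private final Integer operationId; // captured once, at construction

        ConnectionService() {
            this.operationId = initialOperationId;
        }

        Integer operationId() {
            return operationId;
        }
    }

    public static void main(String[] args) {
        initialOperationId = 1;
        ConnectionService atBoot = new ConnectionService();

        initialOperationId = 2; // what a reload handler would do

        System.out.println(atBoot.operationId());                  // prints 1: the cached value
        System.out.println(new ConnectionService().operationId()); // prints 2: only a new instance sees it
    }
}
```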
 https://github.com/wildfly/wildfly-core/commit/302949cf60823d8aa3989d74df...
is the initial commit when this update of the id in reload was added in. The intent was
that by providing this id, when the reloading server connects to the HC to get the boot
ops, that read of boot ops would be able to "join" any active operation that
triggered the reload, and thus would not have to block waiting for that operation to
complete. 
 Afaict, if the "blocking" param on an op like /host=x/server[-config]=y:reload
is set to 'true' the op should deadlock. On the HC, ServerReloadHandler will
acquire the exclusive lock by calling context.getServiceRegistry(true). Then
ServerInventoryImpl.reloadServer will block waiting for the server to reach STARTED state.
But the server won't reach that state because its registration request will not be able
to acquire the HC lock.
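
 The suspected cycle can be modeled in a few lines. This is a hypothetical model, not the WildFly Core locking code: a ReentrantLock stands in for the HC exclusive lock, a CountDownLatch stands in for the server reaching STARTED, and the timeouts exist only so the demonstration terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model; NOT the WildFly Core locking code.
final class BlockingReloadDeadlockSketch {

    /** Returns true if the "server" reached STARTED while the "HC" was waiting for it. */
    static boolean blockingReload(long timeoutMillis) {
        ReentrantLock hcExclusiveLock = new ReentrantLock();
        CountDownLatch serverStarted = new CountDownLatch(1);

        // The reloading server's registration request needs the HC lock
        // before the server can finish booting.
        Thread registration = new Thread(() -> {
            try {
                if (hcExclusiveLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    try {
                        serverStarted.countDown(); // server reaches STARTED
                    } finally {
                        hcExclusiveLock.unlock();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        hcExclusiveLock.lock(); // the reload handler takes the exclusive lock
        try {
            registration.start();
            // blocking=true: wait for STARTED while still holding the lock
            return serverStarted.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            hcExclusiveLock.unlock();
            try {
                registration.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        // The wait times out: STARTED is never observed while the lock is held.
        System.out.println("reached STARTED: " + blockingReload(200));
    }
}
```

 Without the timeouts both sides would wait forever, which is the deadlock described above. Letting the registration "join" the operation that already holds the lock is what the operation-id was meant to allow.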
 Task here is to
 1) Confirm the above and then
 2) Either
 a) get the operation-id propagated, or
 b) rip the operation-id bit out of reload if investigation shows it is not
needed.

--
This message was sent by Atlassian JIRA
(v7.2.2#72004)
