[JBoss JIRA] (WFCORE-489) The "access-mechanism" field in the active-operation resources is always undefined
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-489?page=com.atlassian.jira.plugin... ]
Brian Stansberry updated WFCORE-489:
------------------------------------
Priority: Minor (was: Major)
> The "access-mechanism" field in the active-operation resources is always undefined
> ----------------------------------------------------------------------------------
>
> Key: WFCORE-489
> URL: https://issues.jboss.org/browse/WFCORE-489
> Project: WildFly Core
> Issue Type: Bug
> Components: Domain Management
> Affects Versions: 1.0.0.Alpha15
> Reporter: Brian Stansberry
> Assignee: Brian Stansberry
> Priority: Minor
> Fix For: 1.0.0.Alpha16
>
>
> The "access-mechanism" field in core-service=management/service=active-operations/active-operation=* is always undefined. The code that sets it is using the wrong var:
> {code}
> ModelNode accessMechanismNode = model.get(ACCESS_MECHANISM);
> if (accessMechanism != null) {
> accessMechanismNode.set(accessMechanismNode.toString());
> }
> {code}
> It should be accessMechanism.toString()
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
10 years, 2 months
[JBoss JIRA] (WFCORE-489) The "access-mechanism" field in the active-operation resources is always undefined
by Brian Stansberry (JIRA)
Brian Stansberry created WFCORE-489:
---------------------------------------
Summary: The "access-mechanism" field in the active-operation resources is always undefined
Key: WFCORE-489
URL: https://issues.jboss.org/browse/WFCORE-489
Project: WildFly Core
Issue Type: Bug
Components: Domain Management
Affects Versions: 1.0.0.Alpha15
Reporter: Brian Stansberry
Assignee: Brian Stansberry
Fix For: 1.0.0.Alpha16
The "access-mechanism" field in core-service=management/service=active-operations/active-operation=* is always undefined. The code that sets it is using the wrong var:
{code}
ModelNode accessMechanismNode = model.get(ACCESS_MECHANISM);
if (accessMechanism != null) {
accessMechanismNode.set(accessMechanismNode.toString());
}
{code}
It should be accessMechanism.toString()
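
A minimal sketch of the corrected logic follows. So that it compiles without jboss-dmr on the classpath, it uses a hypothetical stand-in `Node` class in place of `ModelNode` and a plain `Map` in place of the model; the `AccessMechanism` enum values are also assumptions, since the real type is not shown in this report.

```java
import java.util.Map;

final class AccessMechanismFix {
    /** Hypothetical stand-in for ModelNode: holds a string or is undefined. */
    static final class Node {
        private String value; // null means "undefined"
        void set(String v) { value = v; }
        boolean isDefined() { return value != null; }
        @Override public String toString() { return value == null ? "undefined" : value; }
    }

    /** Assumed enum; the real type is not shown in the report. */
    enum AccessMechanism { NATIVE, HTTP, JMX }

    static final String ACCESS_MECHANISM = "access-mechanism";

    static Node populate(Map<String, Node> model, AccessMechanism accessMechanism) {
        Node accessMechanismNode = model.computeIfAbsent(ACCESS_MECHANISM, k -> new Node());
        if (accessMechanism != null) {
            // Use the supplied value, not the (still undefined) node itself.
            accessMechanismNode.set(accessMechanism.toString());
        }
        return accessMechanismNode;
    }
}
```

With the original code the node would be set from its own string form, so the real access mechanism would never reach the resource.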
[JBoss JIRA] (WFCORE-488) Failures of undeploy of a partially failed deployment
by Arcadiy Ivanov (JIRA)
[ https://issues.jboss.org/browse/WFCORE-488?page=com.atlassian.jira.plugin... ]
Arcadiy Ivanov commented on WFCORE-488:
---------------------------------------
>> we are not actively working on further 8.x releases
If need be I'll backport and create a patch myself, not a problem.
That said, I find it peculiar that the latest stable release of the container is abandoned as soon as it is released. I understand this is a community product and RH officially provides no support, but it is still targeted at an enterprise audience. Even though 9.0 will be the next latest and greatest, until it ships we still need to use 8.x as our current platform, and bugs like this have to be fixed somehow.
Maybe a community patching/Service Pack effort should be initiated? I have no problem assembling my own WildFly with controlled patches applied, but that may be too cumbersome/advanced for some users.
>> I expect BlockingQueueOperationListener.operationFailed to be invoked on the DC.
{{ServerUpdatePolicy.recordServerResult(ServerIdentity, ModelNode) line: 140}} receives the failure information, but the failure information doesn't seem to propagate or attach to the originating operation/step.
> Failures of undeploy of a partially failed deployment
> -----------------------------------------------------
>
> Key: WFCORE-488
> URL: https://issues.jboss.org/browse/WFCORE-488
> Project: WildFly Core
> Issue Type: Bug
> Components: CLI, Domain Management
> Reporter: Arcadiy Ivanov
> Assignee: Alexey Loubyansky
>
> {noformat}
> [domain@localhost:9990 /] undeploy 2e3b75f2-88ff-4eb9-a8b1-9107c452309e.ear --all-relevant-server-groups
> Undeploy failed: JBAS010839: Operation failed or was rolled back on all servers.
> {noformat}
> There are several issues at play here:
> # Insufficient information is displayed along with JBAS010839
> # No actual failure logged either in DC or in individual host's logs
> # Failure to find a deployment unit on a particular host should not result in a failure of a domain-wide undeploy in DeploymentUndeployHandler
> I'll start in a reverse order of causation:
> In DeploymentUndeployHandler.execute
> {noformat}
> public void execute(OperationContext context, ModelNode operation) throws OperationFailedException {
> ModelNode model = context.readResourceForUpdate(PathAddress.EMPTY_ADDRESS).getModel();
> ...
> {noformat}
> OperationContextImpl.readResourceForUpdate calls OperationContextImpl.requireChild:
> {noformat}
> private static Resource requireChild(final Resource resource, final PathElement childPath, final PathAddress fullAddress) {
> if (resource.hasChild(childPath)) {
> return resource.requireChild(childPath);
> } else {
> PathAddress missing = PathAddress.EMPTY_ADDRESS;
> for (PathElement search : fullAddress) {
> missing = missing.append(search);
> if (search.equals(childPath)) {
> break;
> }
> }
> throw ControllerMessages.MESSAGES.managementResourceNotFound(missing);
> }
> }
> {noformat}
> The exception generated by {{throw ControllerMessages.MESSAGES.managementResourceNotFound(missing);}}:
> # does not propagate from the host to DC and to CLI
> # is not logged with any level (not even TRACE) either on host or DC
> # should be handled within DeploymentUndeployHandler not to fail the undeploy operation in case of a partially failed deployment (e.g. EAR with unsatisfied CDI dependencies).
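
The third point of the report can be sketched as follows. This is only an illustration under stated assumptions, not the actual handler: `TolerantUndeploy`, its `Set`-based resource store, and the use of `NoSuchElementException` are hypothetical stand-ins for `DeploymentUndeployHandler`, the management resource tree, and the "resource not found" exception. It shows the missing-resource case being logged and treated as "nothing to do" instead of failing the operation.

```java
import java.util.HashSet;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.logging.Level;
import java.util.logging.Logger;

final class TolerantUndeploy {
    private static final Logger LOG = Logger.getLogger(TolerantUndeploy.class.getName());

    /** Hypothetical stand-in for the deployment resources known to one host. */
    private final Set<String> deployments = new HashSet<>();

    void add(String name) {
        deployments.add(name);
    }

    /** Mirrors the requireChild behaviour: throws when the resource is missing. */
    private void requireDeployment(String name) {
        if (!deployments.contains(name)) {
            throw new NoSuchElementException("Management resource '" + name + "' not found");
        }
    }

    /** Returns true if a deployment was removed, false if there was nothing to undeploy. */
    boolean undeploy(String name) {
        try {
            requireDeployment(name);
        } catch (NoSuchElementException e) {
            // Log the condition so it is visible, but do not fail the wider operation.
            LOG.log(Level.WARNING, "Nothing to undeploy on this host: " + e.getMessage());
            return false;
        }
        deployments.remove(name);
        return true;
    }
}
```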
[JBoss JIRA] (JBMETA-220) Support JAXB style programming for RAR module
by Jeff Zhang (JIRA)
[ https://issues.jboss.org/browse/JBMETA-220?page=com.atlassian.jira.plugin... ]
Jeff Zhang closed JBMETA-220.
-----------------------------
Resolution: Out of Date
> Support JAXB style programming for RAR module
> ---------------------------------------------
>
> Key: JBMETA-220
> URL: https://issues.jboss.org/browse/JBMETA-220
> Project: JBoss Metadata
> Issue Type: Task
> Reporter: Jesper Pedersen
> Assignee: Jeff Zhang
>
> Currently we have standard get- and set- methods, like
> @XmlElement(name="outbound-resourceadapter")
> public void setOutboundRa(OutboundRaMetaData outboundRa) {
> this.outboundRa = outboundRa;
> }
> public OutboundRaMetaData getOutboundRa() {
> return outboundRa;
> }
> We should support a JAXB style of programming in the RAR module, so that the get-method creates any object or list that doesn't exist yet, like
> @XmlElement(name="outbound-resourceadapter")
> public void setOutboundRa(OutboundRaMetaData outboundRa) {
> this.outboundRa = outboundRa;
> }
> public OutboundRaMetaData getOutboundRa() {
> if (outboundRa == null)
> outboundRa = new OutboundRaMetaData();
> return outboundRa;
> }
> This will allow an easier merging between the annotation model and the XML model.
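
The same pattern applied to a list-valued property might look like the sketch below. The class name `ResourceAdapterSketch` and the `connectionDefinitions` property are hypothetical, and the JAXB annotations are omitted so that the example compiles on a plain JDK.

```java
import java.util.ArrayList;
import java.util.List;

final class ResourceAdapterSketch {
    // Hypothetical list-valued property; it stays null until first accessed.
    private List<String> connectionDefinitions;

    public List<String> getConnectionDefinitions() {
        if (connectionDefinitions == null) {
            // Create the list on demand so callers can add to it directly.
            connectionDefinitions = new ArrayList<>();
        }
        return connectionDefinitions;
    }
}
```

A caller can then write `ra.getConnectionDefinitions().add(...)` without a null check, which is what makes merging two models simpler.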
[JBoss JIRA] (WFCORE-263) Cancelling management op on slave HC tree is broken
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-263?page=com.atlassian.jira.plugin... ]
Brian Stansberry commented on WFCORE-263:
-----------------------------------------
The 1273140711 is identified as the problematic one because it is the op that holds the exclusive write lock on the HC process, and it has been holding it for a long time. The HC is simply acting as a proxy for the 24068422 op, so there is no need for it to acquire the exclusive write lock.
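
The selection rule described above can be sketched roughly as follows. This is an illustration, not the actual implementation: the class and method names are hypothetical, the times are assumed to be nanoseconds, a value of -1 is assumed to mean "does not hold the exclusive lock", and the timeout is a caller-supplied assumption.

```java
import java.util.Map;
import java.util.Optional;

final class NonProgressingSketch {
    /**
     * Returns the id of the operation that has held the exclusive lock the
     * longest, provided it has held it for more than timeoutNanos.
     */
    static Optional<String> find(Map<String, Long> exclusiveRunningTime, long timeoutNanos) {
        String worst = null;
        long worstTime = timeoutNanos;
        for (Map.Entry<String, Long> e : exclusiveRunningTime.entrySet()) {
            long t = e.getValue(); // -1 is assumed to mean "no exclusive lock held"
            if (t > worstTime) {
                worst = e.getKey();
                worstTime = t;
            }
        }
        return Optional.ofNullable(worst);
    }
}
```

Under this rule an operation that is merely being proxied, and so never takes the exclusive lock, can never be selected, however long it has been running.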
> Cancelling management op on slave HC tree is broken
> ---------------------------------------------------
>
> Key: WFCORE-263
> URL: https://issues.jboss.org/browse/WFCORE-263
> Project: WildFly Core
> Issue Type: Bug
> Components: Domain Management
> Affects Versions: 1.0.0.Alpha9
> Reporter: James Livingston
> Assignee: Brian Stansberry
> Attachments: unundeployable.zip
>
>
> If you have a DC with a slave HC, and perform a management operation which gets stuck, non-progressing operations will be reported for both the DC and the slave HC via:
> /host=master/core-service=management/service=management-operations:find-non-progressing-operation
> /host=slave/core-service=management/service=management-operations:find-non-progressing-operation
> Cancelling the operation under /host=master works as expected, pushing the cancellation down to the slave and the controllers become responsive again.
> If, however, you attempt to cancel the operation under /host=slave, the cancellation does not take effect: { "outcome" => "success", "result" => undefined } is reported in the CLI, but the controllers are still unresponsive.
> Running :find-non-progressing-operation against the slave will then report {outcome=success,result=undefined} rather than stating that no non-progressing operations were found, and active-operation=*:read-resource() shows the operation as not cancelled.
> Once you attempt to cancel it on a slave, attempting to cancel it under /host=master will report success, but leave the slave op in a weird state, and things requiring the controller lock (such as the web UI) will still not respond.
[JBoss JIRA] (WFCORE-263) Cancelling management op on slave HC tree is broken
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-263?page=com.atlassian.jira.plugin... ]
Brian Stansberry commented on WFCORE-263:
-----------------------------------------
Looking at the list of active operations on the slave HC confirms my expectation as to what was going on here:
{code}
[domain@localhost:9990 /] /host=slave/core-service=management/service=management-operations:read-resource(recursive=true,include-runtime=true)
{
"outcome" => "success",
"result" => {"active-operation" => {
"24068422" => {
"access-mechanism" => undefined,
"address" => [
("host" => "slave"),
("server" => "server-one")
],
"caller-thread" => "Host Controller Service Threads - 11",
"cancelled" => false,
"exclusive-running-time" => -1L,
"execution-status" => "executing",
"operation" => "composite",
"running-time" => 78500204000L
},
"1273140711" => {
"access-mechanism" => "undefined",
"address" => [],
"caller-thread" => "Host Controller Service Threads - 10",
"cancelled" => false,
"exclusive-running-time" => 78542647000L,
"execution-status" => "completing",
"operation" => "composite",
"running-time" => 78544658000L
},
"542652524" => {
"access-mechanism" => "undefined",
"address" => [
("host" => "slave"),
("core-service" => "management"),
("service" => "management-operations")
],
"caller-thread" => "Host Controller Service Threads - 17",
"cancelled" => false,
"exclusive-running-time" => -1L,
"execution-status" => "executing",
"operation" => "read-resource",
"running-time" => 4639000L
}
}}
}
{code}
The third op is just the CLI read-resource request itself, so ignore that.
The others are normal for a domain op that is being rolled out to servers and hasn't yet completed.
The 2nd one (1273140711) was actually invoked first. It was a request from the DC to the slave telling it to update its own model. It has "execution-status" => "completing" because it has prepared the update to the model and is waiting for an instruction from the DC telling it to commit the transaction. This request is actually fine.
The 1st one (24068422) is the problematic one. The DC has gotten the prepared notification from the 1273140711 op and has proceeded to roll out the change to the servers. It sends a request to the slave, which the slave then proxies on to the servers. The slave is just acting as a proxy. This is the request that is actually stuck, as the server is not responding.
The problem is find-non-progressing-operation and cancel-non-progressing-operation are identifying the 1273140711 op as the problematic one.
{code}
[domain@localhost:9990 /] /host=slave/core-service=management/service=management-operations:find-non-progressing-operation
{
"outcome" => "success",
"result" => "1273140711"
}
{code}
Canceling that one doesn't unstick anything, as the DC is not yet blocking waiting for a response to its commit/rollback.
I'll need to give some thought as to how to get find-non-progressing-operation and cancel-non-progressing-operation to identify the 24068422 op as the problematic one, or at least to decide that they don't know which of the two is the problem, forcing the user to investigate further.
[JBoss JIRA] (WFCORE-488) Failures of undeploy of a partially failed deployment
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-488?page=com.atlassian.jira.plugin... ]
Brian Stansberry commented on WFCORE-488:
-----------------------------------------
I'm sorry to say we are not actively working on further 8.x releases, so it's quite unlikely this will be fixed there.
There shouldn't be anything in DeploymentRemoveHandler$1$1.handleResult(OperationContext$ResultAction, OperationContext, ModelNode) line: 80 on the DC, as that step is the local DC execution of the remove, which doesn't have a failure.
I expect BlockingQueueOperationListener.operationFailed to be invoked on the DC.
[JBoss JIRA] (WFCORE-464) ProcessController's BufferedReader.readLine() usage allows unbounded memory usage
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-464?page=com.atlassian.jira.plugin... ]
Brian Stansberry commented on WFCORE-464:
-----------------------------------------
https://github.com/bstansberry/wildfly-core/compare/limit-readLine is something I did on this quite a while ago, but I have no time to develop tests of this, and this would need considerable testing to make the change amount to a net reduction in risk.
> ProcessController's BufferedReader.readLine() usage allows unbounded memory usage
> ---------------------------------------------------------------------------------
>
> Key: WFCORE-464
> URL: https://issues.jboss.org/browse/WFCORE-464
> Project: WildFly Core
> Issue Type: Bug
> Components: Domain Management
> Affects Versions: 1.0.0.Alpha14
> Reporter: James Livingston
> Assignee: Brian Stansberry
>
> org.jboss.as.process.ManagedProcess$ReadTask.run() uses readLine() to read a line of output from the managed process's standard output/error streams, which causes the whole line to be loaded into memory.
> Badly written applications may dump excessive amounts of data out in a single line, which would cause the process controller to temporarily use a large amount of memory to process it, potentially leading to an OutOfMemoryError. Practically speaking, with the default -Xmx512m it would require around 128 million characters in a single line to trigger, which is obviously very high.
> Were an OOME to occur, it would almost certainly cause the stream to be closed, and "IOException: Broken pipe" exceptions to occur in the child process, which for WildFly would be caught and ignored by JBoss Logging. It would be almost impossible for a hostile managed process to exploit this.
> A reasonable solution would probably be to limit the size of the buffer read, causing it to split lines over a certain size (a few megabytes?). That would not likely cause any practical problems.
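
The proposed approach could be sketched as below. This is a minimal illustration under stated assumptions, not the change in the linked branch: `BoundedLineReader` and `readLines` are hypothetical names, only `\n` is treated as a line terminator, and the result is collected into a list for simplicity, where real code would hand each chunk on as soon as it is read.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class BoundedLineReader {
    /**
     * Reads '\n'-terminated lines, splitting any line longer than maxChars
     * into chunks of at most maxChars characters.
     */
    static List<String> readLines(Reader in, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean justSplit = false; // true right after a size-forced split
        try {
            int c;
            // Wrap the source in a BufferedReader in real use to avoid per-char I/O.
            while ((c = in.read()) != -1) {
                if (c == '\n') {
                    // Skip the empty remainder of a line that ended exactly at the limit.
                    if (!justSplit) {
                        lines.add(current.toString());
                    }
                    current.setLength(0);
                    justSplit = false;
                } else {
                    current.append((char) c);
                    justSplit = false;
                    if (current.length() >= maxChars) {
                        lines.add(current.toString());
                        current.setLength(0);
                        justSplit = true;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (current.length() > 0) {
            lines.add(current.toString()); // trailing text without a terminator
        }
        return lines;
    }
}
```

The working buffer never holds more than `maxChars` characters, so a single oversized line can no longer drive memory use on its own.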