[wildfly-dev] embedded RHQ agent subsystem into WildFly

Fri Feb 7 14:07:35 EST 2014

On Feb 7, 2014, at 12:02 PM, Brian Stansberry <brian.stansberry at redhat.com> wrote:

> Thanks, John.
> 
> I think we need to thoroughly work through the requirements and all the 
> expected behavior, and then start worrying about the implementation issues.

Agreed. 

-snip-

> 
>> The thought is to embed a RHQ Agent in a running WildFly instance, specifically a Host Controller. In the standalone RHQ use-case, a single RHQ Agent process runs on a box and manages all the things on that box. Since there is one Host Controller on a box, it seems to make sense to embed a RHQ Agent there. This is something we can discuss (do we embed in Process Controller instead? The initial thought was no, probably not a good idea).
>> 
> 
> Right. The PC is only meant to control the lifecycle of the other 
> processes, with as little complexity as possible. Reliability is 
> critical so we don't want complexity.
> 
> We separated the PC from the HC to reduce the risk that a problem in the 
> complex HC would cause it to fail and leave nothing consuming the 
> stdout/err streams of the server processes. We didn't want "agent" 
> failures affecting end-user request handling.

There is another reason as well: forking processes from arbitrary Java threads has various reliability issues.

> 
> The java.lang.ProcessBuilder.Redirect stuff in JDK 7 *may* eliminate the 
> need for a separate PC in the future. But if we want this in EAP 6.x we 
> can't assume JDK 7.
> 
> A general notion way back at the start of AS7 was the PC might also make 
> doing patching easier. As we tackle domain-controller-coordinated 
> patching this year we'll see if that proves true. I have some doubts now.
> 
> 
>> This would allow a person to "flip a switch" to enable RHQ management of their WildFly infrastructure running on that box without having to separately install a RHQ agent on their own.
>> 

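On the ProcessBuilder.Redirect point quoted above, here is a rough sketch of what the JDK 7 API allows: the child's stdout/stderr go straight to a file, so nothing in the parent has to keep consuming the streams. The class name, the log file, and the "java -version" child command are made up for illustration; this is not how the PC or HC actually launches servers.

```java
import java.io.File;
import java.io.IOException;
import java.lang.ProcessBuilder.Redirect;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not WildFly code.
class RedirectSketch {

    /**
     * Starts a child process whose stdout and stderr are appended to logFile
     * by the OS, so no thread in the parent has to pump the streams.
     */
    static Process launch(List<String> command, File logFile) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);                   // merge stderr into stdout
        pb.redirectOutput(Redirect.appendTo(logFile));  // JDK 7+: file-backed stdout
        return pb.start();
    }

    public static void main(String[] args) throws Exception {
        // Assumed child command: the current JVM printing its version.
        String java = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        File log = File.createTempFile("child", ".log");
        int exit = launch(Arrays.asList(java, "-version"), log).waitFor();
        System.out.println("exit=" + exit + ", bytes logged=" + log.length());
    }
}
```

Because the redirect targets a file rather than a pipe, the child can keep writing even if the launching process stalls, which is the property the separate PC exists to guarantee today.
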
-snip-

> 
> There is presently no extensibility mechanism for the HC process. We are 
> going to have to create one for other reasons anyway, so we should just 
> assume that the HC will be extensible.
> 
> I'm sure it will work similarly to server process extensions, with a 
> very very high % of code reuse for stuff that can be both a server 
> extension and an HC extension.

Another issue is that the HC has requirements about having low memory overhead, not being chatty upstream, having good reliability, and so on. These requirements were intended to avoid one of the biggest complaints about our competitor’s products, which is that the agent is an out-of-control beast and a scalability limit. Now I’m not saying that RHQ doesn’t meet these requirements as well; it’s just something we have to verify, as IMO they outweigh the perceived social benefits that come from bolting two processes together.

Another option I think we should look at is what functionality the RHQ agent provides beyond the HC when RHQ is managing WF-derived product nodes. If RHQ has to talk to the WF HC anyway, and it just needs a few extra things, we could just add those in. For example, I know the RHQ agent collects OS data. That would be a very small enhancement to add, and you start to gain real benefits of merging the two processes because the overall footprint matches that of one process (vs. two processes that are just sharing the same PID).
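
To give a feel for how small the OS data piece could be, here is a minimal sketch using only the standard java.lang.management API. The class and method names are invented for illustration, the set of attributes is an assumption about what "OS data" would cover, and this is not how the RHQ agent actually gathers its metrics.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not HC or RHQ code.
class OsDataSketch {

    /** Collects a few basic OS attributes exposed by the platform MXBean. */
    static Map<String, Object> collect() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        Map<String, Object> data = new LinkedHashMap<String, Object>();
        data.put("name", os.getName());
        data.put("arch", os.getArch());
        data.put("version", os.getVersion());
        data.put("availableProcessors", os.getAvailableProcessors());
        // Negative when the platform cannot report a load average.
        data.put("systemLoadAverage", os.getSystemLoadAverage());
        return data;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Object> e : collect().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Anything beyond these basics (per-process or filesystem metrics, for instance) would likely need more than the standard MXBean offers, so the real cost depends on how much OS data RHQ expects.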

--
Jason T. Greene
WildFly Lead / JBoss EAP Platform Architect
JBoss, a division of Red Hat