[Hawkular-dev] scope of the agent design

Fri Mar 13 03:57:15 EDT 2015

On 03/13/2015 03:56 AM, John Sanda wrote:
> I think we need to consider having the following types of agents
>
> * embedded, in-process
> * co-located in separate process
> * agent-less where we do management/monitoring from the server side
+1

But that opens a can of worms and a mix of options...

For uptime/downtime, for instance, the embedded in-process agent can only 
say "I'm up"; if it stops reporting, the server is either down or in an 
unknown state (network issue?). A separate process can tell whether the 
server is really down, so it is the better choice in that case (but it may 
not be installed, so do we fall back on the embedded process info?).
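
To make the uptime/downtime trade-off concrete, here is a minimal sketch of 
how a server could combine the two signals. All names (Availability, 
resolve_availability, the 30 second timeout) are hypothetical and only for 
illustration, not an existing Hawkular API.

```python
from enum import Enum
from typing import Optional


class Availability(Enum):
    UP = "up"
    DOWN = "down"
    UNKNOWN = "unknown"


def resolve_availability(
    embedded_heartbeat_age_s: Optional[float],
    external_probe_up: Optional[bool],
    heartbeat_timeout_s: float = 30.0,  # assumed value, tune per deployment
) -> Availability:
    """Combine an embedded heartbeat with an optional external probe.

    embedded_heartbeat_age_s: seconds since the in-process agent last said
        "I'm up", or None if it never reported.
    external_probe_up: verdict of a co-located separate-process agent, or
        None if no such agent is installed.
    """
    # A separate process observes the server from outside, so it can
    # positively report DOWN. Its verdict wins whenever it is available.
    if external_probe_up is not None:
        return Availability.UP if external_probe_up else Availability.DOWN

    # Fallback: only the embedded agent exists. A fresh heartbeat proves UP.
    if (
        embedded_heartbeat_age_s is not None
        and embedded_heartbeat_age_s <= heartbeat_timeout_s
    ):
        return Availability.UP

    # Silence is ambiguous: the server may be down or the network may be
    # broken, so the honest answer is UNKNOWN rather than DOWN.
    return Availability.UNKNOWN
```

With only the embedded agent, a stale heartbeat yields UNKNOWN; DOWN is 
reported only when a separate process confirms it.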

Agent-less works *if* the network is open enough to allow it...

Also, for the embedded one, we may be bound to product releases, unless we 
instrument the server ourselves and update as we wish.

Thomas

>
> There are different scenarios in which each of the above has advantages. For collecting metrics only, an embedded collector would be best, as it involves the least overhead. Based on our experience managing Cassandra in RHQ, I think there are many situations where agent-less makes things much easier. Consider running repair, which is a cluster-wide operation. In RHQ, we go through a complex workflow: server to agent to Cassandra node to agent to server to next agent to next Cassandra node to agent to server, and so on. Managing the repair directly from server to Cassandra would make things a lot easier.
>
> With respect to monitoring, the agent, particularly one that runs co-located in a separate process, should be capable of being more than just a collector. I am seeing that more and more in other systems I am looking at. With an OpenTSDB extension, for example, the agent/client-side piece can convert data into a blob format that is optimized for storage. This blob conversion can be done on the server side, but it causes a performance hit there. I believe that the DataDog agent is capable of performing client-side aggregations, like computing percentiles. This could be efficient and useful for computing method execution times, response times, etc.
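
As an illustration of the client-side aggregation idea, a minimal sketch 
follows. It uses the nearest-rank percentile method; the function names 
(percentile, summarize) and the chosen percentiles are hypothetical and do 
not describe how the DataDog agent actually works.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile of samples, for 0 < p <= 100."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least p% of the samples
    # at or below it.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Reduce one collection interval of raw timings to a small summary.

    The agent would send this summary instead of every raw sample.
    """
    return {
        "count": len(samples),
        "p50": percentile(samples, 50),
        "p95": percentile(samples, 95),
        "p99": percentile(samples, 99),
    }
```

The agent would call summarize() once per collection interval on, say, the 
response times it recorded, so only a handful of numbers cross the wire.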
>
> With respect to alerting, I think it is something we should consider for an agent that runs in its own process. The agent could push alerts onto the bus, for instance. And if we look more closely at ISPN, we could do the same with distributed caches. The agent fires an alert by putting an entry into its alert cache, to which the server is subscribed for notifications. This could also be used for the agent to obtain alert definitions. Something to consider...
>> On Mar 12, 2015, at 3:57 PM, Heiko W.Rupp <hrupp at redhat.com> wrote:
>>
>> On 12 Mar 2015, at 19:34, John Mazzitelli wrote:
>>
>>> Can someone give me a quick summary of what Wildfly-Monitor does?
>>> https://github.com/hawkular/wildfly-monitor
>> http://pilhuhn.blogspot.de/2014/10/wildfly-subsystem-for-rhq-metrics_10.html
>>
>> The current one is an improved version where Heiko Braun and Harald Pehl
>> have contributed some local scheduling and other stuff.
>> I think this needs some more love to get it into a state where it can be
>> used with Hawkular.
>> _______________________________________________
>> hawkular-dev mailing list
>> hawkular-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hawkular-dev
>