[Hawkular-dev] Garbage collection of outdated Servers/Metrics - especially with (orchestrated) containers

Thu Mar 9 05:35:36 EST 2017

On Wednesday, March 8, 2017 5:54:46 PM CET Heiko W.Rupp wrote:
> Hi,
> 
> I know I brought that up in the past ... (and I am sure Juca will easily
> find where :-)
> 
> In the good old datacenter, one deployed an application server with
> application, added it to monitoring and it hummed there happily. The
> server sometimes got rebooted for OS updates or other things, but the
> app-server always stayed the same and the monitoring system knew all its
> pets by their name and the admins happily did this:
> http://www.starwars-union.de/bilder/news/20110401_sunset.jpg
> 
> Nowadays, in an orchestration system, the situation looks more like this:
> http://www.animationsfilme.ch/wp-content/uploads/2013/07/DespicableMe_01.jpg
> 
> where containers come and go, and an application in a container, once it
> has died, is not restarted; instead a new container with its own ID is started.
> 
> Of course we can identify applications with labels so that I don't need
> to know the container id. So I can gather and display metrics for those
> and all is fine.
> 
> But: all those containers will create new
> - metric ids
> - inventory entries
> - ???
> 
> The question is now: how long do we want/need to keep them?

Isn't the question rather "how do we associate them with the application(s)"?
Because if we want to track e.g. the CPU load generated by an application in a
container, isn't that something users would want to look at the history of? IMHO,
using the ephemeral "container id" as (part of) the metric ids is the wrong thing
to do, because the user really isn't interested in the container itself, but in
the applications that are running in it and their consumption of the container's
resources.
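
To illustrate what I mean, here is a minimal sketch in Python. It is not
Hawkular's API: the function name, the label keys ("namespace", "app",
"container_id") and the id layout are all assumptions made up for the example.
The metric id is built only from the stable application labels, and the
container id is kept as a descriptive tag instead.

```python
def metric_identity(labels, metric_name):
    """Return (metric_id, tags) for a metric reported by a container.

    Hypothetical scheme: only stable, application-level labels go into the id.
    """
    # The id is derived from labels that survive a container restart.
    metric_id = "/".join([labels["namespace"], labels["app"], metric_name])
    # The ephemeral container id is kept as a tag, not as part of the id.
    tags = {"container_id": labels["container_id"]}
    return metric_id, tags


# Two incarnations of the same application, before and after a restart.
first = {"namespace": "shop", "app": "cart", "container_id": "3f2a9c"}
restarted = {"namespace": "shop", "app": "cart", "container_id": "b71e04"}

first_id, first_tags = metric_identity(first, "cpu_load")
restarted_id, restarted_tags = metric_identity(restarted, "cpu_load")
```

Both incarnations end up with the same metric id, so the history stays in one
place. Replicas running in parallel would still need something stable to tell
them apart, which this sketch does not attempt to solve.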

> Hawkular-metrics has a TTL for the datapoints, but I think metric
> definitions are not evicted.
> Similarly, a container being killed can't easily tell inventory that its
> entry can go away.
> 
> For inventory-new we could use the expiration feature of
> Hawkular-metrics for datapoints, where e.g. the agent would regularly
> sync data and thus refresh the last-seen time to keep an entry "alive".
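
A rough sketch of that "last seen" idea, in Python. This is only an
illustration under assumptions: the class, its method names and the in-memory
dict are hypothetical and stand in for whatever inventory would actually store.
Timestamps are passed in explicitly to keep the example deterministic.

```python
class ExpiringInventory:
    """Keeps entries alive only while they are refreshed within a TTL."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.last_seen = {}  # entry id -> timestamp of the last sync

    def sync(self, entry_id, now):
        # Called by the agent on every regular sync; refreshes last-seen.
        self.last_seen[entry_id] = now

    def evict_expired(self, now):
        # Remove and return every entry not refreshed within the TTL.
        expired = [entry for entry, seen in self.last_seen.items()
                   if now - seen > self.ttl_seconds]
        for entry in expired:
            del self.last_seen[entry]
        return expired


inventory = ExpiringInventory(ttl_seconds=300)
inventory.sync("container-a", now=0)
inventory.sync("container-b", now=0)
inventory.sync("container-a", now=200)   # only container-a keeps syncing
evicted = inventory.evict_expired(now=400)
```

With this, a killed container does not have to say goodbye; its entry simply
stops being refreshed and is evicted once the TTL has passed.
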
> 
> Also, for the pure metrics: how much historic data do we want/need?
> And if we would e.g. aggregate those for long(er) term storage, I think
> we could perhaps aggregate all the individual time series from many
> parallel pods into one for the entire application.
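
One possible shape of such an aggregation, sketched in Python. The function
name, the input layout (pod name mapped to a list of (timestamp, value) pairs)
and the choice of "average per pod per bucket, then sum across pods" are
assumptions for the example; a different rollup such as max or average across
pods may fit other metrics better.

```python
from collections import defaultdict


def aggregate_pods(series_by_pod, bucket_seconds):
    """Merge per-pod time series into one application-level series."""
    totals = defaultdict(float)
    for points in series_by_pod.values():
        # Group this pod's samples by the start of their time bucket.
        per_bucket = defaultdict(list)
        for timestamp, value in points:
            per_bucket[timestamp - timestamp % bucket_seconds].append(value)
        # Average within the pod, then add the pod's share to the total.
        for bucket, values in per_bucket.items():
            totals[bucket] += sum(values) / len(values)
    return dict(sorted(totals.items()))


series = {
    "pod-a": [(0, 1.0), (30, 3.0), (60, 2.0)],
    "pod-b": [(10, 4.0)],
}
application_series = aggregate_pods(series, bucket_seconds=60)
```

After such a rollup only the application-level series needs long-term storage,
and the per-pod series can expire with their normal TTL.
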
> 
>    Heiko
> 

-- 
Lukas Krejci