[Hawkular-dev] Inventory: transient feeds - or how to tackle the pets vs cattle scenario

Mon Oct 10 09:46:03 EDT 2016

On Tuesday, October 4, 2016 11:20:51 AM CEST Heiko W.Rupp wrote:
> Hey,
> 
> Right now we identify "agents" via their feed-id.
> An instrumented wildfly comes online, registers
> its feed with the server, sends its resource discovery
> and later metrics with the feed id.
> Over its lifecycle, the server may be stopped and re-started
> several times.
> 
> This is great in the classical use case with installations
> on tin or VMs.
> 
> In container-land especially with systems like Kubernetes,
> containers are started once and after they have died for
> whatever reason they are not restarted again.
> So the id of an individual container is less and less interesting.
> The interesting part is the overall app, which consists of many
> containers linked together with several of them representing
> an individual service of the app.
> 
> So basically we would rather need to record the app and other
> metadata for identifying individual parts of the app (e.g. the web
> servers or the databases) and then get pointers to individual
> stuff.
> The feed would not need to survive for too long, but some of
> its collected data perhaps. And then e.g. the discovery of resources
> in a new container of the exact same type as before should be sort
> of a no-op, as we know this already. Could we short-circuit that
> by storing the docker-image-hash (or similar) and once we see this
> known one abort the discovery?
> 

Don't agents support user-defined feed IDs? Can we not pass in the feed ID like 
any other configuration (an env var or similar), so that no matter how and 
where the feed "materializes", it identifies itself the same way? If it does 
and inventory already has a record of it, the discovery can stop quite soon, 
depending on how the tree hashes compare locally and on the inventory server.
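
A minimal sketch of that idea in Python, not the actual agent code. It assumes 
the feed ID arrives in an environment variable (HAWKULAR_FEED_ID is a made-up 
name, as are all the helper functions) and that the agent can compare a hash 
of its locally discovered resource tree with the hash the inventory server 
already holds for that feed:

```python
import hashlib
import os
import uuid


def resolve_feed_id(env=None):
    """Return the configured feed ID, or a random one if none is set.

    HAWKULAR_FEED_ID is a hypothetical variable name used for illustration.
    """
    env = os.environ if env is None else env
    return env.get("HAWKULAR_FEED_ID") or str(uuid.uuid4())


def tree_hash(node):
    """Hash a resource tree given as {"id": str, "children": [...]}.

    Child hashes are sorted so that sibling order does not matter.
    """
    child_hashes = sorted(tree_hash(c) for c in node.get("children", []))
    payload = node["id"] + "|" + ",".join(child_hashes)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_full_sync(local_tree, server_hash):
    """True when the server has no record or its hash differs from ours."""
    return server_hash is None or server_hash != tree_hash(local_tree)
```

With a stable feed ID, a restarted container would only have to send the root 
hash; a full resource sync happens only when needs_full_sync() returns True.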

If you're talking about "local" discovery (the feed "waking up", realizing it 
woke up in the same container as last time, and concluding that the inventory 
is guaranteed to look the same), I'm not sure that is a safe assumption. I can 
imagine the discovery results depending on the contents of attached data 
volumes, the values of the env vars the container was started with, etc.
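
To illustrate why the image hash alone may not be enough, here is a rough 
Python sketch of a fingerprint that also folds in the env vars and the 
contents of files on attached volumes. The function name and its inputs are 
hypothetical; a real agent would have to decide which variables and files 
actually influence discovery:

```python
import hashlib


def discovery_fingerprint(image_hash, env_vars, volume_files):
    """Combine everything that may influence discovery into one digest.

    image_hash   -- identifier of the container image (str)
    env_vars     -- dict of env vars the container was started with
    volume_files -- dict mapping file path to file content (bytes) on
                    attached volumes
    """
    h = hashlib.sha256()
    h.update(image_hash.encode("utf-8"))
    # Sort keys so the digest does not depend on dict ordering.
    for key in sorted(env_vars):
        entry = key + "=" + env_vars[key]
        h.update(b"\0env\0" + entry.encode("utf-8"))
    for path in sorted(volume_files):
        h.update(b"\0file\0" + path.encode("utf-8") + b"\0")
        h.update(hashlib.sha256(volume_files[path]).digest())
    return h.hexdigest()
```

Two containers started from the same image but with different env vars or 
volume contents get different fingerprints, so only an exact match would 
justify skipping discovery.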

> Another aspect is certainly that we want to keep (some) historic
> records of the dead container - e.g. some metrics and the point
> when it died. Suppose k8s kills a container and spins a new one
> up (same image) on a different node, then logically it is a continuation
> of the first one, but in a different place (but they have different feed
> ids)
> 

Again, the feed ID can be part of the pod configuration (passed in as an env 
var); that way the feed ID stays the same.

> Now a more drastic scenario: As orchestration systems like k8s or
> Docker-Swarm have their own registries that can be queried: do we need
> a hawkular-inventory for this at all?

By the same logic, Heapster would replace our agent under k8s. I think we 
always meant Hawkular to go deeper than the orchestration services.

But in a general sense, yes, inventory is easily replaceable by the 
registries if the granularity of their data suffices.

> 
> ( We still need it for the non-OpenShift/K8s/Docker-Swarm envs )

-- 
Lukas Krejci