On 9/15/2015 11:01 AM, Heiko W. Rupp wrote:
Hey,
currently we can add (WF) servers to inventory easily, but we can't get
rid of them anymore. Similarly it is not really clear what should happen
if a server changes its name.
In WildFly 10 supposedly a new uuid kind of thing should appear, that
would identify each WF uniquely and which could serve as the "primary
key", so that the name is just a display attribute that can be changed
at will, so that the renaming piece is sort of solved.
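A minimal sketch of that idea, assuming the server reports some stable
uuid string. The Server and Inventory classes and the register() method
are hypothetical names for illustration, not an existing Hawkular or
WildFly API:

```python
from dataclasses import dataclass


@dataclass
class Server:
    uuid: str  # stable identity (assumed to be supplied by the server)
    name: str  # display attribute only, may change at will


class Inventory:
    """Hypothetical in-memory inventory keyed by uuid, not by name."""

    def __init__(self):
        self._by_uuid = {}

    def register(self, uuid, name):
        """Add a server, or update its display name if the uuid is known."""
        server = self._by_uuid.get(uuid)
        if server is None:
            server = Server(uuid, name)
            self._by_uuid[uuid] = server
        else:
            server.name = name  # a rename does not create a new entry
        return server

    def get(self, uuid):
        return self._by_uuid[uuid]

    def __len__(self):
        return len(self._by_uuid)
```

Registering the same uuid twice with different names leaves a single
entry whose name is the most recent one.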
Now to the semantics of "uninventory" (not deletion of physical
servers).
In RHQ we had uninventory that removed the server from inventory.
After a discovery scan happened, it showed up again and could be
re-inventoried.
During uninventory, all collected metrics etc. and alert definitions for
the resource were wiped (or prepared for wiping).
This wiping of recorded metrics has been mentioned more than once as
undesirable, e.g. when the uninventory happened by accident.
Also, in RHQ there were issues when e.g. the machine that a WF server
was running on got renamed.
Before answering the questions below: I'm not so sure uninventory
should be a thing. In RHQ it meant a wipe of server-side data. In
Hawkular I wonder whether it just means turning off the feed of that
data and letting it disappear from the recent time windows. Filtering,
if necessary, should maybe be handled in the presentation layer. Having
a feed be able to report the set of resources it *can* report on, and
then supplying it a list of the resources it *should* report on, seems
like a general mechanism we could somehow build out. Or, it could be
less granular, just down to type level. The initial set of types should
be small.
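As a rough sketch of the can-report/should-report idea, assuming the
feed describes what it can report as a mapping of resource id to type.
The function name, the parameters, and the example ids and type names
are all made up for illustration:

```python
def select_resources(can_report, should_report=None, allowed_types=None):
    """Pick the resources a feed should actually report on.

    can_report:    dict of resource id -> resource type (what the feed offers)
    should_report: optional set of resource ids the server asked for
    allowed_types: optional set of types, for the less granular variant
    """
    selected = {}
    for resource_id, resource_type in can_report.items():
        if should_report is not None and resource_id not in should_report:
            continue  # not requested at the resource level
        if allowed_types is not None and resource_type not in allowed_types:
            continue  # type is switched off
        selected[resource_id] = resource_type
    return selected
```

With should_report left out, only the type filter applies, which
corresponds to the "just down to type level" variant.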
Some questions that come to mind:
- Should we wipe data when a server gets uninventoried?
No. I think data should just live on until some TTL perhaps kicks in.
I do sometimes wonder how important really old data is. Maybe useful
for trend analysis but not so much for problem resolution, which is
probably more reliant on say the last 12-48 hours.
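To illustrate, here is a small sketch that treats old data as
irrelevant at query time and only purges it once a TTL has passed. It
assumes samples are (timestamp, value) tuples; both function names and
the 30-day TTL used in the example are assumptions, not existing
Hawkular settings:

```python
from datetime import datetime, timedelta


def in_window(samples, now, window):
    """Return the samples that fall inside the query window ending at now."""
    start = now - window
    return [(ts, value) for ts, value in samples if start <= ts <= now]


def purge_expired(samples, now, ttl):
    """Drop samples older than the TTL; everything newer is kept."""
    cutoff = now - ttl
    return [(ts, value) for ts, value in samples if ts >= cutoff]


now = datetime(2015, 9, 15, 12, 0)
samples = [
    (now - timedelta(days=40), 1.0),   # older than a 30-day TTL
    (now - timedelta(hours=30), 2.0),  # inside 48h, outside 12h
    (now - timedelta(hours=1), 3.0),   # inside both windows
]
recent = in_window(samples, now, timedelta(hours=12))
kept = purge_expired(samples, now, timedelta(days=30))
```

A resource whose feed was turned off simply stops contributing samples,
so it drops out of the recent windows without any explicit wipe.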
- How can we prevent it from entering inventory directly after an
uninventory again?
If we are able to supply a feed with a list of types/resources to not
report on then it's not a problem.
- What states do we need (data wiping may take too long to be done
synchronously)?
Don't wipe data, let it be purged later and just be irrelevant based on
the query windows being looked at.
- would users perhaps want a data dump to external storage before
wiping?
Not a problem if we keep the data.
- How can we make sure that servers without that uuid can still be
identified even after rename (of the machine)?
I think this is a problem to ignore. If the feed name/resource path
changes then it's basically a new resource. I don't think we need to be
this complex. If we maintain our feed name I think the resource path
should remain the same for the resources it is monitoring. If we don't
do that already then we should look at doing that.
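A sketch of what that could look like, assuming the resource path is
built only from the feed name and the feed-local path, never from the
machine's hostname. The path layout and the resource_path() helper are
hypothetical and not Hawkular's actual path format:

```python
def resource_path(feed_name, *segments):
    """Build a resource path from the feed name and feed-local segments.

    The hostname is deliberately not an input, so renaming the machine
    cannot change the path as long as the feed name is kept.
    """
    parts = [feed_name, *segments]
    return "/" + "/".join(part.strip("/") for part in parts)


# Same feed name before and after a machine rename: same resource.
before_rename = resource_path("feed-1", "Local", "datasource")
after_rename = resource_path("feed-1", "Local", "datasource")

# A different feed name yields a different path, i.e. a new resource.
other_feed = resource_path("feed-2", "Local", "datasource")
```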
- How can we make sure that servers without that uuid can still be
identified even when they are moved to a different pod?
Maybe the same as above, I'm not sure.