I agree 100% with what Jay and Larry suggest here, but I want to point out one
other thing that Jay mentioned on the last water-cooler call.
One of the problems with discovery and (un)inventory in RHQ was that, by
default, we discovered and inventoried A LOT. Some users then had to go back,
uninventory the unnecessary resources, and ignore them in the discovery queue.
What Jay suggested, and I thought was a brilliant idea, was to "start small"
and only discover a very "crude picture" of the resource tree, with a hint of
what could be discovered in addition. For example: discover an EAP resource
with some basic stats but no child resources (deployments, subsystems, etc.),
with the hint of what could be discovered coming from the hierarchical
resource types. Granted, those are not in inventory right now, but that is
easily changed.
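A minimal sketch of the "start small" idea, assuming a lookup table of
hierarchical resource types is available to discovery. The type names, the
TYPE_HIERARCHY table and shallow_discover() are hypothetical and are not
existing RHQ or Hawkular APIs:

```python
# Hypothetical resource type hierarchy: parent type -> possible child types.
TYPE_HIERARCHY = {
    "EAP Server": ["Deployment", "Subsystem", "Datasource"],
    "Deployment": ["Servlet"],
}


class DiscoveredResource:
    """A top-level resource plus hints about what else could be discovered."""

    def __init__(self, name, resource_type, basic_stats, discoverable_child_types):
        self.name = name
        self.resource_type = resource_type
        self.basic_stats = basic_stats
        # Child types that exist in the type hierarchy but were NOT scanned.
        self.discoverable_child_types = discoverable_child_types


def shallow_discover(name, resource_type, basic_stats):
    """Inventory only the resource itself; children are offered as hints."""
    hints = list(TYPE_HIERARCHY.get(resource_type, []))
    return DiscoveredResource(name, resource_type, dict(basic_stats), hints)


eap = shallow_discover("eap-1", "EAP Server", {"uptime_s": 120})
```

Here eap carries no child resources at all, only the list of child types the
user could opt in to later.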
On Tuesday, September 15, 2015 21:10:14 Larry O'Leary wrote:
On Tue, Sep 15, 2015 at 3:04 PM, Jay Shaughnessy <jshaughn(a)redhat.com> wrote:
> On 9/15/2015 3:12 PM, Heiko W.Rupp wrote:
> > On 15 Sep 2015, at 19:52, Jay Shaughnessy wrote:
> >> No. I think data should just live on until some TTL perhaps kicks in.
> >
> > That would work for metrics (out of the box).
> > What about other data like alert trigger definitions?
> > Or group memberships?
>
> Perhaps we look at a reaper sort of approach where we look for "dead"
> resources in inventory, based on some sort of criteria, and then perform
> clean-up.
>
From my PoV the data should just be kept and the legacy concept of
uninventory should just become a hybrid of disable and ignore.
Collection of new data should not occur as the resource has been
"de-activated".
Alert trigger defs and group membership should be made invisible.
All historic data is retained until its TTL expires, and all configuration
defs/settings, collection schedules and the like would just remain in place.
Because the resource is "disabled" re-discovery shouldn't be an issue. If
the user wants to "re-activate" it, then everything remains intact. The
downside to this would be that a user could not "reset" the resource and
start from scratch. But, I am not really sure that this is a problem
assuming default configuration/settings/templates can be reapplied.
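A minimal sketch of that "disable + ignore" hybrid, assuming an inventory
entry simply carries an active flag. The class and method names are
hypothetical, not Hawkular inventory APIs:

```python
class ManagedResource:
    """Hypothetical inventory entry; nothing is deleted on de-activation."""

    def __init__(self, resource_id, schedules, trigger_defs, groups):
        self.resource_id = resource_id
        self.active = True
        self.schedules = dict(schedules)
        self.trigger_defs = list(trigger_defs)
        self.groups = list(groups)

    def deactivate(self):
        # Replaces the legacy "uninventory": flip a flag, keep everything.
        self.active = False

    def reactivate(self):
        self.active = True

    def should_collect(self):
        # No new data is collected while the resource is de-activated.
        return self.active

    def visible_trigger_defs(self):
        # Trigger definitions are hidden, not removed.
        return list(self.trigger_defs) if self.active else []

    def visible_groups(self):
        # Group memberships are hidden, not removed.
        return list(self.groups) if self.active else []


war = ManagedResource("my-super-app", {"heap_used": 60}, ["heap > 90%"], ["prod"])
war.deactivate()
hidden = war.visible_trigger_defs()
war.reactivate()
restored = war.visible_trigger_defs()
```

Because de-activation only flips the flag, re-activation needs no restore
step; the schedules and definitions were never touched.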
> > I also recall from RHQ that we had a situation where a user was
> > deploying my-super-app-v1.war and later my-super-app-v2.war,
> > where both deployments are the same app and even with a different
> > name just two different versions of the same resource and not
> > different resources. Also in this case, the user wants all metrics,
> > trigger definitions etc. to survive the deployment of v2.
>
> You are right about this and I hadn't forgotten about it. But this feature
> is a b*tch. It was really hard in RHQ and may be nigh impossible in
> Hawkular unless we basically do the same thing and start stripping
> versions out of resource paths. Or maybe we're already doing that, I
> don't know. It depends on whether the agent is using the path as-is from
> wfly, which afaik is still using the artifact name in the path.
I think it would be best just to provide a resource reconciliation function
that allows a user to perform operations such as:
- Resource A and Resource B are the same
- Merge Resource A and Resource B as Resource A | B | C
- Resource B is version 2 of Resource A
Because resources would no longer be deleted/removed from inventory, it
allows the merging of updated/new resources without any issue -- assuming
the types are the same.
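A minimal sketch of such a reconciliation function, assuming each resource
keeps a set of aliases and a list of versions. MergeableResource and merge()
are hypothetical names used only for illustration:

```python
class MergeableResource:
    """Hypothetical inventory entry that can absorb other entries."""

    def __init__(self, resource_id, resource_type, version=None):
        self.resource_id = resource_id
        self.resource_type = resource_type
        self.versions = [version] if version is not None else []
        self.aliases = set()  # ids of resources merged into this one


def merge(target, source):
    """Fold source into target; refuse when the resource types differ."""
    if target.resource_type != source.resource_type:
        raise ValueError("cannot merge resources of different types")
    target.aliases.add(source.resource_id)
    target.aliases |= source.aliases
    for version in source.versions:
        if version not in target.versions:
            target.versions.append(version)
    return target


# "Resource B is version 2 of Resource A"
v1 = MergeableResource("my-super-app-v1.war", "Deployment", "v1")
v2 = MergeableResource("my-super-app-v2.war", "Deployment", "v2")
merge(v1, v2)
```

After the merge, v1 answers for both artifact names, so metrics and trigger
definitions attached to it would survive the deployment of v2.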
So, from your original questions:
> Some questions that come to mind:
> - Should we wipe data when a server gets uninventoried?
No.
> - How can we prevent it from entering inventory directly after an
> uninventory again?
If it is simply disabled/ignored, we won't have to. We only need a way to
let the user "re-activate" it. This would be similar to what was done in
RHQ with re-discovery.
> - what states do we need (data wiping may take too long to be done
> synchronously)?
Not necessary. Data should just expire on its own based on data retention
settings.
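A minimal sketch of retention-based expiry, assuming data points are stored
as (timestamp, value) pairs and the retention setting is a simple time span.
expire_old_points() is a hypothetical helper, not a Hawkular Metrics API:

```python
from datetime import datetime, timedelta


def expire_old_points(points, ttl, now):
    """Return only the (timestamp, value) pairs that are not older than ttl."""
    cutoff = now - ttl
    return [(ts, value) for ts, value in points if ts >= cutoff]


now = datetime(2015, 9, 15, 21, 0, 0)
points = [
    (now - timedelta(days=10), 1.0),  # older than the retention window
    (now - timedelta(days=2), 2.0),
    (now, 3.0),
]
kept = expire_old_points(points, timedelta(days=7), now)
```

Nothing here depends on whether the resource is active, which is why no
extra "wiping" state is needed.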
> - would users perhaps want a data dump to external storage before
> wiping?
Yes. For the resource that was "de-activated" the user should still be able
to get to its data/metric views and grab the data that has not yet expired.
> - How can we make sure that servers without that uuid can still be
> identified even after rename (of the machine)?
> - How can we make sure that servers without that uuid can still be
> identified even when they are moved to a different pod?
As Jay indicated in his response, I am not sure this is really a problem.
Wouldn't a new UUID equal a new resource?
Assuming there was a reconciliation feature, the user could re-link a new
UUID to an existing resource with a different UUID if they are confident
that it is the same resource. It would be preferable for this to happen
automatically, but ultimately the user knows best and should have the ability
to link/merge the resources.
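A minimal sketch of user-driven re-linking, assuming inventory keeps a small
alias table from a new UUID to the UUID of the existing resource. UuidLinks
and its methods are hypothetical names:

```python
class UuidLinks:
    """Hypothetical alias table: new UUID -> UUID of the existing resource."""

    def __init__(self):
        self._canonical = {}

    def link(self, new_uuid, existing_uuid):
        """Record that new_uuid is the same resource as existing_uuid."""
        target = self.resolve(existing_uuid)
        if target != new_uuid:  # never link a UUID to itself
            self._canonical[new_uuid] = target

    def resolve(self, uuid):
        """Follow links until the UUID of the original resource is reached."""
        seen = set()
        while uuid in self._canonical and uuid not in seen:
            seen.add(uuid)
            uuid = self._canonical[uuid]
        return uuid


links = UuidLinks()
links.link("uuid-after-rename", "uuid-original")
links.link("uuid-after-pod-move", "uuid-after-rename")
```

Both later UUIDs resolve to "uuid-original", so history stays attached to one
resource, while an unlinked UUID resolves to itself and is treated as new.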