On Apr 8, 2015, at 10:07 AM, Michael Burman
<miburman(a)redhat.com> wrote:
Hi,
Earlier today I created HWKMETRICS-54, but I later thought about it a bit more and to me
it looks like we're not sure what the tags should really do and how the system should
be used. Let's assume the following scenario:
You are correct to some degree about tags. We do know that we want to support
querying/filtering by tags. We also plan to use tags for configuring things like data
retention and aggregation. Right now we have limited support for querying for data points
by tags. We do not yet have support for querying by metric tags.
3 machines, each one running an agent that provides data for memory.free, memory.cached,
cpu.idle, cpu.user, cpu.system, disk.free, disk.used
Now what a user might want to do in these cases is:
a) I want to get all the statistics affecting host 1
b) I want to get all the memory.free statistics from each host
Think about the data modeling in our current hawkular-metrics for a moment. The user
starts:
I. Model everything with machine1.memory.free, machine2.memory.free etc
a) How to query machine1.* ? Can't be done. b) How to get *.memory.free? Can't be
done.
It cannot be done yet because we only have support right now for filtering individual data
points by tag(s). Assuming each metric from machine1 has the tag, machine=machine1, then I
think we should support filtering like,
machine = machine1              -> retrieve data points for all metrics on machine1
machine = *                     -> retrieve data points for all metrics on machines 1, 2, 3
machine = [machine1, machine2]  -> retrieve data points from machines 1, 2
machine = regular expression    -> retrieve data points for metrics whose machine tag
                                   value matches the regex
I think this would handle both querying machine1 metrics and querying memory.free metrics.
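
To make the four filter forms above concrete, here is a minimal Python sketch of how such matching could behave. It is only an illustration under assumptions: `tag_matches`, `filter_metrics`, and the in-memory `METRICS` dictionary are hypothetical names invented for this example and are not part of the hawkular-metrics API.

```python
import re
from typing import Dict, List, Optional, Union

# Hypothetical filter forms: "*" (any value), an exact string,
# a list of allowed values, or a compiled regular expression.
TagFilter = Union[str, List[str], re.Pattern]


def tag_matches(value: Optional[str], flt: TagFilter) -> bool:
    """Return True if a tag value satisfies the filter."""
    if value is None:  # the metric does not carry this tag at all
        return False
    if isinstance(flt, re.Pattern):
        return flt.fullmatch(value) is not None
    if isinstance(flt, list):
        return value in flt
    if flt == "*":
        return True
    return value == flt


def filter_metrics(metrics: Dict[str, Dict[str, str]],
                   tag: str, flt: TagFilter) -> List[str]:
    """Return the sorted ids of metrics whose `tag` matches `flt`."""
    return sorted(name for name, tags in metrics.items()
                  if tag_matches(tags.get(tag), flt))


# Made-up metric definitions, tagged as in the example from this thread.
METRICS = {
    "machine1.memory.free": {"machine": "machine1", "category": "memory"},
    "machine1.cpu.idle": {"machine": "machine1", "category": "cpu"},
    "machine2.memory.free": {"machine": "machine2", "category": "memory"},
    "machine3.memory.free": {"machine": "machine3", "category": "memory"},
}
```

With this sketch, `filter_metrics(METRICS, "machine", "machine1")` covers the "everything on machine1" case, while the "memory.free on every host" case needs a second tag such as `category` (or a match on the metric name), since the machine tag alone does not identify the metric type.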
How would the query succeed with our current format? By creating tags for every occasion
on metric creation:
create machine1.memory.free (tags: machine='machine1', category='memory')
create machine2.memory.free (tags: machine='machine2', category='memory')
What the user then asks is: "why on earth do I have the metricId at all?"
It's a good question. In our current structure we should remove metricId and just
invent something random for better Cassandra partitioning.
What it probably should look like (and this is how I assumed it was to be done until I
checked the unit tests and found out that nothing points to this OpenTSDB-like
method):
II. memory.free and cpu.idle are the metricIds, and I define that each has a parameter
'machine'. When pushing a metric I set a tag with a value, such as
machine='machine2'.
You have strayed into the topic of adding more explicit grouping of metrics, something I
have thought about as well. Remember that metric id is basically the primary key, and the
PK consists of the partition key and any number of optional clustering columns. Partition
keys have to be unique. In your example the tag machine=machine2 would really have to be
a special tag that winds up being part of the partition key; otherwise, our keys will not
be unique.
I think that I tend to favor more explicit grouping. Tags can be changed, and the
partition key cannot change. So let’s say we have an explicit (and maybe optional)
grouping parameter. We can use host. Then the actual partition key still winds up being
machine1.memory.free and so on. And sure, the client can query for all metrics belonging
to machine1 just by specifying host = machine1. Isn’t this just a special case of
filtering by tags?
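
A rough Python sketch of the explicit-grouping idea, assuming an optional `host` grouping parameter that becomes part of an immutable key while ordinary tags stay mutable. `MetricKey` and `MetricIndex` are hypothetical names made up for illustration; this is not how hawkular-metrics or Cassandra is actually implemented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MetricKey:
    """Immutable identity of a metric: name plus optional host grouping."""
    name: str
    host: Optional[str] = None

    def partition_key(self) -> str:
        # The grouping value is folded into the key, e.g. machine1.memory.free
        return f"{self.host}.{self.name}" if self.host else self.name


class MetricIndex:
    """In-memory stand-in for the metrics index (illustration only)."""

    def __init__(self) -> None:
        self._tags: Dict[MetricKey, Dict[str, str]] = {}

    def register(self, key: MetricKey, **tags: str) -> None:
        if key in self._tags:  # partition keys have to be unique
            raise ValueError(f"duplicate key: {key.partition_key()}")
        self._tags[key] = dict(tags)

    def retag(self, key: MetricKey, **tags: str) -> None:
        # Tags can change freely; the key itself never does.
        self._tags[key].update(tags)

    def tags(self, key: MetricKey) -> Dict[str, str]:
        return dict(self._tags[key])

    def by_host(self, host: str) -> List[str]:
        # Querying by the grouping parameter looks like a tag filter.
        return sorted(k.partition_key() for k in self._tags if k.host == host)
```

In this sketch, changing a tag via `retag` leaves `partition_key()` untouched, which is the property that motivates keeping the grouping parameter separate from ordinary tags.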
Now when I fetch the metric "memory.free", I can get all the memory.free
values with the 'machine' tag indicating which machine each was gathered from. If I
need to search for all machine statistics, then I could use the tag searching. If I wanted
only machine1 memory.free, I would add a filter: tag machine='machine1'.
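
As a sketch of what model II could look like from the client side, assuming data points are stored under a shared metricId and carry their own tags: `TaggedStore`, `push`, `query`, and `query_by_tag` are hypothetical names for this illustration, and the sample values are made up. None of this is taken from hawkular-metrics or OpenTSDB code.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, NamedTuple


class DataPoint(NamedTuple):
    timestamp: int
    value: float
    tags: Dict[str, str]


class TaggedStore:
    """Data points keyed by metricId, each tagged at push time."""

    def __init__(self) -> None:
        self._points: DefaultDict[str, List[DataPoint]] = defaultdict(list)

    def push(self, metric_id: str, timestamp: int, value: float,
             **tags: str) -> None:
        # No metric definition has to exist before the first push.
        self._points[metric_id].append(DataPoint(timestamp, value, dict(tags)))

    def query(self, metric_id: str, **tag_filter: str) -> List[DataPoint]:
        """All points of one metric, optionally narrowed by exact tag values."""
        return [p for p in self._points.get(metric_id, [])
                if all(p.tags.get(k) == v for k, v in tag_filter.items())]

    def query_by_tag(self, **tag_filter: str) -> Dict[str, List[DataPoint]]:
        """Points of every metric that match the tags, grouped by metricId."""
        result = {}
        for metric_id in self._points:
            matched = self.query(metric_id, **tag_filter)
            if matched:
                result[metric_id] = matched
        return result


store = TaggedStore()
store.push("memory.free", 1000, 512.0, machine="machine1")
store.push("memory.free", 1000, 768.0, machine="machine2")
store.push("cpu.idle", 1000, 93.5, machine="machine1")
```

Here `store.query("memory.free")` returns the values from both machines, `store.query("memory.free", machine="machine1")` narrows that to one host, and `store.query_by_tag(machine="machine1")` gathers every statistic for that host.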
Or how are we supposed to model real-world use cases? The current model is quite
cumbersome and not even necessarily doable in many cases. Am I supposed to query for a
metric definition before pushing any metric? A new container could give me a new set of
parameters, or a new machine a new set of machine parameters, and I would need to remember
to register them instead of relying on pre-defined types that I might already know.
- Micke