I added a comment to HWKMETRICS-54 that is worth repeating here. We support tagging at two different levels: 1) the metric or time series as a whole and 2) individual data points. Tags applied to individual data points expire alongside the data. Metric tags, on the other hand, do not expire. I anticipate metric tags being used most frequently for query filtering as discussed below, for providing metadata that might be used by other services such as inventory or alerting, and for configuring things like aggregation and data retention.
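To make the two tagging levels concrete, here is a rough Python sketch (the class and method names are made up for illustration; this is not the Hawkular Metrics data model): data-point tags ride along with each point and disappear when the point's TTL expires, while metric tags live on the metric itself and persist.

```python
import time

class Metric:
    """Hypothetical sketch: non-expiring metric-level tags plus
    per-data-point tags that expire together with the data."""

    def __init__(self, metric_id, tags=None):
        self.metric_id = metric_id
        self.tags = tags or {}          # metric tags: never expire
        self.data_points = []           # (timestamp, value, tags, expires_at)

    def add_point(self, value, tags=None, ttl=7 * 24 * 3600, now=None):
        now = now if now is not None else time.time()
        self.data_points.append((now, value, tags or {}, now + ttl))

    def live_points(self, now=None):
        now = now if now is not None else time.time()
        return [p for p in self.data_points if p[3] > now]

m = Metric("machine1.memory.free", tags={"machine": "machine1"})
m.add_point(1024, tags={"sample": "raw"}, ttl=60, now=0)
assert m.tags["machine"] == "machine1"   # metric tag persists
assert len(m.live_points(now=30)) == 1   # point still live at t=30
assert m.live_points(now=120) == []      # point (and its tags) expired
```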

On Apr 8, 2015, at 10:51 AM, John Sanda <jsanda@redhat.com> wrote:


On Apr 8, 2015, at 10:07 AM, Michael Burman <miburman@redhat.com> wrote:

Hi,

Earlier today I created HWKMETRICS-54, but I later thought about it a bit more, and to me it looks like we're not sure what the tags should really do and how the system should be used. Let's assume the following scenario:

You are correct to some degree about tags. We do know that we want to support querying/filtering by tags. We also plan to use tags for configuring things like data retention and aggregation. Right now we have limited support for querying for data points by tags. We do not yet have support for querying by metric tags.


3 machines, each one running an agent that provides data for memory.free, memory.cached, cpu.idle, cpu.user, cpu.system, disk.free, disk.used

Now what user might want to do in these cases is:

a) I want to get all the statistics for host 1
b) I want to get all the memory.free statistics from each host

Think about the data modeling in our current hawkular-metrics for a moment. The user starts:

I. Model everything with machine1.memory.free, machine2.memory.free etc
a) How do I query machine1.*? Can't be done. b) How do I get *.memory.free? Can't be done.

It cannot be done yet because right now we only have support for filtering individual data points by tag(s). Assuming each metric from machine1 has the tag machine=machine1, then I think we should support filtering like:

machine = machine1               -> retrieve data points for all metrics on machine1
machine = *                      -> retrieve data points for all metrics on machines 1, 2, 3
machine = [machine1, machine2]   -> retrieve data points from machines 1, 2
machine = <regular expression>   -> retrieve data points for metrics whose machine tag value matches the regex

I think this would handle both querying machine1 metrics and querying memory.free metrics.                          
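The four filter forms above can be sketched in a few lines of Python (the `match_tag` helper is hypothetical, invented here for illustration; it is not part of any Hawkular Metrics API):

```python
import re

def match_tag(metric_tags, key, pattern):
    """Return True if the metric's tag `key` matches `pattern`.
    Supports: exact value, '*' wildcard, a list of values, or a
    compiled regex. Hypothetical sketch of the proposed filtering."""
    value = metric_tags.get(key)
    if value is None:
        return False
    if pattern == "*":
        return True                        # any metric carrying the tag
    if isinstance(pattern, list):
        return value in pattern            # machine = [machine1, machine2]
    if isinstance(pattern, re.Pattern):
        return bool(pattern.match(value))  # regex match on the tag value
    return value == pattern                # exact match

tags = {"machine": "machine1", "category": "memory"}
assert match_tag(tags, "machine", "machine1")
assert match_tag(tags, "machine", "*")
assert match_tag(tags, "machine", ["machine1", "machine2"])
assert match_tag(tags, "machine", re.compile(r"machine\d+"))
assert not match_tag(tags, "machine", "machine2")
```

With this, "all metrics on machine1" is the exact-match case and "memory.free on every machine" is just the same filter applied to a category or name tag instead.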


How would the query succeed with our current format? By creating tags for every occasion on metric creation:

create machine1.memory.free (tags: machine='machine1', category='memory')
create machine2.memory.free (tags: machine='machine2', category='memory')

What the user then asks is, "why on earth do I have the metricId at all?" It's a good question. In our current structure we could remove metricId and just generate something random for better Cassandra partitioning.

What it probably should look like (and this is how I assumed it was done until I checked the unit tests and found that there is nothing pointing to this OpenTSDB-style approach):

II. memory.free and cpu.idle are the metricIds, and I define that each has a parameter 'machine'. When pushing a metric I set a tag with a value, such as machine='machine2'.

You have strayed into the topic of adding more explicit grouping of metrics, something I have thought about as well. Remember that the metric id is basically the primary key, and the PK consists of the partition key and any number of optional clustering columns. Primary keys have to be unique. In your example the tag machine=machine2 would really have to be a special tag that winds up being part of the partition key; otherwise, our keys will not be unique.

I think that I tend to favor more explicit grouping. Tags can be changed, and the partition key cannot change. So let’s say we have an explicit (and maybe optional) grouping parameter. We can use host. Then the actual partition key still winds up being machine1.memory.free and so on. And sure, the client can query for all metrics belonging to machine1 just by specifying host = machine1. Isn’t this just a special case of filtering by tags?
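To make the "special case of tag filtering" point concrete, here is a hypothetical sketch where an optional host grouping parameter is folded into the partition key (the `partition_key` helper is invented for illustration, not the real schema):

```python
def partition_key(metric_id, host=None):
    """Build the effective partition key. With an explicit (optional)
    host grouping, machine1 + memory.free still yields the unique key
    'machine1.memory.free'. Hypothetical sketch, not the real schema."""
    return f"{host}.{metric_id}" if host else metric_id

keys = [partition_key("memory.free", host=h) for h in ("machine1", "machine2")]
assert keys == ["machine1.memory.free", "machine2.memory.free"]

# "All metrics belonging to machine1" then reduces to a group filter:
all_keys = keys + [partition_key("cpu.idle", host="machine1")]
machine1 = [k for k in all_keys if k.startswith("machine1.")]
assert machine1 == ["machine1.memory.free", "machine1.cpu.idle"]
```

The difference from plain tags is that the grouping is immutable: it is baked into the key at creation time, whereas a tag can be changed after the fact.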


Now when I fetch the metric "memory.free", I can get all the memory.free values with the 'machine' tag indicating which machine each value was gathered from. If I need to search across all machine statistics, I could use tag searching. If I wanted only machine1's memory.free, I would add a filter: tag machine='machine1'.

Or how are we supposed to model real-world use cases? The current model is quite cumbersome and not even necessarily doable in many cases. Am I supposed to query for a metric definition before pushing any metric? A new container could give me a new set of parameters, or a new machine a new set of machine parameters, and I would need to remember to register them, instead of relying on pre-defined types that I might know in advance.

 - Micke


_______________________________________________
hawkular-dev mailing list
hawkular-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hawkular-dev

