On Jul 22, 2015, at 10:13 AM, Thomas Segismont
<tsegismo(a)redhat.com> wrote:
Hi everyone,
Right now, when you query data in Metrics, you can, given a time range:
- get raw data
- get raw data having some tags
- get a "bucketed" view of raw data with on-the-fly aggregates
- get periods for which a condition on the value holds true
But you can't mix these capabilities.
For example, you can't ask for periods where the avg of a gauge,
computed over 1 min buckets, is greater than a threshold.
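
To make the example above concrete, here is a minimal sketch of composing the two capabilities in memory: bucket the raw points, then find the periods where the bucket average exceeds a threshold. It is not the Metrics API; the function names, the (timestamp_ms, value) point shape, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def bucket_averages(points, bucket_ms):
    """Group (timestamp_ms, value) points into fixed-width buckets and average each."""
    sums = defaultdict(lambda: [0.0, 0])
    for ts, value in points:
        start = ts - ts % bucket_ms  # align the timestamp to its bucket start
        sums[start][0] += value
        sums[start][1] += 1
    return [(start, total / count) for start, (total, count) in sorted(sums.items())]


def periods_above(buckets, bucket_ms, threshold):
    """Merge adjacent buckets whose average exceeds threshold into (start, end) periods."""
    periods = []
    for start, avg in buckets:
        if avg <= threshold:
            continue
        if periods and periods[-1][1] == start:
            # This bucket begins where the previous period ends: extend it.
            periods[-1] = (periods[-1][0], start + bucket_ms)
        else:
            periods.append((start, start + bucket_ms))
    return periods


# Hypothetical gauge samples: (timestamp in ms, value).
points = [(0, 10.0), (30_000, 30.0), (60_000, 90.0),
          (90_000, 70.0), (120_000, 95.0), (180_000, 5.0)]

buckets = bucket_averages(points, 60_000)            # 1 min buckets
hot_periods = periods_above(buckets, 60_000, 50.0)   # avg greater than 50
print(hot_periods)  # [(60000, 180000)]
```

A unified query API would presumably let the server do this composition in a single request instead of the client stitching two calls together.
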
It's not possible either to express AND/OR in tags or condition queries.
gayak has already added support for OR and is working on support for AND in
HWKMETRICS-180. He is also looking into a mini query language for tag filtering as part of
this work.
Searching by tags is useful, but users have to tag data manually (unless
the collector adds some predefined tags). It would be useful to be able
to operate on data coming from different metrics having similar names:
like tell me the average cpu usage over all my web hosts, provided the
metrics are _webhost*.cpu.usage_
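
As a rough sketch of the name-based selection described above, the following matches metric names against a glob pattern and averages all matching values. The metric names, the dictionary layout, and the average_across helper are hypothetical; only fnmatch and statistics from the Python standard library are real.

```python
from fnmatch import fnmatchcase
from statistics import mean


def average_across(metrics, pattern):
    """Average every value of every metric whose name matches the glob pattern.

    Note that '*' in fnmatch also matches dots, so a stricter matcher may be
    wanted if name segments must not be crossed.
    """
    matched = {name: values for name, values in metrics.items()
               if fnmatchcase(name, pattern)}
    all_values = [v for values in matched.values() for v in values]
    return sorted(matched), (mean(all_values) if all_values else None)


# Hypothetical in-memory store: metric name -> CPU usage samples in percent.
metrics = {
    "webhost1.cpu.usage": [20.0, 40.0],
    "webhost2.cpu.usage": [60.0, 80.0],
    "dbhost1.cpu.usage": [95.0],
    "webhost1.mem.usage": [50.0],
}

names, avg = average_across(metrics, "webhost*.cpu.usage")
print(names)  # ['webhost1.cpu.usage', 'webhost2.cpu.usage']
print(avg)    # 50.0
```
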
We added support for tagging data very early on in the project, but I am increasingly of
the mindset that we do not want this. Tagging metrics/time series and subsequently
searching for metric data based on those tags are common use cases. It might be worth
thinking about events in place of tagging individual data points.
When aggregating data, users should be able to provide the name of the
function, be it a builtin or user defined function.
I think user defined functions sound nice but not that critical. I think that we can cover
most use cases with built-in functions.
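
One way to picture "provide the name of the function" is a lookup table of built-ins that user-defined functions can be added to. This is only a sketch under assumptions: the FunctionRegistry class, the function names, and the rule that built-ins cannot be overridden are invented here and are not part of Hawkular Metrics.

```python
from statistics import mean, median

# Assumed set of built-in aggregates, keyed by the name a query would use.
BUILTIN_FUNCTIONS = {"avg": mean, "min": min, "max": max, "sum": sum, "median": median}


class FunctionRegistry:
    """Resolves an aggregation function by name, built-in or user defined."""

    def __init__(self):
        self._functions = dict(BUILTIN_FUNCTIONS)

    def register(self, name, func):
        # Design choice for this sketch: user functions may not shadow built-ins.
        if name in BUILTIN_FUNCTIONS:
            raise ValueError(f"cannot override built-in function: {name}")
        self._functions[name] = func

    def aggregate(self, name, values):
        try:
            func = self._functions[name]
        except KeyError:
            raise ValueError(f"unknown aggregation function: {name}") from None
        return func(values)


registry = FunctionRegistry()
registry.register("range", lambda values: max(values) - min(values))

samples = [3.0, 9.0, 6.0]
print(registry.aggregate("avg", samples))    # 6.0
print(registry.aggregate("range", samples))  # 6.0
```

With this shape, shipping only built-ins at first costs nothing: user-defined functions become an additional register call later, without changing how queries name a function.
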
Last but not least, when rollups are implemented, users should
probably not have to care about where the data is stored, whether they ask
for the 1-month average of a metric over the past year, or the 30-second
average over the past five minutes (think about the UI zooming graphs).
Agreed. The fact that we might be generating and storing multiple time series for a metric
is in large part an implementation detail to make querying more efficient and to reduce
storage footprint on disk. In RHQ we never exposed the generated time series. The date
range in the request determined which time series we queried.
Whether we generate aggregated metrics in an ad hoc fashion at query time or they are
pre-computed, we are still dealing with the same data types/structures. There should not
be two separate APIs.
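
A minimal sketch of hiding the stored series behind a single entry point: the caller states only the bucket width it wants, and the server picks the coarsest series that can still answer. The rollup names and intervals below are assumptions, and the selection rule is a stand-in; the thread does not spell out the rules RHQ actually used.

```python
# Hypothetical stored series, finest first: (name, interval in seconds).
# An interval of 0 stands for raw, un-aggregated data.
ROLLUPS = [("raw", 0), ("1min", 60), ("1hour", 3600), ("1day", 86400)]


def pick_series(bucket_seconds):
    """Return the coarsest stored series whose interval fits in the requested bucket."""
    chosen = ROLLUPS[0][0]
    for name, interval in ROLLUPS:
        if interval <= bucket_seconds:
            chosen = name  # later entries are coarser, so keep the last one that fits
    return chosen


print(pick_series(30))          # raw   (30-second buckets over the past five minutes)
print(pick_series(30 * 86400))  # 1day  (roughly 1-month buckets over the past year)
```

Because the choice happens behind one function, swapping pre-computed rollups for aggregates computed at query time would not change what the caller sees.
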
These examples are not made up; they are feedback I got while
presenting Hawkular. I understand Stefan heard similar comments during
Summit.
I'm starting this thread to gather inputs so that we can build a
powerful, unified query API. Feel free to provide more use cases and/or
ideas for implementation.
Thanks!
Thomas
_______________________________________________
hawkular-dev mailing list
hawkular-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hawkular-dev