Hi,
I just want to share some ideas about possibly having a language
to perform arithmetic / aggregations on metrics, before I forget
them...
Here's an example of what I would personally love to see in Hawkular:
----------------------------------------------
*Example:*
*sum(stats(rate((id(my_metric), tags(a=foo AND b=bar),
regexp(something_.+)), 5m), 10m))*
|
|=> "id", "tags" and "regexp" each return a set of raw metrics (0-1 for id,
0-n for tags and regexp)
|==> "(a,b,c)" takes n parameters, all sets of metrics, and flattens them into
a single set
|===> rate(set_of_raw_metrics, rate_period) computes the rate for each of
them and returns a set of metrics (map n=>n)
|====> stats(set_of_raw_metrics, bucket_size) bucketizes the raw metrics,
returning the same number of bucketized metrics (map n=>n)
|=====> sum(set_of_stats_metrics) sums the buckets across all metrics,
returning a single bucketized metric (fold n=>1)
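To make the chain above more concrete, here is a rough Python sketch of how each step could behave. Everything in it is an assumption for illustration only: the `Metric` class, the in-memory `store`, the function names (`by_id`, `by_tags`, `by_regexp`, `flatten`, `sum_buckets`), the idea that "regexp" matches on the metric id, that "rate_period" is the unit the rate is expressed in, and that a bucket holds only an average. None of this is existing Hawkular API.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (timestamp in seconds, value)


@dataclass
class Metric:
    id: str
    tags: Dict[str, str]
    points: List[Point]


# Selectors: each returns a (possibly empty) list of raw metrics.
def by_id(store: List[Metric], metric_id: str) -> List[Metric]:
    return [m for m in store if m.id == metric_id]


def by_tags(store: List[Metric], **tags: str) -> List[Metric]:
    # All given tags must match (the "AND" of the example).
    return [m for m in store if all(m.tags.get(k) == v for k, v in tags.items())]


def by_regexp(store: List[Metric], pattern: str) -> List[Metric]:
    # Assumption: the regexp is matched against the whole metric id.
    return [m for m in store if re.fullmatch(pattern, m.id)]


def flatten(*metric_sets: List[Metric]) -> List[Metric]:
    # "(a,b,c)": merge several sets into one; assumption: dedupe by id.
    merged: Dict[str, Metric] = {}
    for metric_set in metric_sets:
        for m in metric_set:
            merged.setdefault(m.id, m)
    return list(merged.values())


def rate(metrics: List[Metric], period: int) -> List[Metric]:
    # map n=>n: change between consecutive points, scaled to `period` seconds.
    out = []
    for m in metrics:
        pts = [
            (t1, (v1 - v0) / (t1 - t0) * period)
            for (t0, v0), (t1, v1) in zip(m.points, m.points[1:])
        ]
        out.append(Metric(m.id, m.tags, pts))
    return out


def stats(metrics: List[Metric], bucket_size: int) -> List[Dict[int, float]]:
    # map n=>n: each metric becomes {bucket_start: average of its points}.
    out = []
    for m in metrics:
        groups: Dict[int, List[float]] = {}
        for t, v in m.points:
            groups.setdefault(t - t % bucket_size, []).append(v)
        out.append({start: sum(vs) / len(vs) for start, vs in groups.items()})
    return out


def sum_buckets(bucketized: List[Dict[int, float]]) -> Dict[int, float]:
    # fold n=>1: add up the values of matching buckets.
    total: Dict[int, float] = {}
    for buckets in bucketized:
        for start, v in buckets.items():
            total[start] = total.get(start, 0.0) + v
    return total


store = [
    Metric("my_metric", {"a": "foo"}, [(0, 0.0), (60, 60.0), (120, 120.0), (180, 180.0)]),
    Metric("something_1", {"a": "foo", "b": "bar"}, [(0, 0.0), (60, 30.0), (120, 60.0), (180, 90.0)]),
    Metric("other", {}, [(0, 1.0), (60, 2.0)]),
]

selected = flatten(
    by_id(store, "my_metric"),
    by_tags(store, a="foo", b="bar"),
    by_regexp(store, r"something_.+"),
)
result = sum_buckets(stats(rate(selected, 60), 120))
```

Here "something_1" is picked up by both the tags and the regexp selectors but is only counted once, which is one possible reading of "a single set".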
*Other:*
Functions like "sum" that take stats_metrics could have an overloaded shortcut
"sum(set_of_raw_metrics, bucket_size)" that performs the bucketing itself.
In other words, the above example could be rewritten:
*sum(rate((id(my_metric), tags(a=foo AND b=bar), regexp(something_.+)),
5m), 10m)*
Note: we can do aggregations like "sum" on raw data if necessary; it just
means we have to interpolate, since raw data points from different metrics
generally don't share the same timestamps.
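As an illustration of what interpolating could mean here, below is a small Python sketch that sums raw series by linearly interpolating each one at the union of all timestamps. Linear interpolation, skipping timestamps that fall outside a series' range, and the names `interpolate` and `sum_raw` are all assumptions of mine, not a decided design.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

Point = Tuple[int, float]  # (timestamp, value), sorted by timestamp


def interpolate(points: List[Point], t: int) -> Optional[float]:
    """Linearly interpolate a series at time t; None outside its range."""
    if not points or t < points[0][0] or t > points[-1][0]:
        return None
    times = [p[0] for p in points]
    i = bisect_left(times, t)
    if times[i] == t:
        return points[i][1]
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def sum_raw(series: List[List[Point]]) -> List[Point]:
    """Sum several raw series at every timestamp where all of them are defined."""
    timestamps = sorted({t for pts in series for t, _ in pts})
    out = []
    for t in timestamps:
        values = [interpolate(pts, t) for pts in series]
        if all(v is not None for v in values):
            out.append((t, sum(values)))
    return out


a = [(0, 0.0), (10, 10.0)]
b = [(5, 1.0), (15, 3.0)]
summed = sum_raw([a, b])  # only t=5 and t=10 are covered by both series
```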
*Scalar operations:*
*sum((id(a_metric_in_milliseconds), 1000*id(a_metric_in_seconds)), 10m)*
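A possible way to think about the scalar part: multiplying a set of metrics by a number just scales every value of every metric in the set. A minimal Python sketch, where the `Series` and `MetricSet` classes are hypothetical names used only for this example:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Series:
    id: str
    points: List[Tuple[int, float]]  # (timestamp, value)


@dataclass
class MetricSet:
    series: List[Series]

    def __mul__(self, factor: float) -> "MetricSet":
        # Scale every value; timestamps and ids are left untouched.
        return MetricSet(
            [Series(s.id, [(t, v * factor) for t, v in s.points]) for s in self.series]
        )

    # Lets the scalar come first, as in "1000*id(...)".
    __rmul__ = __mul__


in_seconds = MetricSet([Series("a_metric_in_seconds", [(0, 1.5), (60, 2.0)])])
in_millis = 1000 * in_seconds
```

The scaled set can then be flattened with other sets and aggregated like any other set of raw metrics.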
----------------------------------------------
Of course, many other functions could be added to grow the library.
Now I suppose the big question, if we want to do such a thing, is "are we
going to invent our own language?" I don't know if there are standards for
this, or if they are any good.
The Prometheus query language cannot be transposed as is, because a label is
not a tag: it makes no sense for us to write something like
"my_metric{tag=foo}"; it's either "my_metric" or
"tag=foo".
The same language could be used both for read-time on-the-fly aggregations
and write-time / cron-based rollups.
WDYT?