[metrics] Internal stats?
by Heiko W. Rupp
Hey,
what internal stats of Hawkular Metrics do we currently collect?
I think Joel did some work for the C* part.
What I think we need is
- number of data points stored on a per-tenant basis.
Resolution could be something like "last minute" or
"last 5 minutes", i.e. not real-time updates in the table.
- Total number of data points (i.e. sum over all tenants)
- Query stats. This is probably more complicated, as
querying on metrics that are still in some buffer is
cheaper than over 3 years of raw data.
To get started I'd go with # of queries per tenant and globally.
Those could perhaps be differentiated on
- raw endpoint
- stats endpoint
- What about alerting? More alert definitions certainly
need more CPU, so the number of alert definitions per tenant
and in total would be another pair.
- Does the number of fired alerts also make sense?
The idea behind those is to get some usage figures for the
shared resource "Hawkular Metrics" and then to be able to
charge usage back to individual tenants, e.g. inside of
OpenShift.
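For illustration, a minimal sketch of what the per-tenant counters could look like, written against the Prometheus Go client purely as an example (Hawkular Metrics itself is Java, and the metric name and label here are made up):

package main

import "github.com/prometheus/client_golang/prometheus"

// Example only: one counter series per tenant; the global total is
// just the sum over all tenant series. Flushing these to a stats
// table every minute or five (rather than updating a table in real
// time) matches the resolution described above.
var dataPointsPerTenant = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "datapoints_stored_total",
        Help: "Data points stored, per tenant.",
    },
    []string{"tenant"},
)

// recordWrite would be called from the ingestion path.
func recordWrite(tenant string, count int) {
    dataPointsPerTenant.WithLabelValues(tenant).Add(float64(count))
}

func main() {
    prometheus.MustRegister(dataPointsPerTenant)
    recordWrite("tenant-1", 100)
    recordWrite("tenant-2", 42)
}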
hosa status endpoint now secured behind openshift secret
by John Mazzitelli
If you are deploying HOSA using its Makefile and you are using HOSA's status endpoint (Heiko :-) you might want to update your blogs on this), just a heads up that the /status endpoint is now secured behind credentials defined in an OpenShift secret. So if you point your browser to the new route, for example, you'll see that it asks you for a username/password now.
By default, the status endpoint is disabled, but the YAML our Makefile uses will enable it and put it behind a secret that is created for you. The credentials are fixed in the secret the Makefile creates (see the config.yaml example file to know what they are; it's the same credentials that are in the secret), but you are free to base64 encode your own credentials in a secret and use that.
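For what it's worth, a minimal sketch of producing the base64-encoded values to put into your own secret (the key names and credentials here are just placeholders; check the YAML the Makefile generates for the real keys):

package main

import (
    "encoding/base64"
    "fmt"
)

func main() {
    // Secret data values must be base64 encoded; these key names and
    // credentials are placeholders.
    creds := map[string]string{
        "username": "myuser",
        "password": "mypassword",
    }
    for key, value := range creds {
        fmt.Printf("%s: %s\n", key, base64.StdEncoding.EncodeToString([]byte(value)))
    }
}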
Inventory and 'transient' servers
by Heiko W. Rupp
Hey,
when WildFly connects to Inventory for the first time, we sync
the EAP information into Inventory, which also includes the information
about which metrics are available.
Now when WildFly is deployed into Kubernetes or OpenShift, we
will see that WildFly is started, syncs, and then dies at some point
in time, where k8s will not re-use the existing instance, but start
a new one, which will have a different feed ID.
This leaves a WildFly in Inventory that is later detected as
down in Metrics/Alerting, but the entry in Inventory will stay
forever. Consequences are
- Inventory will get "full" with no-longer-needed information
- Clients will retrieve data about non-"usable" servers
We need to figure out how to deal with this situation, e.g.:
- have inventory listen on k8s events and remove the server
when k8s removes it (not when it is going down; stopped pods
can stay around for some time)
- Have a different way of creating the feed ID / server ID so that
the state is "re-used". Something like the feed ID / server name could
be the name of the deployment config + version + k8s tenant (see the
sketch below).
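For illustration, a minimal sketch of such a stable ID (the function name and inputs are hypothetical; the point is that restarts of the same deployment config reproduce the same feed ID):

package main

import "fmt"

// stableFeedID derives the feed ID from deployment metadata instead of
// the pod instance, so a replacement pod re-uses the existing
// Inventory entry rather than creating a new feed.
func stableFeedID(tenant, deploymentConfig, version string) string {
    return fmt.Sprintf("%s/%s/%s", tenant, deploymentConfig, version)
}

func main() {
    fmt.Println(stableFeedID("myproject", "my-wildfly", "v3"))
}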
Thoughts?
Hawkular drill down on calls
by Kavin Kankeshwar
Hi,
I am using the Hawkular JVM agent to send metrics to the Hawkular controller,
but the one thing which I cannot do is drill down into where the time was
spent, i.e. drill down to the class which was taking time (something like
the AppDynamics agent).
So I wanted to check if I am missing something or if the feature is not yet
possible in Hawkular.
Regards,
--
Kavin.Kankeshwar
Ability to group by datapoint tag in Grafana
by Gareth Healy
The OpenShift Agent, when monitoring a Prometheus endpoint, creates a single
metric with tagged datapoints, i.e.:
https://github.com/coreos/etcd/blob/master/Documentation/v2/metrics.md#http-requests
I1228 21:02:01.820530 1 metrics_storage.go:155] TRACE: Stored [3]
[counter] datapoints for metric named
[pod/fa32a887-cd08-11e6-ab2e-525400c583ad/custom/etcd_http_received_total]:
[
{2016-12-28 21:02:01.638767339 +0000 UTC 622 map[method:DELETE]}
{2016-12-28 21:02:01.638767339 +0000 UTC 414756 map[method:GET]}
{2016-12-28 21:02:01.638767339 +0000 UTC 33647 map[method:PUT]}
]
But when trying to view this via the Grafana datasource, only 1 metric with
the aggregated counts is shown. What I'd like to do is something like the
below:
{
  "start": 1482999755690,
  "end": 1483000020093,
  "order": "ASC",
  "tags": "pod_namespace:etcd-testing",
  "groupDatapointsByTagKey": "method"
}
Search via tags or name (as-is) and group the datapoints by a tag key,
which would give you 3 lines instead of 1.
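For illustration, a minimal sketch of the grouping semantics in Go (the type and parameter names are hypothetical; Hawkular would do this server-side before returning the series):

package main

import "fmt"

// DataPoint mirrors the tagged datapoints in the trace above.
type DataPoint struct {
    Timestamp int64
    Value     float64
    Tags      map[string]string
}

// groupByTagKey splits one metric's datapoints into one series per
// distinct value of the given tag key, e.g. "method".
func groupByTagKey(points []DataPoint, key string) map[string][]DataPoint {
    series := make(map[string][]DataPoint)
    for _, p := range points {
        series[p.Tags[key]] = append(series[p.Tags[key]], p)
    }
    return series
}

func main() {
    points := []DataPoint{
        {1482958921638, 622, map[string]string{"method": "DELETE"}},
        {1482958921638, 414756, map[string]string{"method": "GET"}},
        {1482958921638, 33647, map[string]string{"method": "PUT"}},
    }
    // Three tag values -> three series -> three lines in Grafana.
    for method, pts := range groupByTagKey(points, "method") {
        fmt.Println(method, pts[0].Value)
    }
}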
Does that sound possible?
Cheers.
Fwd: [hawkular-dev] Grafana querying usability
by Joel Takvorian
Hi everybody,
I would like to get some opinions on the hawkular-grafana-datasource
querying usability, especially if you had the opportunity to create
dashboards recently and had to juggle with querying by metric name and by
tag.
Currently the panel displays different elements depending on whether you're
querying by metric name (see picture:
https://raw.githubusercontent.com/hawkular/hawkular-grafana-datasource/ma...
)
Querying by name is quite straightforward, but it can be cumbersome when
there are a lot of available metrics and you have to scroll through
suggestions to find the one you want.
Or by tag (see picture:
https://raw.githubusercontent.com/hawkular/hawkular-grafana-datasource/ma...
)
The "query by tag" interface is not very intuitive IMO (you define a list
of key-value pairs), moreover to this date there is no auto-completion on
tag name.
There have been some features in Metrics recently that, I think, can enable a
better UI on the Grafana side. First, there is Micke's feature "Allow fetching
of available tagNames" [1] that enables suggestions (auto-completion) on
tag names. And most importantly, there's the new "tag query language" that
could (should) have its place in Grafana.
So I would have several suggestions for improving the UI and queries.
*1*: First of all, I think we can remove the "Search by name" / "Search by
tag" selector and allow searching by name AND by tag at the same time:
providing tags would refine the available metrics in the metrics text field
suggestions (auto-completion). If this text field is left blank, all
available metrics are displayed.
Then, there are several scenarios to take advantage of the new Hawkular
Metrics features:
*2a*: keep the current key/value pairs system, but improve it by adding
suggestions on tag names.
*2b*: replace the key-value pairs with the new tags query, via a simple text
field.
We may or may not include syntax validation here. We must provide some
documentation on that.
*2c*: replace the key-value pairs with the new tags query, with a dedicated
"builder" UI.
Each of the boxes in the tags query would have a contextual auto-completion
feature (see the sketch after this list):
- suggestions on 1: list of tag keys
- suggestions on 2: list of operators (=, !=, IN, NOT IN)
- suggestions on 3: list of tag values for the given key (with some slight
differences on brackets depending on whether it's the first element; a
closing bracket is offered as a choice if it's not the first element)
- suggestions on 4: operators AND / OR
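For illustration, a minimal sketch of that contextual suggestion logic (the real plugin is JavaScript; the names and the position model here are hypothetical, and the key/value lists would come from the Metrics endpoints, e.g. [1] for the tag names):

package main

import "fmt"

// position identifies which box of the builder is being edited.
type position int

const (
    tagKeyPos    position = iota // box 1
    operatorPos                  // box 2
    tagValuePos                  // box 3
    logicalOpPos                 // box 4
)

// suggest returns the completion candidates for the box being edited.
func suggest(pos position, tagKeys []string, tagValues map[string][]string, currentKey string) []string {
    switch pos {
    case tagKeyPos:
        return tagKeys
    case operatorPos:
        return []string{"=", "!=", "IN", "NOT IN"}
    case tagValuePos:
        return tagValues[currentKey]
    case logicalOpPos:
        return []string{"AND", "OR"}
    }
    return nil
}

func main() {
    keys := []string{"pod_namespace", "method"}
    values := map[string][]string{"method": {"GET", "PUT", "DELETE"}}
    fmt.Println(suggest(tagValuePos, keys, values, "method")) // [GET PUT DELETE]
}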
Option 2b is obviously simpler and very fast to implement. It has the
downside of losing all auto-completion capabilities, even compared to the
current version.
2c looks nicer and more intuitive in its usage; people won't have to read
the doc to use it. However, there are several downsides:
- Need to implement the logic => needs development time and adds
complexity to our Grafana plugin.
- Introduces a dependency on the language on the server side. When the
language evolves, we'll have to maintain this as well.
Ideas & thoughts? I think it's preferable to proceed in several steps: as a
first step I could implement *1* and *2a*, and later (maybe giving the
language some time to "stabilize") go with *2c*.
[1] https://issues.jboss.org/browse/HWKMETRICS-532
Thanks
Joel
what to name metrics in HOSA?
by John Mazzitelli
HOSA has its own Prometheus endpoint that emits its own metrics. Right now, we just have one custom agent metric but we plan to add more. Before I get too far, I'm trying to figure out a good prefix to use for metric names.
I was looking over the Prometheus naming conventions for metric names here: https://prometheus.io/docs/practices/naming/
In addition, I found additional naming conventions listed in the Prom Go client comments here: https://github.com/prometheus/client_golang/blob/master/prometheus/metric...
Right now the one custom agent metric is called:
hawkular_openshift_agent_metric_data_points_collected_total
I think it's too long :) And the "subsystem" is two words (openshift_agent), whereas the Go comment says (and all other Prometheus metrics I've seen follow this) it should be one word with no underscore.
I think starting it with "hawkular_" is good because looking at the metric you immediately know it is from a Hawkular component. But I don't know what the subsystem should be.
I was thinking:
hawkular_openshiftagent_<metric-name>
That is a one-word subsystem, "openshiftagent", but it's still too long IMO. Maybe:
hawkular_agent_<metric-name>
But then, if our other agents emit their own metrics in the future, this will be confusing (think Vert.x agent, Fuse agent, whatever).
How about using the HOSA abbreviation?
hawkular_hosa_<metric-name>
That seems shorter and is more specific to the OpenShift Agent. But will "HOSA" make sense to people?
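For reference, a minimal sketch of how the "hosa" variant would be declared with the Prometheus Go client; Namespace, Subsystem, and Name are joined with underscores, yielding hawkular_hosa_metric_data_points_collected_total (the Help text here is made up):

package main

import "github.com/prometheus/client_golang/prometheus"

// The fully qualified metric name becomes namespace_subsystem_name.
var dataPointsCollected = prometheus.NewCounter(prometheus.CounterOpts{
    Namespace: "hawkular",
    Subsystem: "hosa",
    Name:      "metric_data_points_collected_total",
    Help:      "Total number of metric data points collected by the agent.",
})

func main() {
    prometheus.MustRegister(dataPointsCollected)
    dataPointsCollected.Inc()
}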
Thoughts? Suggestions?
Hawkular in OpenShift
by Thomas Heute
When I use:
oc cluster up --metrics=true
Hawkular Metrics fails until I do:
oc adm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra
Should that last command be part of the --metrics=true magic?
Thomas