Fwd: [hawkular-dev] Grafana querying usability
by Joel Takvorian
Hi everybody,
I would like to get some opinions on the hawkular-grafana-datasource
querying usability, especially if you had the opportunity to create
dashboards recently and had to juggle with querying by metric name and by
tag.
Currently the panel displays different elements depending on whether you're
querying by metric name (see picture:
https://raw.githubusercontent.com/hawkular/hawkular-grafana-datasource/ma...
)
Querying by name is quite straightforward, but it can be cumbersome when
there are a lot of available metrics and you have to scroll through the
suggestions to find the one you want.
Or by tag (see picture:
https://raw.githubusercontent.com/hawkular/hawkular-grafana-datasource/ma...
)
The "query by tag" interface is not very intuitive IMO (you define a list
of key-value pairs), and moreover, to date, there is no auto-completion on
tag names.
There have been some recent features in Metrics that, I think, can enable a
better UI on the Grafana side. First, there is Micke's feature "Allow fetching
of available tagNames" [1], which enables suggestions (auto-completion) on
tag names. And most importantly, there's the new "tag query language" that
could (should) have its place in Grafana.
So I have several suggestions for improving the UI and queries.
*1*: First of all I think we can remove the "Search by name" / "Search by
tag" selector, and allow searching by name AND by tag at the same time:
providing tags would refine the available metrics in the metrics text field
suggestions (auto-completion). If this text field is left blank, all
available metrics are displayed.
Then there are several scenarios for taking advantage of the new Hawkular
Metrics features:
*2a*: keep the current key/value pairs system, but improve it by adding
suggestions on tag names.
*2b*: replace the key-value pairs with the new tag query, via a simple text
field:
We may or may not include syntax validation here. We must provide some
documentation on that.
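For illustration, a query typed into such a text field might look like the
following (the tag names and values are hypothetical; the operators are the
=, !=, IN, NOT IN, AND, and OR operators of the tag query language):

```
environment IN ['prod', 'staging'] AND hostname != 'test-01'
```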
*2c*: replace the key-value pairs with the new tag query, with a dedicated
"builder" UI:
Each of these boxes in the tag query would have a contextual auto-completion
feature:
- suggestions on 1: list of tag keys
- suggestions on 2: list of operators (=, !=, IN, NOT IN)
- suggestions on 3: list of tag values for the given key (with some slight
differences on brackets depending on whether it's the first element; the
closing bracket is offered as a choice if it's not the first element)
- suggestions on 4: operators AND / OR
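The plugin itself is JavaScript, but the contextual suggestion logic for the
four boxes above can be sketched independently of the UI. This is a minimal
Go model, assuming helper callbacks that fetch tag keys and tag values from
the Metrics endpoints (e.g. the HWKMETRICS-532 tag-names feature); none of
these names come from the actual plugin code:

```go
package main

import "fmt"

// position identifies which of the four builder boxes is being completed.
type position int

const (
	posTagKey   position = iota // box 1: tag key
	posOperator                 // box 2: comparison operator
	posTagValue                 // box 3: tag value(s) for the chosen key
	posLogical                  // box 4: logical connector
)

// suggest returns the auto-completion candidates for a given box.
// knownKeys and valuesFor stand in for calls to the Hawkular Metrics
// tag endpoints; currentKey is the key chosen in box 1.
func suggest(pos position, knownKeys []string, valuesFor func(key string) []string, currentKey string) []string {
	switch pos {
	case posTagKey:
		return knownKeys
	case posOperator:
		return []string{"=", "!=", "IN", "NOT IN"}
	case posTagValue:
		return valuesFor(currentKey)
	case posLogical:
		return []string{"AND", "OR"}
	}
	return nil
}

func main() {
	keys := []string{"hostname", "environment"}
	values := func(key string) []string {
		if key == "environment" {
			return []string{"prod", "staging"}
		}
		return nil
	}
	fmt.Println(suggest(posOperator, keys, values, ""))
	fmt.Println(suggest(posTagValue, keys, values, "environment"))
}
```

The bracket subtleties for box 3 (first element vs. closing bracket) would
sit on top of this, in the UI layer.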
The 2b option is obviously simpler and very fast to implement. It has the
downside of losing all auto-completion capabilities, even compared to the
current version.
2c looks nicer and more intuitive to use; people won't have to read
the doc to use it. However, there are several downsides:
- We need to implement the logic, which takes development time and adds
complexity to our Grafana plugin.
- It introduces a dependency on the server-side query language: when the
language evolves, we'll have to maintain this as well.
Ideas & thoughts? I think it's preferable to proceed in several steps: as a
first step I could implement *1* and *2a*, and later (perhaps after giving the
language some time to stabilize) move on to *2c*.
[1] https://issues.jboss.org/browse/HWKMETRICS-532
Thanks
Joel
what to name metrics in HOSA?
by John Mazzitelli
HOSA has its own Prometheus endpoint that emits its own metrics. Right now, we just have one custom agent metric but we plan to add more. Before I get too far, I'm trying to figure out a good prefix to use for metric names.
I was looking over the Prometheus naming conventions for metric names here: https://prometheus.io/docs/practices/naming/
In addition, I found additional naming conventions listed in the Prom Go client comments here: https://github.com/prometheus/client_golang/blob/master/prometheus/metric...
Right now the one custom agent metric is called:
hawkular_openshift_agent_metric_data_points_collected_total
I think it's too long :) And the "subsystem" is two words (openshift_agent), whereas the Go comment says to use one word with no underscore (and all other Prometheus metrics I've seen do the same).
I think starting it with "hawkular_" is good because looking at the metric you immediately know it is from a Hawkular component. But I don't know what the subsystem should be.
I was thinking:
hawkular_openshiftagent_<metric-name>
That is a one-word subsystem, "openshiftagent", but it's still too long IMO. Maybe:
hawkular_agent_<metric-name>
But then, if our other agents emit their own metrics in the future, this will be confusing (think Vert.x agent, Fuse agent, whatever).
How about using the HOSA abbreviation?
hawkular_hosa_<metric-name>
That seems smaller and is more specific to the OpenShift Agent. But will "HOSA" make sense to people?
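In client_golang these prefix choices map to the Namespace and Subsystem fields of the metric options, and the client joins the parts with underscores (this is what prometheus.BuildFQName does). A stdlib-only sketch of that joining, just to compare the candidates side by side:

```go
package main

import (
	"fmt"
	"strings"
)

// buildFQName mirrors the behavior of prometheus.BuildFQName in
// client_golang: join namespace, subsystem, and name with underscores,
// skipping any empty part.
func buildFQName(namespace, subsystem, name string) string {
	parts := []string{}
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	// The candidate prefixes discussed above, applied to the same metric:
	fmt.Println(buildFQName("hawkular", "openshiftagent", "metric_data_points_collected_total"))
	fmt.Println(buildFQName("hawkular", "agent", "metric_data_points_collected_total"))
	fmt.Println(buildFQName("hawkular", "hosa", "metric_data_points_collected_total"))
}
```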
Thoughts? Suggestions?
Hawkular in OpenShift
by Thomas Heute
When I use:
oc cluster up --metrics=true
Hawkular metrics fails until I do:
oc adm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra
Should that last command be part of the --metrics=true magic ?
Thomas
hosa and its own role
by John Mazzitelli
Playing around with OpenShift roles, I found the agent doesn't need the vast majority of permissions the cluster-reader role provides.
So, rather than assign the agent the cluster-reader role, I instead created a single role for the agent that provides only the permissions the agent actually needs to do its job and no others:
https://github.com/hawkular/hawkular-openshift-agent/pull/87/files#diff-e...
So far, this looks to be working. Heiko, feel free to try this out. It's part of that use-secrets PR/branch.
hosa using secrets for endpoint credentials
by John Mazzitelli
Right now, the Hawkular OpenShift Agent (HOSA) can pass HTTP authentication headers to endpoints it is monitoring, but you have to declare the credentials in the pod's configmap's endpoints section:
endpoints:
- type: jolokia
  credentials:
    username: myuser
    password: mypass
We would like to figure a better way. One way Heiko mentioned was to see if we can use OpenShift's secrets right here in the credentials section.
So I created a PoC to see if and how it can work. I have it working here in my own branch:
https://github.com/jmazzitelli/hawkular-openshift-agent/tree/use-secrets
After building and deploying the agent in my OpenShift environment, I then created a secret (via the OS console) in the project where my jolokia pod lives. The secret is called "foo2" and has two keys defined: "password" and "username". I then tell the agent about this by passing in credentials as described above, but I prefix the values with "secret:" to tell the agent to expect the actual values to be found in the secret. The full syntax of the credential values is "secret:<secret name>/<secret key>". So for example, I can have this in my configmap:
endpoints:
- type: jolokia
  credentials:
    username: secret:foo2/username
    password: secret:foo2/password
It can optionally use bearer tokens:
endpoints:
- type: jolokia
  credentials:
    token: secret:foo2/password
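The "secret:<secret name>/<secret key>" convention boils down to a small parsing step before the agent decides whether a credential value is a literal or a secret reference. A minimal Go sketch of that step; the function name and shape are illustrative, not the agent's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// parseSecretRef checks whether a credential value uses the
// "secret:<secret name>/<secret key>" syntax. If it does not,
// the value is a literal credential and should be used as-is.
func parseSecretRef(value string) (secretName, secretKey string, ok bool) {
	const prefix = "secret:"
	if !strings.HasPrefix(value, prefix) {
		return "", "", false
	}
	rest := strings.TrimPrefix(value, prefix)
	name, key, found := strings.Cut(rest, "/")
	if !found || name == "" || key == "" {
		return "", "", false // malformed reference, treat as literal
	}
	return name, key, true
}

func main() {
	name, key, ok := parseSecretRef("secret:foo2/username")
	fmt.Println(name, key, ok) // foo2 username true
	_, _, ok = parseSecretRef("myliteralpassword")
	fmt.Println(ok) // false
}
```

With a parsed reference in hand, the agent would then need the Kubernetes API "get" verb on "secrets" to fetch foo2 and read the key, which is exactly the permission question raised below.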
There is one problem with this. I need to add a cluster role to the agent so it can read secrets (I need the verb "get" on the resource "secrets"). For testing, I am using the "system:node" role since that is one of the few that has that permission, but we'd really want a cluster role that only has "get" on "secrets"; we don't need all the perms that "system:node" provides, so we'd have to create our own role if need be.
But is this good? I do not know of any other way for the agent to be able to read secrets. Is it OK to require the agent to have "get" "secrets" permission? Is there another way to access secrets?
[hawkular-apm] tracing async operations
by John Sanda
I am reading through some of the Hawkular APM code and have been looking at how trace fragments get created and written on the client side. One of the classes I am reviewing is FragmentBuilder.java. Its javadocs state that the events within a thread of execution should be in sequence. It made me wonder whether it is possible to trace async operations. I found https://issues.jboss.org/browse/HWKAPM-77 which apparently improves support for async execution.
Gary, can you or anyone else familiar with how things work give a brief explanation of how async flows are handled? Almost everything in hawkular-metrics is async and reactive, which makes debugging difficult at times, since you have to rely primarily on logging. I am wondering if Hawkular APM would be a good tool for these types of situations.
- John
Integrating APM data into H-Metrics
by Gary Brown
Hi
I wanted to discuss a proposal for recording some metric data captured from Hawkular APM in Hawkular Metrics.
For those not familiar with Hawkular APM, it captures the end to end trace instance (think of it as a distributed call stack), for each invocation of an application. This trace can include information about the communications between services, but can also include details about internal components used within the services (e.g. EJBs, database calls, etc).
First point is that if we were to record duration metrics for each 'span' captured (i.e. scope within which a task is performed), for each invocation of an application, then it would result in a large amount of data that may be of no interest. So we need to find a way for end users/developers to express which key points within an application they do want recorded as metrics.
The proposal is to allow the application/services to define a tag/property on the spans of interest, e.g. 'kpi', that would indicate to the server that the duration value for the span should be stored in H-Metrics. The value for the tag should define the name/description of the KPI.
If considered a suitable solution, then we can also propose it as a standard tag in the OpenTracing standard.
There are a couple of metrics that we could record by default, first being the trace instance completion time, and the second possibly being the individual service response times (although this could potentially also be governed by the 'kpi' tag).
Thoughts?
Regards
Gary