Hawkular in OpenShift
by Thomas Heute
When I use:
oc cluster up --metrics=true
Hawkular metrics fails until I do:
oc adm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra
Should that last command be part of the --metrics=true magic?
Thomas
7 years, 8 months
hosa and its own role
by John Mazzitelli
Playing around with OpenShift roles, I found the agent doesn't need the vast majority of permissions the cluster-reader role provides.
So, rather than assigning the agent the cluster-reader role, I instead created a single role for the agent that provides only the permissions the agent actually needs to do its job and no others:
https://github.com/hawkular/hawkular-openshift-agent/pull/87/files#diff-e...
So far, this looks to be working. Heiko, feel free to try this out. It's part of that use-secrets PR/branch.
7 years, 8 months
some new hawkular openshift agent stuff
by John Mazzitelli
FYI: some new things went into HOSA (that's the Hawkular OpenShift Agent for the uninitiated).
1. The agent now emits its own metrics and can monitor itself. Right now it just emits some basic "go" metrics like memory usage, CPU usage, etc., along with one agent-specific metric - a counter of the number of data points it has collected in its lifetime. We'll add more metrics as we figure out the things people want to see, but we have the infrastructure in place now.
2. The agent is deployed as a daemonset. This means as new nodes are brought online, an agent will go along with it (or so I'm told :)
3. The agent has changed the way it discovers what to monitor - it no longer looks at annotations on pods to determine where the configmaps are for those pods. Instead, it looks up volume declarations to see if there is an agent configmap defined. This was done to be ready for the future when new security constraints will be introduced in OpenShift which would have broken our annotation approach. This approach using volumes should not hit that issue.
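To illustrate item 3, here is a minimal sketch (in Python rather than the agent's own Go code) of what "looking up volume declarations" could mean for a pod description. The pod layout follows the standard spec.volumes[].configMap.name structure; the volume name the agent looks for is not stated above, so AGENT_VOLUME_NAME is an assumption made purely for illustration.

```python
from typing import Optional

# Assumption: the post does not say how the agent recognizes its
# configmap volume, so this name is hypothetical.
AGENT_VOLUME_NAME = "hawkular-openshift-agent"


def find_agent_configmap(pod: dict) -> Optional[str]:
    """Return the name of the configmap declared in the agent volume, if any."""
    volumes = pod.get("spec", {}).get("volumes") or []
    for volume in volumes:
        if volume.get("name") != AGENT_VOLUME_NAME:
            continue
        config_map = volume.get("configMap")
        if config_map and config_map.get("name"):
            return config_map["name"]
    # No matching volume: the pod is not asking to be monitored.
    return None
```

A pod without such a volume simply yields None, so nothing in this sketch depends on pod annotations.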
NOTE: If you are building the latest agent from master, we added some dependencies so you have to update your dependencies via Glide by using the "make update-deps" target prior to building from source.
7 years, 8 months
hosa using secrets for endpoint credentials
by John Mazzitelli
Right now, the Hawkular OpenShift Agent (HOSA) can pass HTTP authentication headers to endpoints it is monitoring, but you have to declare the credentials in the pod's configmap's endpoints section:
endpoints:
- type: jolokia
  credentials:
    username: myuser
    password: mypass
We would like to figure out a better way. One way Heiko mentioned was to see if we can use OpenShift's secrets right here in the credentials section.
So I created a PoC to see if and how it can work. I have it working here in my own branch:
https://github.com/jmazzitelli/hawkular-openshift-agent/tree/use-secrets
After building and deploying the agent in my OpenShift environment, I created a secret (via the OpenShift console) in the project where my jolokia pod lives. The secret is called "foo2" and has two keys defined: "password" and "username". I then tell the agent about it by passing in credentials as I describe above, but I prefix the values with "secret:" to tell the agent that the actual values are to be found in the secret. The full syntax of a credentials value is "secret:<secret name>/<secret key>". So for example, I can have this in my configmap:
endpoints:
- type: jolokia
  credentials:
    username: secret:foo2/username
    password: secret:foo2/password
It can optionally use bearer tokens:
endpoints:
- type: jolokia
  credentials:
    token: secret:foo2/password
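For readers who want to see the "secret:<secret name>/<secret key>" syntax spelled out, the following is a rough Python sketch of how such a value could be parsed and resolved. It is not the code in the use-secrets branch (the agent is written in Go); the names SecretRef, parse_secret_ref, resolve_credential and the lookup callback are all hypothetical, and the callback stands in for whatever call actually reads the secret from OpenShift.

```python
from typing import Callable, NamedTuple, Optional

SECRET_PREFIX = "secret:"


class SecretRef(NamedTuple):
    name: str  # secret name, e.g. "foo2"
    key: str   # key within the secret, e.g. "password"


def parse_secret_ref(value: str) -> Optional[SecretRef]:
    """Parse "secret:<name>/<key>"; return None if value is a plain literal."""
    if not value.startswith(SECRET_PREFIX):
        return None
    name, sep, key = value[len(SECRET_PREFIX):].partition("/")
    if not sep or not name or not key:
        raise ValueError(f"malformed secret reference: {value!r}")
    return SecretRef(name, key)


def resolve_credential(value: str, lookup: Callable[[str, str], str]) -> str:
    """Return the literal value, or the secret's value if it is a reference.

    lookup(name, key) is a hypothetical stand-in for reading the secret.
    """
    ref = parse_secret_ref(value)
    if ref is None:
        return value
    return lookup(ref.name, ref.key)
```

With this shape, plain values such as "myuser" keep working unchanged, and only values carrying the "secret:" prefix trigger a secret read.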
There is one problem with this: I need to add a cluster role to the agent so it can read secrets (I need verb "get" on resource "secrets"). For testing, I am using the "system:node" role since that is one of the few that has that permission. What we'd really want is a cluster role that only has "get"/"secrets", since we don't need all the perms that "system:node" provides; we'd have to create our own role if need be.
But is this good? I do not know of any other way for the agent to be able to read secrets. Is it OK to require the agent to have "get" "secrets" permission? Is there another way to access secrets?
7 years, 8 months
[hawkular-apm] tracing async operations
by John Sanda
I am reading through some of the Hawkular APM code and have been looking at how trace fragments get created and written on the client side. One of the classes I am reviewing is FragmentBuilder.java. Its javadocs state that the sequence of events within a thread of execution should be in sequence. It made me wonder whether it is possible to trace async operations. I found https://issues.jboss.org/browse/HWKAPM-77 which apparently improves support for async execution.
Gary, can you or anyone else familiar with how things work give a brief explanation of how async flows are handled? Most everything in hawkular-metrics is async and reactive. That makes debugging difficult at times, since you have to rely primarily on logging. I am wondering if Hawkular APM would be a good tool for these types of situations.
- John
7 years, 8 months
Integrating APM data into H-Metrics
by Gary Brown
Hi
Wanted to discuss a proposal for recording some metric data captured from Hawkular APM in Hawkular Metrics.
For those not familiar with Hawkular APM, it captures the end to end trace instance (think of it as a distributed call stack), for each invocation of an application. This trace can include information about the communications between services, but can also include details about internal components used within the services (e.g. EJBs, database calls, etc).
First point is that if we were to record duration metrics for each 'span' captured (i.e. scope within which a task is performed), for each invocation of an application, then it would result in a large amount of data that may be of no interest. So we need to find a way for end users/developers to express which key points within an application they do want recorded as metrics.
The proposal is to allow the application/services to define a tag/property on the spans of interest, e.g. 'kpi', that would indicate to the server that the duration value for the span should be stored in H-Metrics. The value for the tag should define the name/description of the KPI.
If considered a suitable solution, then we can also propose it as a standard tag in the OpenTracing standard.
There are a couple of metrics that we could record by default, first being the trace instance completion time, and the second possibly being the individual service response times (although this could potentially also be governed by the 'kpi' tag).
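As a rough illustration of the 'kpi' tag idea, the Python sketch below filters a list of spans down to the ones that should be stored as metrics. The span layout (a dict with "tags" and "duration_ms") is an assumption made for this example and is not the Hawkular APM data model; only the tag name 'kpi' comes from the proposal.

```python
from typing import Iterable, List, Tuple

KPI_TAG = "kpi"  # tag name suggested in the proposal


def kpi_durations(spans: Iterable[dict]) -> List[Tuple[str, float]]:
    """Return (KPI name, duration) pairs for spans that carry the KPI tag.

    Each span is assumed to be a dict with an optional "tags" mapping and
    a "duration_ms" value; this layout is hypothetical.
    """
    points = []
    for span in spans:
        kpi_name = span.get("tags", {}).get(KPI_TAG)
        if kpi_name:
            # The tag value doubles as the name/description of the KPI.
            points.append((kpi_name, span["duration_ms"]))
    return points
```

Spans without the tag are simply skipped, which is how the volume of recorded data would stay under the control of the application developer.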
Thoughts?
Regards
Gary
7 years, 8 months
Hawkular Metrics 0.23.0 - Release
by Stefan Negrea
Hello,
I am happy to announce release 0.23.0 of Hawkular Metrics. This release is
anchored by performance and stability improvements.
Here is a list of major changes:
- *Performance*
   - Prevent BusyPoolException under heavy load due to no available
     connection and the queue reaching its max size of 256 (HWKMETRICS-542
     <https://issues.jboss.org/browse/HWKMETRICS-542>)
   - Gatling load tests have a new option (loops) to specify the number
     of requests per client (HWKMETRICS-559
     <https://issues.jboss.org/browse/HWKMETRICS-559>)
- *Deployment*
   - Resolved an issue with resource-env-ref in component war
     (HWKMETRICS-541 <https://issues.jboss.org/browse/HWKMETRICS-541>)
   - Updated packaging to support deployments on WildFly 10.1.0
     (HWKMETRICS-558 <https://issues.jboss.org/browse/HWKMETRICS-558>)
- *REST API*
   - Updated CORS validation to be applied prior to processing the
     request; this solves an issue where some content is still returned even
     though a bad request status is returned (HWKMETRICS-554
     <https://issues.jboss.org/browse/HWKMETRICS-554>)
- *Internal Monitoring*
   - Hostname is now part of the metric id when creating and storing
     internal metrics (HWKMETRICS-555
     <https://issues.jboss.org/browse/HWKMETRICS-555>)
- *Hawkular Alerting - Updates*
   - Added support for newer condition types to the email plugin
     (HWKALERTS-208 <https://issues.jboss.org/browse/HWKALERTS-208>)
   - Allow ExternalCondition to be fired on Event submission; external
     conditions can now be matched via Event and Data submissions
     (HWKALERTS-207 <https://issues.jboss.org/browse/HWKALERTS-207>)
   - Added new NelsonCondition for native Nelson Rule detection; a brand
     new condition type to perform automatic Nelson Rule detection of
     misbehaving metrics (HWKALERTS-209
     <https://issues.jboss.org/browse/HWKALERTS-209>)
*Hawkular Alerting - included*
- Version 1.5.0
<https://issues.jboss.org/projects/HWKALERTS/versions/12332918>
- Project details and repository: Github
<https://github.com/hawkular/hawkular-alerts>
- Documentation: REST API
<http://www.hawkular.org/docs/rest/rest-alerts.html>, Examples
<https://github.com/hawkular/hawkular-alerts/tree/master/examples>,
Developer Guide
<http://www.hawkular.org/community/docs/developer-guide/alerts.html>
*Hawkular Metrics Clients*
- Python: https://github.com/hawkular/hawkular-client-python
- Go: https://github.com/hawkular/hawkular-client-go
- Ruby: https://github.com/hawkular/hawkular-client-ruby
- Java: https://github.com/hawkular/hawkular-client-java
*Release Links*
Github Release:
https://github.com/hawkular/hawkular-metrics/releases/tag/0.23.0
JBoss Nexus Maven artifacts:
http://origin-repository.jboss.org/nexus/content/repositories/public/org/hawkular/metrics/
Jira release tracker:
https://issues.jboss.org/projects/HWKMETRICS/versions/12332805
A big "Thank you" goes to John Sanda, Matt Wringe, Michael Burman, Joel
Takvorian, Jay Shaughnessy, Lucas Ponce, and Heiko Rupp for their project
contributions.
Thank you,
Stefan Negrea
7 years, 8 months
update your pom license headers before release
by John Mazzitelli
Just ran into this yesterday, so here is your annual reminder.
If you are responsible for releasing Maven artifacts, you should update your pom.xml files right now to bump the copyright date to 2017 in the license headers. Otherwise, the release plugin will fail in the middle and you'll have to back out what you did up until that point, then update your poms, and then try to release again.
Avoid that pain now :) and update your pom license headers now.
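If you want to see what the change amounts to, here is a small Python sketch that bumps the year in a header line. It assumes the header contains text of the form "Copyright 2015-2016 ..." or "Copyright 2016 ..."; that shape and the function name bump_copyright_year are assumptions for illustration, and your build may already have its own tooling for this.

```python
import re

# Assumed header shape: "Copyright YYYY" or "Copyright YYYY-YYYY".
_COPYRIGHT = re.compile(r"(Copyright\s+)(\d{4})(?:-(\d{4}))?")


def bump_copyright_year(text: str, year: int) -> str:
    """Extend every copyright year (or year range) in text up to `year`."""

    def repl(match):
        prefix, start, end = match.group(1), match.group(2), match.group(3)
        latest = int(end or start)
        if latest >= year:
            # Already current; leave the header untouched.
            return match.group(0)
        return f"{prefix}{start}-{year}"

    return _COPYRIGHT.sub(repl, text)
```

Applied to the contents of each pom.xml, this leaves headers that already mention the target year alone and turns a single year into a range.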
7 years, 8 months