agents
by John Mazzitelli
[sending to dev list]
If you wrote an agent/feed (even if it's something in your own little corner of the world in your GitHub repo), let me know where it is, what it is, what it can do now, and what you hope it will do in the future. Don't assume I know anything about it, because I probably don't.
I just want to get a "lay of the land" - what is out there, what people are thinking about wrt agents, what we have today that we can build on. That kind of thing.
Thanks,
John Mazz
7 years, 1 month
Ideas about CD with Maven
by Peter Palaga
Hi all interested in continuous delivery and integration,
Contrary to srcdeps, this guy [1] does not try to avoid releases and
Maven repos; instead, he just makes the release process extremely
effective. Some of his ideas could be used to improve the srcdeps plugin.
[1] https://axelfontaine.com/blog/dead-burried.html
Best,
Peter
7 years, 1 month
Reducing footprint of embedded Cassandra
by John Sanda
Earlier, Heiko asked me whether there are any changes we can make to reduce the footprint of the embedded Cassandra instance we use for development and testing. There are several configuration changes we can make that could be helpful. I did a quick check, and it looks like the Hawkular server is configured with a 512 MB heap; some of the recommended values below are based on that assumption. (A consolidated cassandra.yaml sketch follows the list.)
* disable hinted handoff
There is no need to run hinted handoff with a single node. Set the hinted_handoff_enabled property in cassandra.yaml to false.
* disable key and counter caches
This may decrease read performance but should reduce Cassandra's overall memory usage. Set the key_cache_size_in_mb and counter_cache_size_in_mb properties in cassandra.yaml to 0.
* reduce the concurrent reads/writes
This will reduce the number of threads that Cassandra uses. Set the concurrent_reads, concurrent_writes, and concurrent_counter_writes properties in cassandra.yaml to 2.
* reduce file cache size
This is off-heap memory used for pooling SSTable buffers. The default is the smaller of 512 MB and one quarter of the heap, so for a 512 MB heap it will be 128 MB. Change the file_cache_size_in_mb property to 64.
* reduce memory used for memtables
This will increase the number of flushes which should be fine for write-heavy workloads. We probably don’t want to make this too small for read-heavy scenarios. Change the memtable_heap_space_in_mb and memtable_offheap_space_in_mb properties to 64.
* reduce memory for index summaries
This defaults to a little over 25 MB. Change the index_summary_capacity_in_mb property to 13.
* reduce the native transport threads
The native transport server will use a max of 128 threads by default. Change the native_transport_max_threads property to 32.
* reduce thrift server threads
Change the rpc_min_threads property to 8 and rpc_max_threads to 512.
* tune compaction
Set concurrent_compactors to 1.
* disable gossip
No need to run gossip with a single node. Since Cassandra runs in its own deployment, the easiest way to do this is via JMX, using the stopGossiping() method on StorageServiceMBean. I need to do some testing with this to see how it works; it looks like the call has to happen after some initialization has completed (see the sketch after this list).
* disable compression
Compression is enabled by default and is configured per table. It reduces disk space but increases CPU utilization.
ALTER TABLE my_table WITH compression = { 'sstable_compression' : '' };
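Taken together, the cassandra.yaml changes above would look something like this (a minimal sketch; the values assume the 512 MB heap mentioned earlier):

    hinted_handoff_enabled: false
    key_cache_size_in_mb: 0
    counter_cache_size_in_mb: 0
    concurrent_reads: 2
    concurrent_writes: 2
    concurrent_counter_writes: 2
    file_cache_size_in_mb: 64
    memtable_heap_space_in_mb: 64
    memtable_offheap_space_in_mb: 64
    index_summary_capacity_in_mb: 13
    native_transport_max_threads: 32
    rpc_min_threads: 8
    rpc_max_threads: 512
    concurrent_compactors: 1

And for disabling gossip, a minimal JMX sketch - this assumes Cassandra's default JMX port 7199 on localhost, and per the caveat above it has to run after Cassandra has finished initializing:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class DisableGossip {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                // Invoke stopGossiping() on the StorageServiceMBean.
                mbs.invoke(new ObjectName("org.apache.cassandra.db:type=StorageService"),
                        "stopGossiping", new Object[0], new String[0]);
            }
        }
    }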
7 years, 1 month
PoC hawkular wildfly agent and prometheus working
by John Mazzitelli
I have a PoC working with the agent scraping a Prometheus endpoint and storing the metric data in Hawkular-Metrics. It came together faster than I expected :)
This can be used as a standalone feature - in other words, you don't have to run the agent collecting everything (DMR + JMX + Platform + Prometheus data), though you can if you want an "uber-agent" that collects from multiple sources. Or you can have a minimally configured agent collect just Prometheus data. The config in standalone.xml would look like this (this is the entire agent subsystem config):
<subsystem xmlns="urn:org.hawkular.agent:agent:1.0" enabled="true">
  <diagnostics enabled="false"/>
  <storage-adapter password="password" serverOutboundSocketBindingRef="hawkular" type="HAWKULAR" username="jdoe"/>
  <managed-servers>
    <remote-prometheus enabled="true"
                       name="My Prometheus Endpoint"
                       url="http://my-prometheus-endpoint:8080/metrics"
                       interval="10"
                       timeUnits="seconds" />
    <!-- you can have more <remote-prometheus> elements here to scrape multiple endpoints -->
  </managed-servers>
</subsystem>
This is just a prototype. Still lots of questions to answer like:
1) What do we want the keys to look like in H-Metrics? How do we translate Prometheus names/labels to H-Metrics? What this PoC does right now is build a huge H-Metrics key that appends the feed ID, the Prometheus metric name, and the Prometheus labels, so it looks like this: "b6fb9814-e6b5-435a-b533-411d85867720_prometheus_rule_evaluation_failures_total_rule_type=alerting_" (rule_type=alerting is a label). A sketch of this construction follows the question list.
2) Prometheus supports summaries and histograms - what do we do with those? I ignore them for now. (BTW: histograms aren't even supported in the released Prometheus Java model API; they are only supported in a local snapshot build.)
3) Do we want to create labels in H-Metrics to distinguish the Prometheus endpoint data? How?
4) No security whatsoever in here - the PoC assumes in-the-clear HTTP access. We'd need to add HTTPS, basic auth, SSL/security realms, and anything else needed to access Prometheus endpoints that are secured. I'm sure OpenShift has some things we need to work around to scrape Prometheus endpoints running in an OpenShift environment - this is something we need to figure out.
5) Notice there is no metadata in the above config. There is no integration with H-Inventory - no resource types, no resources, no metric types. Prometheus doesn't really have the notion of inventory - at best, each Prometheus endpoint could be a "resource". But this PoC does nothing like that - it is a metrics-only solution (which, as I understand it, is all we need for the OpenShift requirement anyway).
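As a concrete illustration of the key construction in (1), here is a minimal sketch; the class and method names are hypothetical, not the PoC's actual code:

    import java.util.Map;
    import java.util.TreeMap;

    public class MetricKeyExample {
        // Appends the feed ID, the Prometheus metric name, and each label as
        // name=value, all separated (and terminated) by underscores.
        static String buildKey(String feedId, String metricName, Map<String, String> labels) {
            StringBuilder key = new StringBuilder(feedId).append('_').append(metricName).append('_');
            for (Map.Entry<String, String> label : labels.entrySet()) {
                key.append(label.getKey()).append('=').append(label.getValue()).append('_');
            }
            return key.toString();
        }

        public static void main(String[] args) {
            Map<String, String> labels = new TreeMap<>(); // deterministic label order
            labels.put("rule_type", "alerting");
            // Prints the example key from question (1) above.
            System.out.println(buildKey("b6fb9814-e6b5-435a-b533-411d85867720",
                    "prometheus_rule_evaluation_failures_total", labels));
        }
    }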
And I'm sure there are plenty more questions. But the point is, we have a working example of how this can be done. Code is in this branch:
https://github.com/hawkular/hawkular-agent/tree/mazz/hwkagent-66-dynamic-...
--John Mazz
7 years, 1 month
java-based prometheus metrics scraper
by John Mazzitelli
FYI: I finished a Java-based Prometheus scraper - it seems to be working. It is just code that scrapes any remote Prometheus endpoint given a URL and lets you easily walk the metric data for further processing (for example, if you want to insert that data into Hawkular Metrics, or dump it in XML or JSON format - the CLI does this today, see below). I'll eventually blog about this, but not until I actually merge it into master. It's in my branch right now here:
https://github.com/hawkular/hawkular-agent/tree/mazz/hwkagent-66-promethe...
This also includes a CLI tool - build the scraper Maven module and you'll get an uber-jar that lets you scrape a remote Prometheus endpoint and dump the metric data in XML, JSON, or simple form. See the README.md for more.
Now I just have to figure out how to integrate this into the Hawkular WildFly Agent :-)
7 years, 1 month
hawkular-agent, prometheus, openshift
by John Mazzitelli
I'm trying to figure out how the Hawkular WildFly Agent needs to be enhanced to collect metrics from Prometheus (which is where a lot of OpenShift metrics are going).
Here is how I originally understood the problem (which may be completely wrong):
I am looking at this: https://prometheus.io/docs/querying/api/
So if OpenShift components are storing metrics in Prometheus, the agent would need to query the data via something like:
http://localhost:9090/api/v1/query?query=my_metric_name_seconds{label_one...
The agent can take the data it gets and store it in Hawkular Metrics (using a different metric name and/or labels if we want).
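For illustration, here is a minimal sketch of such a query over plain HTTP. The host, port, metric name, and label are placeholders, and the JSON response shape is taken from the Prometheus API docs linked above; a real agent would parse the result and map each sample to an H-Metrics data point:

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;
    import java.net.URLEncoder;

    public class PrometheusQueryExample {
        public static void main(String[] args) throws IOException {
            // Instant query against the Prometheus HTTP API (placeholders throughout).
            String query = URLEncoder.encode("my_metric_name_seconds{label_one=\"some_value\"}", "UTF-8");
            URL url = new URL("http://localhost:9090/api/v1/query?query=" + query);
            try (InputStream in = url.openStream()) {
                // Response JSON looks like:
                // {"status":"success","data":{"resultType":"vector","result":[...]}}
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    System.out.write(buf, 0, n);
                }
                System.out.flush();
            }
        }
    }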
I am hoping Matt W. can clarify - and if this is completely wrong, explain how he sees it working.
--John Mazz
7 years, 1 month
Hawkular Alerts 1.x Roadmap
by Jay Shaughnessy
Today, Hawkular Alerts 1.0 offers a rich and comprehensive set of features for adding alerts and events to an application:
* Robust REST API
* Complex Alerting
* Powerful Event Handling
* Pluggable Action Handlers, with many provided out of the box (e-mail,
SMS, etc.)
* Alert Lifecycle (Ack, Resolved, etc)
* Flexible Deployment, either standalone or as part of Hawkular
* Group trigger management
* Much More...
This is the proposed roadmap for upcoming Hawkular Alerts 1.x releases. The main goal of the 1.x releases will be hardening of existing features, upgrades aligned with the main Hawkular project, and fixes and enhancements identified by the community.
Currently planned or proposed:
* Upgrade to Cassandra 3 (as part of common upgrade in all Hawkular
components).
  o Review and optimize schema.
* Improvements in the Actions plugins architecture.
  o Add plugins properties validation.
  o Introduce deployed/undeployed status on actions definitions.
* Extend listeners support to clustering scenarios.
* Improve REST API documentation and add additional examples.
There is no release date established for Hawkular Alerts 1.1; releases
will be made as demanded by the Hawkular project or other community needs.
Please let us know if you have questions or suggested additions. Thanks!
Hawkular Alerts Team
Jay Shaughnessy (jshaughn@redhat.com)
Lucas Ponce (lponce@redhat.com)
7 years, 1 month
Titan 0.5.4 with Cassandra 3.3
by Peter Palaga
Hi *,
Short version: I want to release our own fixed Titan 0.5.4-jboss-1 that
works with Cassandra 3.3
Long version:
Titan is a graph database used by Hawkular Inventory. ATM, we use Titan
0.5.4 with Cassandra 2.x.
As I already noted elsewhere, Titan 0.5.4 requires Guava 15, which is
incompatible with the Guava 18 required by the newest Cassandra 3.x
driver. The conflict seems to be solved in the titan05 branch [1] in the
thinkaurelius repo. I asked both their release engineer directly and
their mailing list [2] whether they could release Titan 0.5.5, but have
gotten no reply so far. It does not look like we can hope for a release
from them soon.
Therefore, I propose the following:
* I'll fork Titan code from their 0.5.4 tag
* I'll apply the changes necessary for Titan 0.5.4 to work with
Cassandra 3.3
* I'll change the groupId from com.thinkaurelius.titan to org.hawkular.titan
* I'll release the result as Titan 0.5.4-jboss-1
This should be just a temporary solution until Inventory is migrated to
the Titan 1.0 API, which should work with the newest Cassandra.
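For consumers, the switch would then be just a groupId and version change
in the pom. For example, assuming the titan-core artifact keeps its name:

    <!-- before: com.thinkaurelius.titan:titan-core:0.5.4 -->
    <dependency>
      <groupId>org.hawkular.titan</groupId>
      <artifactId>titan-core</artifactId>
      <version>0.5.4-jboss-1</version>
    </dependency>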
Are there any concerns about this proposal?
[1] https://github.com/thinkaurelius/titan/tree/titan05
[2]
https://groups.google.com/forum/#!searchin/aureliusgraphs/0.5.5/aureliusg...
Thanks,
Peter
7 years, 1 month