Reducing footprint of embedded Cassandra
by John Sanda
Earlier, Heiko asked me whether there are any changes we can make to reduce the footprint of the embedded Cassandra instance we use for development and testing. There are several configuration changes that could help. I did a quick check, and it looks like the Hawkular server is configured with a 512 MB heap; some of the recommended values below are based on that assumption.
* disable hinted handoff
There is no need to run hinted handoff with a single node. Set the hinted_handoff_enabled property in cassandra.yaml to false.
* disable key and counter caches
This may decrease read performance but should reduce Cassandra's overall memory usage. Set the key_cache_size_in_mb and counter_cache_size_in_mb properties in cassandra.yaml to 0.
* reduce the concurrent reads/writes
This will reduce the number of threads that Cassandra uses. Set the concurrent_reads, concurrent_writes, and concurrent_counter_writes properties in cassandra.yaml to 2.
* reduce file cache size
This is off-heap memory used for pooling SSTable buffers. It defaults to a quarter of the heap, i.e. 128 MB for a 512 MB heap. Change the file_cache_size_in_mb property to 64.
* reduce memory used for memtables
This will increase the number of flushes which should be fine for write-heavy workloads. We probably don’t want to make this too small for read-heavy scenarios. Change the memtable_heap_space_in_mb and memtable_offheap_space_in_mb properties to 64.
* reduce memory for index summaries
This defaults to a little over 25 MB. Change the index_summary_capacity_in_mb property to 13.
* reduce the native transport threads
The native transport server will use a max of 128 threads by default. Change the native_transport_max_threads property to 32.
* reduce thrift server threads
Change the rpc_min_threads property to 8 and rpc_max_threads to 512.
* tune compaction
Set concurrent_compactors to 1.
* disable gossip
No need to run gossip with a single node. Since Cassandra runs in its own deployment, the easiest way to do this is via JMX using the stopGossiping() method on StorageServiceMBean. I need to do some testing with this to see how it works. It looks like the call has to happen after some initialization has completed.
* disable compression
Compression is enabled by default and is configured per table. It reduces disk space but increases CPU utilization.
ALTER TABLE my_table WITH compression = { 'sstable_compression': '' };
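Taken together, the cassandra.yaml changes above would look roughly like the following sketch (values assume the 512 MB heap mentioned earlier; gossip and compression are handled separately, via JMX and CQL respectively):

```yaml
hinted_handoff_enabled: false
key_cache_size_in_mb: 0
counter_cache_size_in_mb: 0
concurrent_reads: 2
concurrent_writes: 2
concurrent_counter_writes: 2
file_cache_size_in_mb: 64
memtable_heap_space_in_mb: 64
memtable_offheap_space_in_mb: 64
index_summary_capacity_in_mb: 13
native_transport_max_threads: 32
rpc_min_threads: 8
rpc_max_threads: 512
concurrent_compactors: 1
```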
8 years, 8 months
PoC hawkular wildfly agent and prometheus working
by John Mazzitelli
I have a PoC working with the agent scraping a Prometheus endpoint and storing the metric data in Hawkular-Metrics. It came together faster than I expected :)
This can be used as a standalone feature. In other words, you don't have to run the agent collecting everything (DMR + JMX + Platform + Prometheus data), though you can if you want an "uber-agent" that collects from multiple sources. You can instead have a minimally configured agent collect just Prometheus data. The config in standalone.xml would look like this (this is the entire agent subsystem config):
<subsystem xmlns="urn:org.hawkular.agent:agent:1.0" enabled="true">
    <diagnostics enabled="false"/>
    <storage-adapter password="password" serverOutboundSocketBindingRef="hawkular" type="HAWKULAR" username="jdoe"/>
    <managed-servers>
        <remote-prometheus enabled="true"
                           name="My Prometheus Endpoint"
                           url="http://my-prometheus-endpoint:8080/metrics"
                           interval="10"
                           timeUnits="seconds" />
        <!-- you can have more <remote-prometheus> elements here to scrape multiple endpoints -->
    </managed-servers>
</subsystem>
This is just a prototype. Still lots of questions to answer like:
1) What do we want the keys to look like in H-Metrics? How to translate Prometheus keys/labels to H-Metrics? What this PoC does right now is build a huge H-Metrics key that appends the feed ID, the Prometheus metric name and Prometheus labels so it looks like this "b6fb9814-e6b5-435a-b533-411d85867720_prometheus_rule_evaluation_failures_total_rule_type=alerting_" (rule_type=alerting is a label).
2) Prometheus supports summaries and histograms - what do we do with those? I ignore them for now. (By the way, even histograms aren't supported in the released Prometheus Java model API; they are only available in a local snapshot build.)
3) Do we want to create labels in H-Metrics to distinguish the Prometheus endpoint data? How?
4) There is no security whatsoever in here - the PoC assumes in-the-clear HTTP access. We'd need to add HTTPS, basic auth, SSL/security realms, and anything else needed to access Prometheus endpoints that are secured. I'm sure OpenShift has some things we need to work around to scrape Prometheus endpoints that are running in an OpenShift environment - this is something we need to figure out.
5) Notice there is no metadata in the above config. There is no integration with H-Inventory - no resource types, no resources, no metric types. Prometheus doesn't really have the notion of inventory - at best, each Prometheus endpoint could be a "resource". But this PoC does nothing like that - it is a metrics-only solution (which, as I understand it, is all we need for the OpenShift requirement anyway).
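For reference, the kind of key construction described in (1) can be sketched as follows. This is only illustrative - the class and method names are hypothetical, and the real PoC code lives in the branch below:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MetricKeyBuilder {

    // Builds an H-Metrics key in the "feedId_metricName_label=value_" shape
    // described above. Illustrative only; not the actual PoC code.
    public static String buildKey(String feedId, String metricName, Map<String, String> labels) {
        StringBuilder key = new StringBuilder(feedId).append('_').append(metricName);
        for (Map.Entry<String, String> label : labels.entrySet()) {
            key.append('_').append(label.getKey()).append('=').append(label.getValue()).append('_');
        }
        return key.toString();
    }

    public static void main(String[] args) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("rule_type", "alerting");
        // Reproduces the example key from the text above
        System.out.println(buildKey("b6fb9814-e6b5-435a-b533-411d85867720",
                "prometheus_rule_evaluation_failures_total", labels));
    }
}
```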
And I'm sure plenty more questions. But, point is, we have a working example of how this can be done. Code is in this branch:
https://github.com/hawkular/hawkular-agent/tree/mazz/hwkagent-66-dynamic-...
--John Mazz
8 years, 8 months
java-based prometheus metrics scraper
by John Mazzitelli
FYI: I finished a Java-based Prometheus scraper - it seems to be working. It is just code that scrapes any remote Prometheus endpoint, given a URL, and lets you easily walk the metric data for further processing (for example, if you want to insert that data into Hawkular Metrics, or dump the data in XML or JSON format - the CLI does this today, see below). I'll eventually blog about this, but not until I actually merge it into master. It's in my branch right now here:
https://github.com/hawkular/hawkular-agent/tree/mazz/hwkagent-66-promethe...
This also includes a CLI tool - build the scraper Maven module and you'll get an uber-jar that lets you scrape a remote Prometheus endpoint and dump the metric data in XML, JSON, or simple form. See the README.md for more.
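For a flavor of what the scraping involves, here is a minimal, illustrative sketch of parsing the Prometheus text exposition format ("name{labels} value" lines, with # HELP / # TYPE comments skipped). The actual scraper in the branch handles much more, so treat this purely as a sketch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PromTextParser {

    // Matches: metric name, optional {labels}, then the sample value
    private static final Pattern LINE =
            Pattern.compile("^([a-zA-Z_:][a-zA-Z0-9_:]*)(\\{[^}]*\\})?\\s+(\\S+)");

    // Returns [name, labels-or-null, value] triples for each sample line
    public static List<String[]> parse(String body) {
        List<String[]> samples = new ArrayList<>();
        for (String line : body.split("\n")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip comments and blank lines
            }
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                samples.add(new String[] { m.group(1), m.group(2), m.group(3) });
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        String body = "# HELP http_requests_total Total requests.\n"
                + "# TYPE http_requests_total counter\n"
                + "http_requests_total{method=\"get\"} 1027\n";
        for (String[] s : parse(body)) {
            System.out.println(s[0] + " " + s[1] + " = " + s[2]);
        }
    }
}
```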
Now I just have to figure out how to integrate this into the Hawkular WildFly Agent :-)
8 years, 8 months
hawkular-agent, prometheus, openshift
by John Mazzitelli
I'm trying to figure out how the Hawkular WildFly Agent needs to be enhanced to collect metrics from Prometheus (which is where a lot of OpenShift metrics are going).
Here is how I originally understood the problem (which may be completely wrong):
I am looking at this: https://prometheus.io/docs/querying/api/
So if OpenShift components are storing metrics in Prometheus, the agent would need to query the data via something like:
http://localhost:9090/api/v1/query?query=my_metric_name_seconds{label_one...
The agent can take the data it gets and store it in Hawkular Metrics (using a different metric name and/or labels if we want).
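Under that reading, building such an instant-query URL might look like this minimal sketch (the host, port, and metric selector are just examples, and the helper names are hypothetical):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PromQueryUrl {

    // Builds a Prometheus instant-query URL; the PromQL expression is
    // URL-encoded so braces, quotes, etc. survive as a query parameter.
    public static String instantQuery(String base, String promql) {
        try {
            return base + "/api/v1/query?query=" + URLEncoder.encode(promql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(instantQuery("http://localhost:9090",
                "my_metric_name_seconds{label_one=\"foo\"}"));
    }
}
```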
I am hoping Matt W. can clarify; if this is completely wrong, how does he see it working?
--John Mazz
8 years, 8 months
Hawkular Alerts 1.x Roadmap
by Jay Shaughnessy
Today, Hawkular Alerts 1.0 offers a rich and comprehensive set of features for adding alerts and events to an application:
* Robust REST API
* Complex Alerting
* Powerful Event Handling
* Pluggable Action Handlers with many provided out-of-box (e-mail,
SMS, etc)
* Alert Lifecycle (Ack, Resolved, etc)
* Flexible Deployment, either standalone or as part of Hawkular
* Group trigger management
* Much More...
This is the proposed roadmap for upcoming Hawkular Alerts 1.x releases. The main goal of the 1.x releases will be hardening of existing features, upgrades aligned with the main Hawkular project, and fixes and enhancements identified by the community.
Currently planned or proposed:
* Upgrade to Cassandra 3 (as part of a common upgrade in all Hawkular
  components).
  o Review and optimize the schema.
* Improvements in the Actions plugins architecture.
  o Add plugin properties validation.
  o Introduce deployed/undeployed status on action definitions.
* Extend listener support to clustering scenarios.
* Improve REST API documentation and add additional examples.
There is no release date established for Hawkular Alerts 1.1; releases
will be made as demanded by the Hawkular project or other community needs.
Please let us know if you have questions or suggested additions. Thanks!
Hawkular Alerts Team
Jay Shaughnessy (jshaughn(a)redhat.com)
Lucas Ponce (lponce(a)redhat.com)
8 years, 8 months
Titan 0.5.4 with Cassandra 3.3
by Peter Palaga
Hi *,
Short version: I want to release our own fixed Titan 0.5.4-jboss-1 that
works with Cassandra 3.3
Long version:
Titan is a graph database used by Hawkular Inventory. ATM, we use Titan
0.5.4 with Cassandra 2.x.
As I already noted elsewhere, Titan 0.5.4 requires Guava 15, which is
incompatible with the Guava 18 required by the newest Cassandra 3.x
driver. The conflict seems to be solved in the titan05 branch [1] in the
thinkaurelius repo. I asked both their release engineer directly and
their mailing list [2] whether they could release Titan 0.5.5, but have
received no reply so far. It does not look like we can hope for a
release from them soon.
Therefore, I propose the following:
* I'll fork Titan code from their 0.5.4 tag
* I'll apply the changes necessary for Titan 0.5.4 to work with
Cassandra 3.3
* I'll change the groupId from com.thinkaurelius.titan to org.hawkular.titan
* I'll release the result as Titan 0.5.4-jboss-1
This should be just a temporary solution till Inventory is migrated to
Titan 1.0 API that should work with the newest Cassandra.
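Consumers such as Inventory would then switch their dependency to the relocated coordinates, roughly like this (the artifactId is assumed to stay the same as upstream's titan-core):

```xml
<dependency>
    <groupId>org.hawkular.titan</groupId>
    <artifactId>titan-core</artifactId>
    <version>0.5.4-jboss-1</version>
</dependency>
```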
Are there any concerns about this proposal?
[1] https://github.com/thinkaurelius/titan/tree/titan05
[2]
https://groups.google.com/forum/#!searchin/aureliusgraphs/0.5.5/aureliusg...
Thanks,
Peter
8 years, 8 months
Hawkular Metrics - /data endpoint
by Stefan Negrea
Hello Everybody,
I submitted today a PR for a long standing JIRA:
https://issues.jboss.org/browse/HWKMETRICS-24 ; here is a related one:
https://issues.jboss.org/browse/HWKMETRICS-57 , and the PR:
https://github.com/hawkular/hawkular-metrics/pull/473
The JIRAs recommend splitting the '*/data' endpoint into two separate
endpoints, '/raw' and '/stats'. There are two issues with the current
'*/data'. First, the code was really hard to maintain because it was
serving two purposes: depending on the input parameters, the endpoint
would serve raw ingested data in some cases and bucketed results in
others. The PR still has the old code (since it only got deprecated in
this release) and it's not pretty. The second problem was the actual API
interface. There was no simple way to know exactly what you would get
back from the endpoint, because sometimes it returned simple data points
and sometimes bucketed data, depending on the query parameters specified
in the request. To make things worse, some parameters could not be mixed
- for example, the user could not limit or order bucketed results - but
the documentation had to cover all of this under the same endpoint.
The plan is to deprecate existing '*/data' endpoints in the upcoming
release (0.15.0) and remove them in the release after that (0.16.0). That
gives enough time (roughly 2 months) for all existing projects to migrate
to the newer API. Does anybody see any problem with this timeline?
I expect the transition to be simple because most consumers were using
'*/data' with the intent of getting bucketed results, which is now
'*/stats'. So in most cases it is just a simple change.
Are there any projects negatively affected by this change in the long-term?
Does the change make sense? Is the API interface for retrieving data easier
to understand after the change?
Thank you,
Stefan Negrea
Software Engineer
8 years, 8 months
A better mechanism to import events from Hawkular to MiQ?
by Jay Shaughnessy
Jkremser, and anyone,
So I've now got some code working that can move Hawkular events (events
stored via Hawkular Alerts) into MiQ as MiQ Events. As a note, not every
Hawkular event will become a MiQ event, because MiQ requires that
supported event types be predefined. This mail is more about the
mechanism used to move the events...
I've started with a primitive mechanism: a looping REST GET with an
advancing time window, using the Hawkular Alerts REST API via the
hawkular-ruby-client. This has a variety of potential issues, for example:
* Any Hawkular events generated when MiQ isn't running will likely be
missed.
* If the timestamps reported on the events are behind the MiQ polling
  window, they will be missed (late arrival, possibly a Hawkular
  server time a bit behind MiQ server time).
* Potentially excessive polling if the number of events is not large.
Certainly some of these issues could be softened with a little more
provider-side smarts, like querying further into the past and protecting
against duplicate event storage, etc. But I'm wondering what thoughts
people may have on a better mechanism. I know other providers in MiQ
use a variety of techniques to import data, from polling, to blocking
HTTP requests, to queue listeners. I should mention that the general
approach of an MiQ provider is to provide an "Event Catcher", which runs
in a handler process for each provider instance. The catcher is
basically told by MiQ to go get events, and then queues them for MiQ
consumption. Let me know what you think. Also, if anyone would like
to see a short demo of what I have right now, I'd be happy to run a short
meeting.
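The provider-side softening mentioned above (overlap each poll window with the previous one and drop events already seen by ID) can be sketched as follows. All names here are hypothetical, this is not the actual provider code, and it is in Java rather than the provider's Ruby:

```java
import java.util.HashSet;
import java.util.Set;

public class EventWindowDeduper {

    // IDs of events already forwarded, so an overlapping poll window
    // (used to catch late arrivals) does not queue duplicates.
    private final Set<String> seen = new HashSet<>();

    /** Returns true if the event is new and should be queued for MiQ. */
    public boolean accept(String eventId) {
        return seen.add(eventId); // add() returns false if already present
    }

    public static void main(String[] args) {
        EventWindowDeduper dedupe = new EventWindowDeduper();
        // First window returns e1 and e2; the overlapping second window
        // returns e2 again plus a late-arriving e3.
        System.out.println(dedupe.accept("e1")); // true
        System.out.println(dedupe.accept("e2")); // true
        System.out.println(dedupe.accept("e2")); // false (duplicate)
        System.out.println(dedupe.accept("e3")); // true
    }
}
```

In a real provider, the seen-set would also need trimming (e.g. dropping IDs older than the overlap window) to avoid unbounded growth.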
8 years, 8 months
Event canceled: Hawkular-Team Update - Tue Apr 5, 2016 3:30PM - 4PM (hrupp@redhat.com)
by hrupp@redhat.com
This event has been canceled and removed from your calendar.
Title: Hawkular-Team Update
This is an all-Hawkular team meeting to give updates on where we are, and so on.
This is *open to the public*.
Location:
on bluejeans: https://redhat.bluejeans.com/hrupp/
or alternatively teleconference Reservationless+ , passcode 204 2160 481
You can find Dial-In telephone numbers here:
https://www.intercallonline.com/listNumbersByCode.action?confCode=2042160481
Red Hat internal short-dial numbers are 16666 and 15555 (and probably
others, depending on your location)
When: Tue Apr 5, 2016 3:30PM - 4PM Berlin
Where: pc 204 2160 481
Calendar: hrupp(a)redhat.com
Who
* Heiko Rupp - Organizer
* Mike Thompson
* Gabriel Cardoso
* gbrown(a)redhat.com
* Jiri Kremser
* Thomas Segismont
* miburman(a)redhat.com
* theute(a)redhat.com
* Simeon Pinder
* Peter Palaga
* lkrejci(a)redhat.com
* snegrea(a)redhat.com
* John Mazzitelli
* hawkular-dev(a)lists.jboss.org
* jcosta(a)redhat.com
* John Sanda
* amendonc(a)redhat.com
* Jay Shaughnessy
* Lucas Ponce
* mwringe(a)redhat.com
8 years, 8 months