Hawkular Alerting 1.1.3.Final has been released!
by Jay Shaughnessy
Hawkular Alerting 1.1.3.Final has been released!
* [HWKALERTS-155] - Failed schema creation because of OperationTimedOutException
  o A useful bug fix
* [HWKALERTS-156] - Propagate type from metrics to alerting
  o This is part of the overall hawkular-1102 solution: the alerting
    side of improved filtering between metrics and alerting
* [HWKALERTS-157] - Sequential processing of ConditionEval could create false alerting
  o *Important* bug fix for clients using multi-condition triggers!
For more details on this release:
https://issues.jboss.org/projects/HWKALERTS/versions/12331263
Hawkular Alerting Team
Jay Shaughnessy (jshaughn(a)redhat.com)
Lucas Ponce (lponce(a)redhat.com)
Re: [Hawkular-dev] Hawkular Services 0.0.12.Final released
by Heiko W.Rupp
And now the same for 0.12 with updates to alerts and metrics.
Grab it while it is hot :)
https://github.com/hawkular/hawkular-services/releases/tag/0.0.12.Final
On 6 Sep 2016, at 11:44, Juraci Paixão Kröhling wrote:
> Team,
>
> Hawkular Services 0.0.11.Final has (finally) been released.
>
> As with the previous distributions, the Agent has to be configured with a
> user. This can be accomplished by:
>
> - Adding a user via bin/add-user.sh like:
> ./bin/add-user.sh \
> -a \
> -u <theusername> \
> -p <thepassword> \
> -g read-write,read-only
>
> - Changing the Agent's credentials in standalone.xml to the credentials
> from the previous step, or passing hawkular.rest.user /
> hawkular.rest.password as system properties
> (-Dhawkular.rest.user=jdoe)
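>
> For illustration, a minimal sketch of a server start with both properties
> set (the username/password values here are placeholders):
>
> ./bin/standalone.sh \
> -Dhawkular.rest.user=jdoe \
> -Dhawkular.rest.password=secret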
>
> You can find the release packages, sources and checksums at GitHub, in
> addition to Maven:
>
> https://github.com/hawkular/hawkular-services/releases/tag/0.0.11.Final
>
> Shortcuts for the downloads:
> Zip - https://git.io/viGKO
> tar.gz - https://git.io/viGKL
>
> - Juca.
> _______________________________________________
> hawkular-dev mailing list
> hawkular-dev(a)lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hawkular-dev
>
--
Reg. Adresse: Red Hat GmbH, Technopark II, Haus C,
Werner-von-Siemens-Ring 14, D-85630 Grasbrunn
Handelsregister: Amtsgericht München HRB 153243
Geschäftsführer: Charles Cachera, Michael Cunningham, Michael O'Neill,
Eric Shander
Hawkular Metrics 0.19.0 - Release
by Stefan Negrea
Hello Everybody,
I am happy to announce release 0.19.0 of Hawkular Metrics. This release is
anchored by performance improvements and a number of REST API enhancements.
Here is a list of major changes:
1. *REST API - Query Improvements*
- It is now possible to use relative timestamps when querying for data
via `+` or `-` added to timestamp parameters. When used, the
timestamp is relative to the local system timestamp; see the curl
sketches after this list (HWKMETRICS-358
<https://issues.jboss.org/browse/HWKMETRICS-358>, HWKMETRICS-457
<https://issues.jboss.org/browse/HWKMETRICS-457>)
- Querying for data from the earliest available data point has been
expanded to raw data queries for gauges and counters (HWKMETRICS-435
<https://issues.jboss.org/browse/HWKMETRICS-435>)
- `DELETE tenants/{id}` has been added to allow the deletion of an
entire tenant (HWKMETRICS-446
<https://issues.jboss.org/browse/HWKMETRICS-446>)
- AvailabilityType is now serialized as a simple string in bucket data
points (HWKMETRICS-436
<https://issues.jboss.org/browse/HWKMETRICS-436>)
2. *Performance Enhancements*
- Write performance has been increased by 10% across the board after
reorganizing the internal metadata indexes (HWKMETRICS-422
<https://issues.jboss.org/browse/HWKMETRICS-422>)
- GZIP replaced LZ4 as the compression algorithm for the data table.
This yields anywhere from 60% to 70% disk usage savings over LZ4
(HWKMETRICS-454 <https://issues.jboss.org/browse/HWKMETRICS-454>)
3. *Job Scheduler - Improvements*
- The newly introduced internal job scheduler received several
improvements and fixes; the main focus was on enhancing scalability and
fault tolerance.
- The job scheduler will be used only for internal tasks.
- For more details: HWKMETRICS-221
<https://issues.jboss.org/browse/HWKMETRICS-221>, HWKMETRICS-451
<https://issues.jboss.org/browse/HWKMETRICS-451>, HWKMETRICS-453
<https://issues.jboss.org/browse/HWKMETRICS-453>, HWKMETRICS-94
<https://issues.jboss.org/browse/HWKMETRICS-94>
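To make the query changes concrete, here are a couple of hedged curl sketches
against a local server (host, port, tenant and metric names, and the exact
relative-duration token are assumptions; the linked JIRAs have the
authoritative syntax):

# delete an entire tenant (HWKMETRICS-446)
curl -X DELETE http://localhost:8080/hawkular/metrics/tenants/my-tenant

# raw gauge query with a relative start timestamp (HWKMETRICS-358/457)
curl -H "Hawkular-Tenant: my-tenant" \
  "http://localhost:8080/hawkular/metrics/gauges/my-gauge/raw?start=-8h"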
*Hawkular Metrics Clients*
- Python: https://github.com/hawkular/hawkular-client-python
- Go: https://github.com/hawkular/hawkular-client-go
- Ruby: https://github.com/hawkular/hawkular-client-ruby
- Java: https://github.com/hawkular/hawkular-client-java
*Release Links*
Github Release:
https://github.com/hawkular/hawkular-metrics/releases/tag/0.19.0
JBoss Nexus Maven artifacts:
http://origin-repository.jboss.org/nexus/content/repositories/public/org/...
Jira release tracker:
https://issues.jboss.org/browse/HWKMETRICS/fixforversion/12331192
A big "Thank you" goes to John Sanda, Thomas Segismont, Mike Thompson, Matt
Wringe, Michael Burman, Joel Takvorian, and Heiko Rupp for their project
contributions.
Thank you,
Stefan Negrea
managing cassandra cluster
by John Sanda
To date we haven’t really done anything by way of managing/monitoring the Cassandra cluster. We need to monitor Cassandra in order to know things like:
* When additional nodes are needed
* When disk space is low
* When I/O is too slow
* When more heap space is needed
Cassandra exposes a lot of metrics. I created HWKMETRICS-448. It briefly talks about collecting metrics from Cassandra. In terms of managing the cluster, I will provide a few concrete examples that have come up recently in OpenShift.
Scenario 1: User deploys additional node(s) to reduce the load on cluster
After the new node has bootstrapped and is running, we need to run nodetool cleanup on each node (or run it via JMX) in order to remove keys/data that each node no longer owns; otherwise, disk space won't be freed up. The cleanup operation can potentially be resource intensive as it triggers compactions. Given this, we probably want to run it one node at a time. Right now the user is left to do this manually.
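For illustration, a minimal sketch of that manual step (the keyspace name is
assumed to be hawkular_metrics):

# on each pre-existing node, one node at a time, after the new node has
# finished bootstrapping
nodetool cleanup hawkular_metrics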
Scenario 2: User deploys additional node(s) to get replication and fault tolerance
I connect to Cassandra directly via cqlsh and update replication_factor. I then need to run repair on each node, which can be tricky because 1) it is resource intensive, 2) it can take a long time, 3) it is prone to failure, and 4) Cassandra does not give progress indicators.
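For reference, a hedged sketch of those manual steps (keyspace name,
replication strategy, and factor are illustrative):

-- in cqlsh: raise the replication factor
ALTER KEYSPACE hawkular_metrics
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};

# then, on each node in turn
nodetool repair -pr hawkular_metrics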
Scenario 3: User sets up regular, scheduled repair to ensure data is consistent across the cluster
Once replication_factor > 1, repair needs to be run on a regular basis. More specifically, it should be run within gc_grace_seconds, which is configured per table and defaults to 10 days. The data table in metrics already has gc_grace_seconds reduced to 1 day, and we could probably reduce it to zero since the table is append-only. The value for gc_grace_seconds might vary per table based on access patterns, which means the frequency of repair should vary as well.
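For illustration (keyspace/table names and values are only examples), the
per-table knob and the kind of recurring job this implies:

-- in cqlsh: gc_grace_seconds is a per-table setting
ALTER TABLE hawkular_metrics.data WITH gc_grace_seconds = 86400;

# repair then has to run at least once inside each gc_grace_seconds window,
# e.g. daily from cron on each node for a 1 day setting:
0 3 * * * nodetool repair -pr hawkular_metrics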
There has already been some discussion of these things for Hawkular Metrics in the context of OpenShift. It applies to all of Hawkular Services as well. Initially I was thinking about building some management components directly in metrics, but it probably makes more sense as a separate, shared component (or components) that can be reused in both standalone metrics in OpenShift and a full Hawkular Services deployment in MiQ, for example.
We are already running into these scenarios in OpenShift and probably need to start putting something in place sooner rather than later.
[Inventory] Performance of Tinkerpop3 backends
by Lukas Krejci
Hi all,
to move inventory forward, we need to port it to Tinkerpop3 - a new(ish) and
actively maintained version of the Tinkerpop graph API.
Apart from the huge improvement in the API expressiveness and capabilities,
the important thing is that it comes with a variety of backends, 2 of which
are of particular interest to us ATM: the Titan backend (with Titan in version
1.0) and the SQL backend (using the Sqlg library).
The SQL backend is a much improved (yet still unfinished in terms of
optimizations and some corner case features) version of the toy SQL backend
for Tinkerpop2.
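For context, a minimal sketch of the kind of backend-agnostic Tinkerpop3
(Gremlin) code such a port involves - the labels, properties, and the
in-memory TinkerGraph backend here are illustrative, not inventory's actual
schema or storage; Titan 1.0 or Sqlg would just supply a different Graph
instance:

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.T;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph;

public class TraversalSketch {
    public static void main(String[] args) {
        // reference in-memory backend; the traversal code below is the same
        // regardless of which Graph implementation is opened
        Graph graph = TinkerGraph.open();

        Vertex feed = graph.addVertex(T.label, "feed", "id", "feed-1");
        Vertex resource = graph.addVertex(T.label, "resource", "id", "res-1");
        feed.addEdge("contains", resource);

        GraphTraversalSource g = graph.traversal();
        long count = g.V().hasLabel("feed").has("id", "feed-1")
                .out("contains").hasLabel("resource")
                .count().next();
        System.out.println("resources under feed-1: " + count);
    }
}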
Back in March I ran performance comparisons for SQL/postgres and Titan (0.5.4)
on Tinkerpop2 and concluded that Titan was the best choice then.
After completing a simplistic port of inventory to Tinkerpop3 (not taking
advantage of any new features or opportunities to simplify the inventory
codebase), I've run the performance tests again for the 2 new backends - Titan
1.0 and Sqlg (on postgres).
This time the results are not as clear-cut as last time.
From the charts [1] you can see that Postgres is actually quite a bit faster
on reads and can better handle concurrent read access, while Titan shines in
writes (arguably thanks to Cassandra as its storage).
Of course, I can imagine that the read performance advantage of Postgres would
decrease with the growing amount of data stored (the tests ran with the
inventory size of ~10k entities) but I am quite positive we'd get competitive
read performance from both solutions up to the sizes of inventory we
anticipate (100k-1M entities).
Now the question is whether the insert performance in Postgres is something we
should be too worried about. IMHO, there should be some room for
improvement in Sqlg, and our move to /sync for agent synchronization would
also make this less of a problem (because there would not be that many initial
imports creating vast amounts of entities).
Nevertheless, I currently cannot say which one is the "winner" here. Each
backend has its pros and cons:
Titan:
Pros:
- high write throughput
- backed by cassandra
Cons:
- slower reads
- project virtually dead
- complex codebase (self-made fixes unlikely)
Sqlg:
Pros:
- small codebase
- everybody knows SQL
- faster reads
- faster concurrent reads
Cons:
- slow writes
- another backend needed (Postgres)
Therefore my intention here is to go forward with a "proper" port to
Tinkerpop3 with Titan still enabled but focus primarily on Sqlg to see if we
can do anything with the write performance.
IMHO, any choice we make is "workable" as it is even today, but we need to
weigh the productization requirements. For those, Sqlg, with its small
dependency footprint and Postgres backend, seems preferable to the huge
dependency mess of Titan.
[1] https://dashboards.ly/ua-TtqrpCXcQ3fnjezP5phKhc
--
Lukas Krejci
Assertj framework
by Joel Takvorian
Hello,
I'd like to propose the addition of the "assertj" lib in test scope for
hawkular-metrics (and/or other modules).
If you don't know it, assertj is basically a "fluent assertion framework".
You write assertions like:
assertThat(result).extracting(DataPoint::getTimestamp).containsExactly(0L, 1L, 5L, 7L, 10L);
It has a very rich collection of assertions; I find it particularly
powerful when working on collections, whether they are sorted or not and
whether they contain nested objects or not.
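To make that concrete, here is a self-contained sketch of the kind of test
this enables (JUnit 4 and assertj-core are assumed on the test classpath; the
DataPoint class below is a simplified stand-in, not the real hawkular-metrics
type):

import static org.assertj.core.api.Assertions.assertThat;

import java.util.Arrays;
import java.util.List;
import org.junit.Test;

public class DataPointAssertJTest {

    // simplified stand-in for the real DataPoint type
    static class DataPoint {
        final long timestamp;
        final double value;
        DataPoint(long timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }
        long getTimestamp() { return timestamp; }
        double getValue() { return value; }
    }

    @Test
    public void extractsAndComparesCollections() {
        List<DataPoint> result = Arrays.asList(
                new DataPoint(0L, 1.0), new DataPoint(1L, 2.0),
                new DataPoint(5L, 3.0), new DataPoint(7L, 4.0),
                new DataPoint(10L, 5.0));

        // order-sensitive check on a projected field
        assertThat(result).extracting(DataPoint::getTimestamp)
                .containsExactly(0L, 1L, 5L, 7L, 10L);

        // order-insensitive check, handy for unsorted collections
        assertThat(result).extracting(DataPoint::getValue)
                .containsExactlyInAnyOrder(5.0, 4.0, 3.0, 2.0, 1.0);
    }
}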
Also, the fact that it's "fluent" helps a lot when you write a test,
because your IDE's auto-completion helps you find the assertion you're
looking for - something that is not so easy with Hamcrest.
I see it as a kind of virtuous cycle: tests are easier to write, easier to
read, so you write more tests / better tests.
Would you be ok to give it a try?
If you want to read about it, the official site is here:
http://joel-costigliola.github.io/assertj/
or an article that promotes it nicely:
https://www.infoq.com/articles/custom-assertions
or an example of a file I'd like to push, if you're ok with assertj :) :
https://github.com/jotak/hawkular-metrics/blob/72e2f95f7e19c9433ce44ee83d...
Joel