Hawkular Metrics 0.16.0 - Release
by Stefan Negrea
Hello Everybody,
I am happy to announce release 0.16.0 of Hawkular Metrics. This release is
anchored by overall enhancements in the API and updates to the String
metric type.
Here is a list of major changes:
1. String Metric Type - Enhancements
- The /strings endpoint was enhanced with querying capabilities similar
to other metric types, including tag-related features (HWKMETRICS-402
<https://issues.jboss.org/browse/HWKMETRICS-402>)
- The endpoints under /strings are still experimental, so changes that
break backwards compatibility could be introduced in future releases.
The experimental tag allows time for feedback to better determine what
the API should be.
2. */stats & */raw Replace */data - second deprecation warning!
- */data has been deprecated and its functionality split into two
single-purpose endpoints; this applies to all metric types (gauge,
counter, availability, and string)
- */stats endpoints return bucketed, statistical, or query-time
aggregated data
- */raw endpoints accept and return raw data for a metric
- Please update your code to use the new endpoints and follow the
release notes for more details regarding removal.
- For more details: HWKMETRICS-24
<https://issues.jboss.org/browse/HWKMETRICS-24>, HWKMETRICS-57
<https://issues.jboss.org/browse/HWKMETRICS-57>
3. REST API Updates
- Empty buckets are now reported with default values and no samples (
HWKMETRICS-345 <https://issues.jboss.org/browse/HWKMETRICS-345>)
- Rate of change stats can now be retrieved for Gauge metrics. This
feature was previously exclusive to Counter metrics but was expanded to
Gauge metrics. The rates are computed at query time based on stored data (
HWKMETRICS-365 <https://issues.jboss.org/browse/HWKMETRICS-365>)
- Min and max timestamps of stored datapoints are now returned when
querying for metric definitions (HWKMETRICS-383
<https://issues.jboss.org/browse/HWKMETRICS-383>)
- The endpoint for fetching rates now supports standard query
parameters and sort behavior (HWKMETRICS-390
<https://issues.jboss.org/browse/HWKMETRICS-390>)
4. Hawkular Metrics - Hawkular Services distribution
- The Hawkular Metrics distribution built for inclusion in Hawkular
Services is now independent of Hawkular Accounts.
- Only for this distribution, authentication is done at the container
level and the tenant id header becomes required.
- For more details: HWKMETRICS-399
<https://issues.jboss.org/browse/HWKMETRICS-399>, HWKMETRICS-401
<https://issues.jboss.org/browse/HWKMETRICS-401>
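To illustrate the */data split described in item 2, here is a minimal sketch of rewriting a deprecated */data URL to its */raw or */stats successor. The exact path prefix is a placeholder; the suffix convention is taken from this announcement, not from the full API docs.

```python
def migrate_data_url(old_url: str, aggregated: bool = False) -> str:
    """Rewrite a deprecated */data URL to the */raw or */stats endpoint.

    Use aggregated=True for bucketed/statistical queries (*/stats),
    False for raw data points (*/raw).
    """
    if not old_url.endswith("/data"):
        raise ValueError("not a deprecated */data URL: " + old_url)
    suffix = "stats" if aggregated else "raw"
    return old_url[: -len("data")] + suffix

# e.g. /hawkular/metrics/gauges/cpu/data -> /hawkular/metrics/gauges/cpu/raw
print(migrate_data_url("/hawkular/metrics/gauges/cpu/data"))
```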
Hawkular Metrics Clients
- Python: https://github.com/hawkular/hawkular-client-python
- Go: https://github.com/hawkular/hawkular-client-go
- Ruby: https://github.com/hawkular/hawkular-client-ruby
- Java: https://github.com/hawkular/hawkular-client-java
Release Links
Github Release:
https://github.com/hawkular/hawkular-metrics/releases/tag/0.16.0
JBoss Nexus Maven artifacts:
http://origin-repository.jboss.org/nexus/content/repositories/public/org/...
Jira release tracker:
https://issues.jboss.org/browse/HWKMETRICS/fixforversion/12330316
A big "Thank you" goes to John Sanda, Thomas Segismont, Matt Wringe, Mike
Thompson, Michael Burman, and Heiko Rupp for their project contributions.
Thank you,
Stefan Negrea
Software Engineer
SSL by default
by Juraci Paixão Kröhling
Team,
I just sent a PR for hawkular-services [1] that adds SSL support by
default to the distribution.
I'd like you to take a moment and run a couple of simple tests of your
component against this distribution, especially if you perform REST calls
to a component endpoint.
Apart from the Agent, I'm not aware of any REST calls made by individual
components, but I need to know about any problems this change might cause.
My next step is to change the agent to accept the certs from our keystore.
A few comments:
- The HTTP port does not redirect to HTTPS yet. This might require
changes to the individual components' web.xml, which I'll be adding soon.
- The certificate inside the keystore is a self-signed one. Should we
ship it on the main distribution, with instructions telling users to
replace our certificate with a real one? Or should we *not* ship it?
Related question: are we even allowed to ship such keystores?
- As mentioned in the previous point, the cert is self-signed. So, you
might need to add "-k" to curl to bypass the cert verification.
- Authentication with client cert is not yet available.
1 - https://github.com/hawkular/hawkular-services/pull/2
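On the "-k" point above: clients other than curl also need to skip verification while the cert is self-signed. A minimal stdlib Python sketch (for testing only; the endpoint is a placeholder):

```python
import ssl

def insecure_context() -> ssl.SSLContext:
    """TLS context that skips certificate verification -- testing only!

    This is the equivalent of curl's "-k"; never use it in production.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # a self-signed cert won't match the host
    ctx.verify_mode = ssl.CERT_NONE  # do not validate the certificate chain
    return ctx

# usage (placeholder URL):
# urllib.request.urlopen("https://localhost:8443/hawkular/status",
#                        context=insecure_context())
```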
- Juca.
Reform Inventory REST API
by Lukas Krejci
Hi everyone,
tl;dr
Inventory's REST API is ambiguous and doesn't reflect the generic structure of
the inventory well. Let's change that before it's too late!
The access patterns in inventory's REST API predate the concept of the
canonical path that we now use extensively throughout the model to uniquely
identify entities, and because of that we're running into various issues
with the REST API, from slight inconveniences to outright breakage due to
ambiguous URLs.
Here I propose a reformed REST API centered around the canonical paths. It
should have the same expressive power as the original REST API but should not
suffer from the ambiguities and should be much more cohesive and "logical".
The only operation that was possible with one call in the old API but will
require two calls in the new API is the disassociation of two entities
(e.g. disassociating a metric from a resource). These operations are IMHO
rather rare, so I am not too worried about this.
I'd like to know your opinions on the new API:
URIs below are defined in EBNF,
CP stands for canonical path of some entity,
REL stands for a name of some relationship (pre-defined or user defined)
== sync endpoint
URI = "/", "sync", "/", CP;
This is the same as it is currently. There is a parallel thread from last
week about the evolution of sync that will be addressed, too.
== bulk endpoint
URI = "/", "bulk";
might go away - we have sync doing almost the same thing
== GET URLs
This is a little complex, but what it does is enhance the canonical path
as currently known with the ability to define filters at each path
progression step.
Basically, this is an attempt to express a graph traversal using a URL.
ANY = ? just a URI path-escaped string representing an entity ID or
relationship name ? ;
FILTER_NAME = "type" | "id" | "name" | "cp" | "identical" | "propertyName" |
"propertyValue" ;
FILTER = FILTER_NAME , [ "=", ANY ] ; (* value is not required for the
"identical" filter *)
PATH_SEGMENT_TYPE = "t" | "e" | "mp" | "f" | "rt" | "mt" | "ot" | "r" | "m" |
"d" ;
PATH_SEGMENT_ID = ANY ;
PATH_STEP = "/", PATH_SEGMENT_TYPE, ";", PATH_SEGMENT_ID, { ";", FILTER } ;
WELL_KNOWN_REL = "contains" | "defines" | "incorporates" | "isParentOf" ;
ANY_REL = ANY ;
DIR_FILTER = ";", ( "in" | "out" | "both" ) ;
REL_FILTER = ( "propertyName" | "propertyValue"), "=", ANY ;
REL_STEP = "/rl;", ( WELL_KNOWN_REL | ANY_REL ), [ DIR_FILTER ], { REL_FILTER
} ;
FILTER_STEP = FILTER, { ";", FILTER } ;
RETURN_TYPE = "" | "treeHash" ;
PATH_END = ( PATH_STEP | FILTER_STEP ), RETURN_TYPE ;
URI = { ( PATH_STEP | FILTER_STEP ), [ REL_STEP ] }, [ PATH_END ];
The "identical" bit is currently not present in the REST API but is in the
Java API. What it'd do here is that it would "widen" the start of the query
from the one entity specified by the CP to all entities that are identical to
it according to the identity hash rules (same id, same significant structure).
This is useful for scenarios like "querying all EAPs". The way this would
work is that you'd have the resource type that you expect defined at a
global level, possibly contained in a metadata pack. You'd then look for resources
that are defined by the resource types identical to yours. Because feeds are
free to (re-)define their resource types, this would match resources from
feeds that have types identical to the global one. Note that there is no
special relationship needed between the types - inventory figures this out
automagically. This way we loosen the requirement for synchronizing the
updates to the types defined by feeds and the user at the cost of "eventually
consistent behavior" once the parties upgrade at their own pace.
=== Examples
==== Return a tenant
/
==== Access Entity By Canonical Path
/t;tenant_id/f;feed_id/rt;resourceType_id
This is actually equivalent to:
/t;tenant_id/rl;contains;out/f;feed_id/rl;contains;out/rt;resourceType_id
which is no longer a canonical path but showcases how we declare "hops" over
specific relationships. "rl;contains;out" translates to "relationship with
name contains in the outgoing direction" and is implicit, if no other "hop"
between entities is specified.
To return the tree hash of the entity instead of the entity itself, one can:
/f;feed_id/r;resource/treeHash
Note that the tenant in the path is optional because it can be deduced from
the authorization details.
==== Accessing Targets of Relationships
/f;feed_id/r;resource_id/rl;incorporates/type=metric
This is equivalent to the current
`/feeds/feed_id/resources/resource_id/metrics`.
/f;feed_id/r;resource_id/rl;isParentOf/
(notice the trailing slash)
"give me all children of resource with id 'resource_id'".
==== Accessing Relationships
/f;feed_id/rl;contains
(notice the lack of trailing slash)
"find all the 'contains' relationships going out of the feed with id
'feed_id'."
To access a single relationship with known id:
/rl;relationship_id
==== More Complex Example
/f;feed_id/type=rt;name=My%20EAP;identical/rl;defines/type=resource/rl;isParentOf/type=resource?recursive=true
"get a feed with id 'feed_id' and find all resource types called "My EAP" that
it contains and all other resource types that are identical to it (them). Then
find all the resources that those resource types define and find all the
resources (recursively) that those resources are parents of."
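The traversal URLs in the examples above can be composed mechanically from the EBNF. A small sketch of such helpers (the function names are mine, not part of the proposal; only the segment syntax comes from the grammar):

```python
def step(seg_type: str, seg_id: str, *filters: str) -> str:
    """One PATH_STEP: step('f', 'feed_id') -> '/f;feed_id'."""
    return "/" + ";".join([seg_type, seg_id, *filters])

def rel(name: str, direction: str = "") -> str:
    """One REL_STEP: rel('contains', 'out') -> '/rl;contains;out'."""
    parts = ["rl", name] + ([direction] if direction else [])
    return "/" + ";".join(parts)

def filter_step(*fs: str) -> str:
    """A FILTER_STEP: filter_step('type=metric') -> '/type=metric'."""
    return "/" + ";".join(fs)

# "all metrics incorporated by a resource", as in the example above:
url = (step("f", "feed_id") + step("r", "resource_id")
       + rel("incorporates") + filter_step("type=metric"))
# -> /f;feed_id/r;resource_id/rl;incorporates/type=metric
```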
=== Query Parameters
==== Paging `per_page`, `page`, `sort`, `order`
Paging is very expensive, because it implies fully iterating through the
result set (to be able to sort or determine the total). We may think about
some kind of server-side "live result set" that would hold the results on the
server and be accessed using some kind of token (with a TTL). This is how
neo4j does it and would avoid having to fetch the full result set on each
paged request.
==== `recursive`
This causes the last hop (relationship + entity filter) to be recursively
searched and added to the results. Tinkerpop defines a more generic concept of
"loop" using a label as a marker of the start of the "recursive hop" but I
don't think we need to be that powerful in a REST interface. Advanced users
may want to use the `query` endpoint with the full power of Gremlin query.
== POST URLs
URI = "/", CP, "/", ( "resource" | "metric" | "resourceType" | ... ) ;
The idea here is that you can create the entities only at their canonical
positions. While the Java API allows for creation after a non-canonical
traversal, I think this would be unnecessarily complicated for the REST API.
The users would pass the familiar blueprint objects to these endpoints as they
do currently.
Examples:
/feed - send a feed blueprint and this will create a new feed
/resourceType - send a resource type blueprint and it will create a new global
resource type
/metricType
...
/f;feed/resourceType - creates a feed-local resource type
== PUT URLs
URI = "/", CP
just send an update object corresponding to the type of entity on the path
== DELETE URLs
URI = "/", CP
deletes the entity on the path
disassociation needs to be 2 steps - find the relationship in question and
then delete the relationship, e.g.:
GET /f;feed_id/r;resource/rl;defines
DELETE /rl;id-found-in-the-results-from-the-previous-query
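The two-step disassociation above, sketched with stdlib request objects (built but never sent; the base URL and ids are placeholders):

```python
import urllib.request

BASE = "http://localhost:8080/hawkular/inventory"  # assumed base URL

# step 1: find the relationship in question
find = urllib.request.Request(BASE + "/f;feed_id/r;resource/rl;defines")

# step 2: delete it by the id returned from step 1 (left symbolic here)
rel_id = "id-found-in-the-results-from-the-previous-query"
delete = urllib.request.Request(BASE + "/rl;" + rel_id, method="DELETE")
```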
== Advanced Querying
URI = "/", "query"
free form, read-only gremlin query for more complex queries (this needs
to wait for the port to Tinkerpop3)
H-metrics perf test results - Optimal message size
by Filip Brychta
Hello,
I tried to find the optimal message size for POST requests that results in the highest total throughput.
I tried 1, 100, 500 and 5000 metrics per request.
Test:
- h-metrics VM: 4 CPU cores, 8GB memory (gradually increasing heap size for WF in each test run - 1GB, 2GB, 4GB, 6GB, 8GB)
- cassandra node VM: 2 CPU cores, 4GB memory
Summary:
- small messages (1 metric/request) were the worst in all test runs -> total throughput ~ 4500 metrics/sec
- medium messages (100 metrics/request) performed the best in almost all test runs -> total throughput ~ 8600 -- 9000 metrics/sec
- bigger messages (500 and 5000 metrics/request) were significantly worse than medium messages with a low heap size (1GB)
- bigger messages were worse than medium even for bigger heap sizes, but the gap narrowed
- big messages (5000 metrics/request) overloaded h-metrics with only a 1GB heap -> Full GC all the time
- when adding an additional cassandra node and having sufficient heap size, the bigger messages performed better -> with sufficient heap size (4GB) and a cassandra cluster, the bigger the messages the better the performance (measured only up to 5000; I'm not sure it makes sense to try even bigger messages)
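The totals in the details below are simply requests/sec multiplied by metrics per request; a quick sanity check of a few of the 1GB-heap runs:

```python
def total_throughput(req_per_sec: float, metrics_per_request: int) -> float:
    """Total throughput in metrics/sec = requests/sec * metrics/request."""
    return req_per_sec * metrics_per_request

print(total_throughput(4400, 1))   # 4400
print(total_throughput(87, 100))   # 8700
print(total_throughput(13, 500))   # 6500
```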
Details:
4 CPU cores
1GB for WF
msg size=1
4400 req/sec -> total throughput = 4400
avg-cpu: %user %nice %system %iowait %steal %idle
68.47 0.00 12.39 0.00 0.67 18.46
msg size=100
87 req/sec -> total throughput = 8700
avg-cpu: %user %nice %system %iowait %steal %idle
24.94 0.00 11.15 0.05 0.28 63.58
msg size=500
13 req/sec -> total throughput = 6500
avg-cpu: %user %nice %system %iowait %steal %idle
41.80 0.00 7.19 0.08 0.15 50.78
msg size=5000
0.879 req/sec -> total throughput = 4395
- GC all the time
avg-cpu: %user %nice %system %iowait %steal %idle
66.35 0.00 4.08 0.00 0.08 29.50
===================================
2GB for WF
msg size=1
4574 req/sec -> total throughput = 4574
avg-cpu: %user %nice %system %iowait %steal %idle
67.84 0.00 12.67 0.05 0.87 18.57
msg size=100
87 req/sec -> total throughput = 8700
- GC each 5-6 minutes
avg-cpu: %user %nice %system %iowait %steal %idle
26.99 0.00 10.25 0.13 0.21 62.43
msg size=500
16 req/sec -> total throughput = 8000
- GC each 1 min
avg-cpu: %user %nice %system %iowait %steal %idle
28.48 0.00 8.22 0.15 0.59 62.55
msg size=5000
1.432 req/sec -> total throughput = 7160
- GC each 30s
avg-cpu: %user %nice %system %iowait %steal %idle
36.44 0.00 10.04 0.18 0.08 53.27
===================================
4GB for WF
msg size=1
4498 req/sec -> total throughput = 4498
avg-cpu: %user %nice %system %iowait %steal %idle
63.28 0.00 13.03 0.03 0.58 23.10
msg size=100
90.79 req/sec -> total throughput = 9079
- GC after > 6 minutes
avg-cpu: %user %nice %system %iowait %steal %idle
25.86 0.00 10.44 0.08 0.26 63.37
msg size=500
16.07 req/sec -> total throughput = 8035
- GC after > 6 minutes
avg-cpu: %user %nice %system %iowait %steal %idle
23.73 0.00 8.18 0.13 0.46 67.50
msg size=5000
1.545 req/sec -> total throughput = 7725
- GC after > 6 minutes
avg-cpu: %user %nice %system %iowait %steal %idle
31.35 0.00 10.20 0.05 0.16 58.24
===================================
6GB for WF
msg size=1
4556 req/sec -> total throughput = 4556
avg-cpu: %user %nice %system %iowait %steal %idle
67.59 0.00 12.24 0.15 0.50 19.52
msg size=100
86 req/sec -> total throughput = 8600
avg-cpu: %user %nice %system %iowait %steal %idle
27.40 0.00 10.36 0.13 0.28 61.83
msg size=500
18 req/sec -> total throughput = 9000
avg-cpu: %user %nice %system %iowait %steal %idle
29.30 0.00 8.49 0.18 0.26 61.77
msg size=5000
1.45 req/sec -> total throughput = 7250
avg-cpu: %user %nice %system %iowait %steal %idle
32.26 0.00 11.61 0.34 0.10 55.69
===================================
8GB for WF
msg size=1
4718 req/sec -> total throughput = 4718
avg-cpu: %user %nice %system %iowait %steal %idle
44.19 0.00 17.05 0.33 0.22 38.22
msg size=100
82.27 req/sec -> total throughput = 8227
avg-cpu: %user %nice %system %iowait %steal %idle
30.77 0.00 10.95 0.08 0.13 58.07
msg size=500
17.86 req/sec -> total throughput = 8930
avg-cpu: %user %nice %system %iowait %steal %idle
52.56 20.32 11.16 0.20 0.00 15.76
msg size=5000
failed
===================================
2GB for WF + 2 cassandra nodes
msg size=1
total throughput = 6741
msg size=100
total throughput = 12900
msg size=500
total throughput = 11250
msg size=5000
total throughput = 13100
===================================
4GB for WF + 2 cassandra nodes
msg size=1
total throughput = 6533
msg size=100
total throughput = 12280
msg size=500
total throughput = 12650
msg size=5000
total throughput = 15950
Filip