Application metrics from Jaeger clients
by Gary Brown
Hi Metrics Experts!
Re: https://github.com/uber/jaeger-client-java/issues/172#issuecomment-299723621
This issue is concerned with supporting Prometheus endpoints within Jaeger-instrumented client applications, to capture Jaeger-related metrics (e.g. number of spans reported/sampled/dropped) as well as application metrics, i.e. number of requests, errors and latency (duration) for different services/operations (endpoints).
As we will be interested in capturing and analysing these metrics within Hawkular Metrics, would be good if someone with relevant experience could get involved in the discussion to ensure the metrics are reported in the most appropriate way.
For example - is it a good idea to have generic metric names (jaeger-rpc.requests - which I assume is a counter, and jaeger-rpc.latency), or a metric name per endpoint - e.g. I was thinking service.operation.direction?
Based on the referenced comment, I'm not sure how the tags would relate to the metric names - I thought the tags needed to be constant for a particular metric name, but it might be my misunderstanding of what they are proposing.
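To make the question concrete, here is a minimal sketch of the two naming schemes, using a plain in-memory registry rather than any real client library. It assumes the Prometheus-style model in which a metric's label (tag) names are fixed while the label values vary per observation. The `Counter` class and the names `jaeger_rpc_requests`, `orders`, `getOrder` and `listOrders` are hypothetical and are not taken from the referenced comment.

```python
from collections import defaultdict


class Counter:
    """Toy counter: one metric name, a fixed set of label names."""

    def __init__(self, name, label_names=()):
        self.name = name
        self.label_names = tuple(label_names)
        self.values = defaultdict(int)  # tuple of label values -> count

    def inc(self, **labels):
        # Label *names* must match the declaration; label *values* may vary.
        if set(labels) != set(self.label_names):
            raise ValueError(f"{self.name} expects labels {self.label_names}")
        key = tuple(labels[n] for n in self.label_names)
        self.values[key] += 1


# Option A: one generic metric, endpoints distinguished by label values.
requests = Counter("jaeger_rpc_requests", ["service", "operation", "direction"])
requests.inc(service="orders", operation="getOrder", direction="inbound")
requests.inc(service="orders", operation="getOrder", direction="inbound")
requests.inc(service="orders", operation="listOrders", direction="inbound")

# Option B: one metric per endpoint, the name encodes service.operation.direction.
per_endpoint = {}
for service, operation, direction in [("orders", "getOrder", "inbound")] * 2:
    name = f"{service}.{operation}.{direction}"
    per_endpoint.setdefault(name, Counter(name)).inc()

total_a = sum(requests.values.values())  # aggregate across all endpoints
get_order_a = requests.values[("orders", "getOrder", "inbound")]
get_order_b = per_endpoint["orders.getOrder.inbound"].values[()]
```

In option A a single metric can be aggregated over all endpoints or filtered by label value. In option B every new endpoint introduces a new metric name, so aggregation has to happen over names instead.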
Would be good to discuss - possibly here first and then when better understood make a proposal on the github issue.
Regards
Gary
7 years, 10 months
RfC: Layered Hawkular-services vs packaged one
by Heiko W.Rupp
Hey,
right now when we deploy Hawkular-services (H-S) on OpenShift, we run
into a situation where the user may already have deployed
Hawkular-Metrics (HAM) in OpenShift[1] and thus by deploying H-S will
end up in a situation where HAM is deployed twice - once for the
platform and once for H-S.
One solution could be to 'just' deploy H-S in OpenShift instead of HAM,
but this has some drawbacks
- larger deployment
- inclusion of parts that are not needed in 'classic' OpenShift
- different security model for OS-HAM than for H-Services
Another option could be to logically split and layer H-S:
- Have a H-S container (H-S-2) that does not contain HAM
- This container would provide everything of H-S without HAM
- Calls to H-S-2 HAM are forwarded to OS-HAM
Of course there is no such thing as a free lunch:
- need to reserve the 'hawkular' tenant in OS-Metrics
- OS-Metrics has a different security concept
H-S-2 could act as a proxy that receives calls to HAM from agents
and clients, but 'translates' credentials and then forwards the calls to
OS-HAM
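As a rough illustration of the proxy idea, the sketch below rewrites the headers of an incoming request before it would be forwarded. Everything in it is an assumption made for illustration: the `translate_headers` function, the `KNOWN_USERS` and `SERVICE_TOKEN` values, and the choice of HTTP Basic credentials on the H-S-2 side and a bearer token plus a `Hawkular-Tenant` header on the OS-HAM side.

```python
import base64

# Hypothetical: accounts that H-S-2 itself accepts from agents and clients.
KNOWN_USERS = {"agent": "secret"}
# Hypothetical: token H-S-2 would present when talking to OS-HAM.
SERVICE_TOKEN = "example-token"
# Tenant that would have to be reserved in OS-Metrics.
RESERVED_TENANT = "hawkular"


def translate_headers(headers):
    """Check incoming Basic credentials and build the headers for OS-HAM."""
    scheme, _, encoded = headers.get("Authorization", "").partition(" ")
    if scheme != "Basic":
        raise PermissionError("expected Basic credentials")
    user, _, password = base64.b64decode(encoded).decode().partition(":")
    if KNOWN_USERS.get(user) != password:
        raise PermissionError("unknown user or wrong password")
    # Drop the caller's credentials and substitute the platform's.
    forwarded = {k: v for k, v in headers.items() if k != "Authorization"}
    forwarded["Authorization"] = f"Bearer {SERVICE_TOKEN}"
    forwarded["Hawkular-Tenant"] = RESERVED_TENANT
    return forwarded
```

The sketch only covers the credential translation; forwarding the request body and mapping responses back are left out.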
Does the above idea make any sense?
I am sure I am missing a ton of items in the above list
Heiko
[1] (e.g. with oc cluster up --metrics=true)
7 years, 11 months
Some food for thought about improving the release of (large) features
by Heiko W.Rupp
Hey,
some of us just had a meeting to recap parts of the switch from
Inventory.v2 to .v3, where things went less smoothly (on the Java side) than
I expected.
We identified a few areas where we could improve:
- Timeouts. Some tests were failing on local machines but not on travis
(and we had seen that in the other direction in the past as well). We need to
be better at not assuming timing, as we can't know timing in the target
environments either.
Similarly the test against live server was waiting 500*a few seconds
until inventory(.old) came up. Some waiting is good, but the question is
whether, if e.g. inventory does not come up after some reasonable time, we
should abort the test, as this may show real issues.
- Test reliability (the above is part of this). We need to try to have
more unit and also integration tests and make them more reliable. During
the merge we saw test failures on developer machines while Travis was
good. It turned out that this was due to timing. In the (RHQ) past we
saw test failures because of test ordering. We should perhaps try to
run our (integration) tests in random order on purpose, as in reality,
the user will not run the code in the order we assume in tests either
(yes, that may make setup and tear-down more complex).
- Making tests more end-to-end. Right now we have no idea (from the java
side) about the consequences of e.g. renaming a resource in the agent to
the display of this resource in ManageIQ. Luckily we already have the
ruby-gem tests that run against the live server. Perhaps we can extend
this somehow into MiQ test suite, so that this also tests against latest
hawkular-services master. Or record some interactions of MiQ with
H-services via the gem and have those interactions be re-played against
the live server (there will be a need for placeholders, but that is
something that cassettes already support)
- Way of working for such all-over changes: We discussed that in this
case it could be good to do that in a series of feature branches which
can use src-deps so that the feature branches all applied give the
desired new state. And only if all that is good, send pull-requests and
apply them to merge the full stream of work into master and get releases
of the components out.
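Two of the points above, bounded waiting and deliberately random test order, can be sketched briefly. This is an illustration only, not the project's test harness: `wait_until`, `shuffled` and their parameters are hypothetical helpers.

```python
import random
import time


def wait_until(condition, timeout_s, interval_s=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it is true; abort once `timeout_s` has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if condition():
            return
        if clock() >= deadline:
            # Fail loudly instead of waiting on: a server that never
            # comes up may point at a real problem.
            raise TimeoutError(f"condition not met within {timeout_s}s")
        sleep(interval_s)


def shuffled(tests, seed=None):
    """Return the tests in random order plus the seed needed to replay it."""
    if seed is None:
        seed = random.randrange(2**32)
    order = list(tests)
    random.Random(seed).shuffle(order)
    return order, seed
```

Reporting the seed together with a failure allows the same order to be replayed on a developer machine. Passing `clock` and `sleep` as parameters lets the helper itself be tested without real waiting.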
7 years, 11 months