RfC: Layered Hawkular-services vs packaged one
by Heiko W. Rupp
Hey,
right now, when we deploy Hawkular-services (H-S) on OpenShift, the
user may already have deployed Hawkular-Metrics (HAM) in OpenShift[1].
Deploying H-S then results in HAM being deployed twice - once for the
platform and once for H-S.
One solution could be to 'just' deploy H-S in OpenShift instead of HAM,
but this has some drawbacks:
- larger deployment
- inclusion of parts that are not needed in 'classic' OpenShift
- different security model for OS-HAM than for H-Services
Another option could be to logically split and layer H-S:
- Have an H-S container (H-S-2) that does not contain HAM
- This container would provide everything of H-S except HAM
- Calls to the HAM endpoints of H-S-2 are forwarded to OS-HAM
Of course there is no such thing as a free lunch:
- need to reserve the 'hawkular' tenant in OS-Metrics
- OS-Metrics has a different security concept
H-S-2 could act as a proxy that receives HAM calls from agents and
clients, 'translates' the credentials, and then forwards the calls to
OS-HAM.
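The credential-translation step of that proxy could look roughly like
the following sketch. The class name, the base URL handling, the use of
an OpenShift bearer token, and the header names are assumptions for
illustration, not a description of existing code:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch: take a metrics call that arrived at H-S-2 with
// Hawkular credentials and rebuild it for OS-HAM, swapping in an
// OpenShift bearer token and pinning the reserved 'hawkular' tenant.
public class MetricsForwarder {

    private final String osMetricsBase; // assumed OS-HAM endpoint base URL
    private final String osBearerToken; // token replacing the original credentials

    public MetricsForwarder(String osMetricsBase, String osBearerToken) {
        this.osMetricsBase = osMetricsBase;
        this.osBearerToken = osBearerToken;
    }

    // Only builds the outgoing request; actually sending it (e.g. via
    // java.net.http.HttpClient) is left out so the translation step
    // stays visible and testable on its own.
    public HttpRequest translate(String path, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(osMetricsBase + path))
                .header("Authorization", "Bearer " + osBearerToken)
                .header("Hawkular-Tenant", "hawkular") // the reserved tenant
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

Keeping request construction separate from sending would also make it
easy to unit-test the credential translation without a live OS-HAM.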
Does the above idea make any sense? I am sure I am missing a ton of
items in the above list.
Heiko
[1] (e.g. with oc cluster up --metrics=true)
Some food for thought about improving the release of (large) features
by Heiko W. Rupp
Hey,
some of us just had a meeting to recap parts of the switch from
Inventory.v2 to .v3, where things went less smoothly (on the Java side)
than I expected.
We identified a few areas where we could improve:
- Timeouts. Some tests were failing on local machines but not on Travis
(and we have seen the reverse direction in the past as well). We need to
be better at not assuming timing, as we cannot know the timing
characteristics of the target environments.
Similarly, the test against the live server was waiting 500 * a few
seconds for inventory(.old) to come up. Some waiting is good, but if
e.g. inventory does not come up after some reasonable time, we should
probably abort the test, as this may indicate a real issue.
- Test reliability (the above is part of this). We need more unit and
integration tests, and we need to make them more reliable. During the
merge we saw test failures on developer machines while Travis was
green; it turned out that this was due to timing. In the (RHQ) past we
saw test failures caused by test ordering. We should perhaps run our
(integration) tests in random order on purpose, as in reality the user
will not exercise the code in the order we assume in tests either
(yes, that may make setup and tear-down more complex).
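One detail that matters for random ordering is reproducibility: a
failing order is only useful if it can be replayed. A minimal sketch of
the idea (class and method names are hypothetical), using a seeded
shuffle whose seed would be logged with the test run:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch: shuffle test (method) names with an explicit seed so that a
// failing order can be replayed deterministically.
public class RandomOrderRunner {

    public static List<String> order(List<String> testNames, long seed) {
        List<String> shuffled = new ArrayList<>(testNames);
        Collections.shuffle(shuffled, new Random(seed));
        return shuffled; // log the seed alongside, e.g. "order seed=42"
    }
}
```

The same effect can usually be had from the test framework itself; the
point is that the seed must be surfaced in the build output, otherwise
an ordering-related failure cannot be reproduced.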
- Making tests more end-to-end. Right now we have no idea (from the Java
side) about the consequences of e.g. renaming a resource in the agent
for the display of that resource in ManageIQ. Luckily we already have
the ruby-gem tests that run against the live server. Perhaps we can
extend this somehow into the MiQ test suite, so that it also tests
against the latest hawkular-services master. Or we could record some
interactions of MiQ with H-services via the gem and replay those
interactions against the live server (there will be a need for
placeholders, but that is something that cassettes already support).
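The placeholder mechanism mentioned above boils down to substituting
tokens in a recorded request before replaying it. A minimal
illustration of that idea (the token format and class name are made up;
the real cassette support in the Ruby VCR ecosystem is richer):

```java
import java.util.Map;

// Sketch: fill cassette-style placeholders such as <TENANT> in a
// recorded interaction with live values at replay time.
public class CassetteReplay {

    public static String fill(String recorded, Map<String, String> placeholders) {
        String out = recorded;
        for (Map.Entry<String, String> e : placeholders.entrySet()) {
            out = out.replace("<" + e.getKey() + ">", e.getValue());
        }
        return out;
    }
}
```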
- Way of working for such sweeping changes: we discussed that in this
case it could be good to do the work in a series of feature branches
that use source dependencies, so that all feature branches applied
together give the desired new state. Only when all of that is good do we
send pull requests, merge the full stream of work into master, and get
releases of the components out.