Hi [~grdryn],
bq. I don't fully understand this one. Does the 45%, in this case, mean "45% of the entire time period", or "45% of the portion of the time period that there was at least 1 pod running"?
The % (average) is calculated over the tracked period shown in the graph, as you can see in the image attached here. E.g., if the graph shows a tracked period from 13:00 to 14:00 and the service was up for 30 minutes of it, the "average uptime" will be 50%.
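As a rough illustration of that arithmetic (a minimal sketch, not the dashboard's actual query): assuming the tracked period is sampled at a fixed interval and each sample is 1 when the service is up and 0 when it is down, the average of the samples is the uptime percentage. The function name `average_uptime_percent` and the one-sample-per-minute interval are assumptions made only for this example.

```python
def average_uptime_percent(samples):
    """Return the mean of 0/1 'up' samples as a percentage.

    `samples` is a hypothetical list with one entry per scrape in the
    tracked period: 1 if the service was up, 0 if it was down.
    """
    if not samples:
        raise ValueError("no samples in the tracked period")
    return 100.0 * sum(samples) / len(samples)


# Hypothetical tracked period 13:00-14:00, one sample per minute:
# up for 30 minutes, down for 30 minutes.
samples = [1] * 30 + [0] * 30
print(average_uptime_percent(samples))  # 50.0
```

Under this reading, the percentage is always relative to the whole tracked period, not only to the part where a pod was running.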
bq. There isn't really a good way that we can get an "average uptime" from the pod itself (maybe we could federate from the cluster metrics eventually).
Regarding the image attached here, it shows the uptime of the operator pod. I did not get your point here, since there is a specific metric that reports whether the specific pod is running or not, and the average is calculated over that metric. So, it is accurate. See here: https://github.com/aerogear/mobile-developer-console-operator/blob/master/deploy/monitor/grafana_dashboard.yaml#L109.
For a better understanding, check the docs:
- `avg (calculate the average over dimensions)`. Ref.: https://prometheus.io/docs/prometheus/latest/querying/operators/
- `up{job="<job-name>", instance="<instance-id>"}: 1 if the instance is healthy, i.e. reachable, or 0 if the scrape failed.` Ref.: https://prometheus.io/docs/concepts/jobs_instances/
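To make the two quoted definitions concrete, here is a small sketch of what averaging `up` over dimensions amounts to at a single point in time. It only mimics the semantics described in the docs and is not Prometheus code; the helper name `avg_over_instances` and the instance names are hypothetical.

```python
def avg_over_instances(up_by_instance):
    """Average the 0/1 `up` value across all instances (the 'dimensions').

    `up_by_instance` is a hypothetical mapping of instance id to the
    latest `up` value: 1 if the scrape succeeded, 0 if it failed.
    """
    values = list(up_by_instance.values())
    if not values:
        raise ValueError("no instances to average over")
    return sum(values) / len(values)


# Hypothetical example: one instance reachable, one not.
up_by_instance = {"instance-a": 1, "instance-b": 0}
print(avg_over_instances(up_by_instance))  # 0.5
```

With a single instance, as for one operator pod, this average is simply that instance's own 0 or 1.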
bq. I just spoke to Wei about this, and what he suggests here is that we remove the "average" gauge altogether, and replace it with just an up/down indicator that shows the current status rather than an average.
IMHO we should keep it as it is because:
- It is very important for SREs to know for how long, on average, a service was down in the period.
- It uses common Prometheus operators/metrics and works as it should, and I cannot see any reason to believe that the graph is not accurate.
So, could you please check the above information and let me know if you really wish to remove the % metric from these dashboards? If yes, and if it is the final decision, that is fine. I would just like to be sure that everything is clarified before I start to do that.
c/c [~wei il] [~b1zzu]