[https://issues.jboss.org/browse/HAWKULARQE-81?page=com.atlassian.jira.plu...]
viet nguyen edited comment on HAWKULARQE-81 at 4/13/17 12:16 PM:
-----------------------------------------------------------------
*Test setup params:*
* 30 metrics per pod
* 30s collection interval
* Heapster disabled
* Default 2GB RAM for Hawkular Metrics and Cassandra (1 node)
* Collect metrics for at least 4 hours. Use [PyMe|https://github.com/vnugent/pyme] to gather Metric IDs and raw data points.
*Expected results:*
# 30 raw data points per Metric ID per 15-minute window (one point every 30s)
# 30 Metric IDs/pod * 30 points = 900 raw data points per pod per 15-minute window (rdp/pod)
# 900 rdp/pod * number of pods = grand total raw data points per 15-minute window
|| number of pods || grand total rdp/15min (direct query) || rdp/minute (calculated) ||
| 50 | 45,000 |3,000|
|100|90,000|6,000|
|150|135,000|9,000|
|230|207,000|13,800|
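The expected-results arithmetic above can be sketched as a short script; the constants (30 metrics/pod, 30s interval, 15-minute window) come from the test setup, and the helper name is just for illustration.

```python
# Reproduce the expected-results table from the test setup parameters.
METRICS_PER_POD = 30   # metrics collected per pod
INTERVAL_S = 30        # collection interval in seconds
WINDOW_MIN = 15        # reporting window in minutes

# 15 min / 30 s = 30 raw data points per Metric ID per window
points_per_metric = WINDOW_MIN * 60 // INTERVAL_S
# 30 Metric IDs/pod * 30 points = 900 rdp per pod per window
rdp_per_pod = METRICS_PER_POD * points_per_metric

def expected_rdp(pods):
    """Grand-total raw data points per 15-minute window, and per minute."""
    total = rdp_per_pod * pods
    return total, total // WINDOW_MIN

for pods in (50, 100, 150, 230):
    total, per_min = expected_rdp(pods)
    print(f"{pods} pods: {total} rdp/15min, {per_min} rdp/min")
```

Running it reproduces the table rows, e.g. 230 pods -> 207,000 rdp/15min and 13,800 rdp/min.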
*Actual results:*
* 50 pods [^50pods.png]
* 100 pods [^100pods.png]
* 150 pods [^150-2.png]
* 230 pods - Unrecoverable OOM exception in Cassandra when PyMe ran. [BZ1435436|https://bugzilla.redhat.com/show_bug.cgi?id=1435436]
*Observations*:
* Up to 150 pods - all raw metrics from HOSA were correctly captured and stored in Cassandra
* Starting at 100 pods - the metric definition query would fail with a timeout exception (HTTP 409). For the query to succeed, all pods must be stopped, i.e. no new metrics are generated from HOSA.
* At 230 pods the metric definition query triggered an OOM in Cassandra
* The number of raw metrics per minute is substantially lower than other internal benchmarks.
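For reference, a minimal sketch of the kind of metric-definition query that timed out, assuming the standard Hawkular Metrics REST endpoint (`GET /hawkular/metrics/metrics`) and its `Hawkular-Tenant` header; the host and tenant values are placeholders, not from this report.

```python
from urllib.parse import urlencode

def metric_definitions_request(host, tenant, metric_type="gauge"):
    """Build the URL and headers for listing metric definitions
    (hypothetical helper; host/tenant are placeholder values)."""
    query = urlencode({"type": metric_type})
    url = f"https://{host}/hawkular/metrics/metrics?{query}"
    headers = {"Hawkular-Tenant": tenant}
    return url, headers

url, headers = metric_definitions_request("metrics.example.com", "my-project")
print(url)      # full listing URL
print(headers)  # tenant header required by Hawkular Metrics
```

At 100+ pods this listing scans every stored definition, which is consistent with the timeouts and, at 230 pods, the Cassandra OOM observed above.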
Baseline #3
-----------
Key: HAWKULARQE-81
URL: https://issues.jboss.org/browse/HAWKULARQE-81
Project: Hawkular QE
Issue Type: Sub-task
Reporter: viet nguyen
Assignee: viet nguyen
Attachments: 100pods.png, 150-2.png, 50pods.png, March28_0200_raw.zip,
promgen-analytic-12hr.png
* Run PyMe on master to eliminate VPN slowness
* Fix query start-end window
* Update PyMe endpoint to increase metrics to 30 (currently 2)
* Insert metrics tallies into a separate Hawkular Metrics instance and use Grafana as a visual tool
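One possible shape for the start-end window fix above: Hawkular Metrics raw-data queries take epoch-millisecond `start`/`end` parameters, so the window can be computed relative to the current time. This is a sketch under that assumption; the helper name is hypothetical.

```python
import time

def query_window(duration_min=15, now_ms=None):
    """Return (start, end) in epoch milliseconds covering the last
    duration_min minutes; now_ms is injectable for testing."""
    end = int(time.time() * 1000) if now_ms is None else now_ms
    start = end - duration_min * 60 * 1000
    return start, end

start, end = query_window(15)  # e.g. pass as ?start=...&end=... on a raw-data query
```

Pinning the window this way avoids over- or under-counting raw data points when tallying rdp per 15-minute interval.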
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)