Greetings,
I'm looking for a distributed time-series database, preferably backed by
Cassandra, to help monitor about 30 instances in AWS (with rapid growth
expected in the future). Hawkular Metrics seems interesting due to its
native clustering support and use of compression, since naively using
Cassandra is quite inefficient: KairosDB seems to need about 12 B/sample
[1], which is *way* higher than systems with custom storage backends
(Prometheus can do ~1 B/sample [2]).
I would like to know if there are any existing benchmarks for how
Hawkular's ingestion and compression perform, and what kind of resources I
would need to handle something like 100 samples per second per producer,
ideally with 7 days of retention at full precision and 30 days at reduced
precision.
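For scale, here is a rough back-of-the-envelope sketch of what those numbers
imply for raw sample storage over the 7-day window. It assumes exactly 30
producers and uses the per-sample sizes quoted above for KairosDB and
Prometheus; it ignores replication, indexes and the reduced-precision 30-day
tier, and says nothing about Hawkular itself, whose per-sample cost is what I
am asking about. The function name is just for illustration.

```python
SECONDS_PER_DAY = 86_400


def raw_storage_bytes(producers, samples_per_sec, days, bytes_per_sample):
    """Raw sample storage only: no replication, indexes or metadata."""
    total_samples = producers * samples_per_sec * SECONDS_PER_DAY * days
    return total_samples * bytes_per_sample


# Per-sample sizes are the figures quoted from [1] and [2], not measurements.
for label, bytes_per_sample in [
    ("KairosDB, ~12 B/sample [1]", 12),
    ("Prometheus, ~1 B/sample [2]", 1),
]:
    gb = raw_storage_bytes(30, 100, 7, bytes_per_sample) / 1e9
    print(f"{label}: {gb:.1f} GB for 7 days")
```

That works out to roughly 3,000 samples/second overall, i.e. about 21.8 GB
versus 1.8 GB for the 7-day window before replication.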
My planned setup is Collectd -> Riemann -> Hawkular (?) with Grafana for
visualization.
Thanks in advance,
Daniel