Greetings,

I'm looking for a distributed time-series database, preferably backed by Cassandra, to monitor about 30 instances in AWS (and I expect that number to grow quickly). Hawkular Metrics seems interesting due to its native clustering support and use of compression, since using Cassandra naively is quite inefficient: KairosDB seems to need about 12B/sample [1], which is *way* higher than systems with custom storage backends (Prometheus can do ~1B/sample [2]).

I would like to know if there are any existing benchmarks for Hawkular's ingestion and compression performance, and what kind of resources I would need to handle something like 100 samples/producer/second, ideally with retention periods of 7 and 30 days (the latter at reduced precision).
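
To put rough numbers on the question, here is a back-of-envelope sketch of the raw storage needed for the 7-day window only. It assumes 30 producers at a steady 100 samples/second each and uses the per-sample figures quoted above for KairosDB and Prometheus; it ignores replication, indexes, and the reduced-precision 30-day tier, and it says nothing about what Hawkular itself achieves. The function and constant names are made up for illustration.

```python
# Back-of-envelope storage estimate; all inputs are assumptions taken
# from the figures in this message, not measurements.
PRODUCERS = 30                      # current number of AWS instances
SAMPLES_PER_PRODUCER_PER_SEC = 100  # target ingestion rate per producer
RETENTION_DAYS = 7                  # full-precision retention window
SECONDS_PER_DAY = 86_400


def raw_storage_bytes(bytes_per_sample,
                      producers=PRODUCERS,
                      rate=SAMPLES_PER_PRODUCER_PER_SEC,
                      days=RETENTION_DAYS):
    """Bytes needed to keep every sample for `days`, before replication."""
    samples = producers * rate * SECONDS_PER_DAY * days
    return samples * bytes_per_sample


if __name__ == "__main__":
    for label, bytes_per_sample in (("~12 B/sample (KairosDB figure [1])", 12),
                                    ("~1 B/sample (Prometheus figure [2])", 1)):
        gigabytes = raw_storage_bytes(bytes_per_sample) / 1e9
        print(f"{label}: {gigabytes:.1f} GB for {RETENTION_DAYS} days")
```

Under these assumptions the cluster would see 3,000 samples/second in total, so the per-sample cost mostly decides whether the 7-day window is on the order of 2 GB or 20 GB before replication.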

My planned setup is collectd -> Riemann -> Hawkular (?), with Grafana for visualization.

Thanks in advance,
Daniel