Thank you very much, John, that is indeed very helpful. It sounds like compression is exactly what I'm looking for for long-term storage.
I did some tests with KairosDB in the meantime, and it seems it can sustain ~20K data points/s with a 3-node Cassandra cluster of t2.large AWS instances (no provisioned IOPS, just 20GB of standard EBS storage).
I'll run some similar tests with Hawkular and report my findings.
Can you share the program that you used for the simulated testing, so that I can try a similar pattern with KairosDB?
Regards,
Daniel
The test lives in my repo, in a branch named generate-data[1]. The test itself is GenerateDataITest.java[2].
* Generating raw/uncompressed data
Checkout the generate-data branch from my repo.
`mvn install -DskipTests -Dlicense.skip -Dcheckstyle.skip` (you only need to build a handful of modules, but building everything is easier since it reduces the number of steps)
`cd core/metrics-core-service`
`mvn verify -Dit.test=GenerateDataITest`
This will generate 7 days of raw data for 5,000 metrics, with a data point every 10 seconds. It may take some time to finish. If the test encounters any errors, such as a write timeout, it will abort.
When the test finishes, run `nodetool drain`.
Measure the size of the <CASSANDRA_DATA_DIR>/hawkulartest/data-* directory.
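To get a sense of the write volume the steps above produce, the test parameters (7 days, one point per 10 seconds, 5,000 metrics) imply roughly 300 million data points. A quick back-of-the-envelope check:

```shell
# Data-point count implied by the test parameters:
# 7 days of data, one point every 10 seconds, 5,000 metrics.
points_per_metric=$(( 7 * 24 * 3600 / 10 ))     # 60,480 points per metric
total_points=$(( points_per_metric * 5000 ))    # 302,400,000 points total
echo "$total_points"
```

At the ~20K points/s Daniel measured for KairosDB, ingesting that volume would take a little over four hours, which is why the test can take a while to finish.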
* Generating compressed data
This will reuse the raw data generated from the previous steps.
Restart Cassandra (it has to be restarted since you did a drain)
`mvn verify -Dit.test=GenerateDataITest -Dcompress`
When the test finishes, run `nodetool drain`.
Measure the size of the <CASSANDRA_DATA_DIR>/hawkulartest/data_compressed* directory.
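Once you have both directory sizes, a small helper makes the comparison easy. The `ratio` function and the `du -sb` invocations below are my own sketch (GNU `du` assumed; `$CASSANDRA_DATA_DIR` stands in for your actual data directory), not part of the test itself:

```shell
# Print the raw-to-compressed size ratio given two byte counts.
ratio() {
  awk -v raw="$1" -v comp="$2" 'BEGIN { printf "%.2f\n", raw / comp }'
}

# Feed it the summed sizes of the two Cassandra table directories, e.g.:
# raw_bytes=$(du -sb "$CASSANDRA_DATA_DIR"/hawkulartest/data-* | awk '{s+=$1} END {print s}')
# comp_bytes=$(du -sb "$CASSANDRA_DATA_DIR"/hawkulartest/data_compressed* | awk '{s+=$1} END {print s}')
# ratio "$raw_bytes" "$comp_bytes"

# Placeholder sizes, not measured values:
ratio 1000000 125000
```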