Many of you have probably seen warnings like this in the Hawkular server log:

WARN 15:55:59 Batch of prepared statements for [, hawkular_metrics.metrics_idx] is of size 5665, exceeding specified threshold of 5120 by 545.

Cassandra logs this warning when a batch statement is larger than a threshold defined in cassandra.yaml (the batch_size_warn_threshold_in_kb setting), which defaults to 5 KB. Note that the threshold is based on the actual size of the payload, not the number of statements in the batch. We should stop seeing this warning in the 0.7.0 release of Metrics. See HWKMETRICS-252[1] for details.
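To make the numbers in the log line concrete, here is a minimal sketch of the arithmetic behind the warning. The class and method names are made up for illustration and are not Cassandra's code. The only inputs are the 5 KB default and the 5665-byte payload from the log message above.

```java
public class BatchSizeCheck {
    // Default warning threshold in KB, as stated above (5 KB).
    public static final int DEFAULT_WARN_THRESHOLD_KB = 5;

    /** Returns how many bytes the payload exceeds the threshold by, or 0 if it fits. */
    public static long excessBytes(long payloadBytes, int thresholdKb) {
        long thresholdBytes = thresholdKb * 1024L; // 5 KB -> 5120 bytes
        return Math.max(0L, payloadBytes - thresholdBytes);
    }

    public static void main(String[] args) {
        long payload = 5665; // batch size reported in the log line
        System.out.println("threshold bytes = " + DEFAULT_WARN_THRESHOLD_KB * 1024);
        System.out.println("exceeds by = " + excessBytes(payload, DEFAULT_WARN_THRESHOLD_KB));
    }
}
```

Running it prints a threshold of 5120 bytes and an excess of 545 bytes, which matches the warning. The check depends only on the payload size, so a batch with a handful of large statements can trigger it while a batch with many small ones does not.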

The general advice in the Cassandra community is to favor parallel async writes over batch inserts when you are trying to optimize write performance. Unlogged batches across multiple partitions are almost always a bad idea. The one exception is an unlogged batch in which all of the mutations are for the same partition. In that case, Cassandra performs the writes atomically. This is how we use batch inserts in Metrics. Interestingly, I have seen threads on the Cassandra mailing list that discourage the use of batch inserts even in this case. This thread[1] provides some really interesting insights and analysis on unlogged batch inserts vs. async inserts. It references a document with some performance analysis that is worth a look.
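As a rough illustration of the two ideas above (keep each batch within one partition, and issue the writes asynchronously in parallel), here is a small sketch that uses only the JDK. It is not the Metrics implementation. DataPoint, groupByPartition, and writeAll are hypothetical names, and the writer function is a stand-in for whatever asynchronous execute call the Cassandra driver provides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class PartitionedWrites {
    /** Hypothetical data point; partitionKey stands in for the metric id. */
    public static final class DataPoint {
        public final String partitionKey;
        public final long timestamp;
        public final double value;

        public DataPoint(String partitionKey, long timestamp, double value) {
            this.partitionKey = partitionKey;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Groups points so that each group holds mutations for a single partition only. */
    public static Map<String, List<DataPoint>> groupByPartition(List<DataPoint> points) {
        Map<String, List<DataPoint>> groups = new LinkedHashMap<>();
        for (DataPoint p : points) {
            groups.computeIfAbsent(p.partitionKey, k -> new ArrayList<>()).add(p);
        }
        return groups;
    }

    /** Hands each single-partition group to the writer and completes when all writes finish. */
    public static CompletableFuture<Void> writeAll(
            List<DataPoint> points,
            Function<List<DataPoint>, CompletableFuture<Void>> writer) {
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (List<DataPoint> group : groupByPartition(points).values()) {
            futures.add(writer.apply(group)); // no waiting between submissions
        }
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));
    }

    public static void main(String[] args) {
        List<DataPoint> points = Arrays.asList(
                new DataPoint("cpu", 1L, 0.5),
                new DataPoint("mem", 1L, 0.7),
                new DataPoint("cpu", 2L, 0.6));
        // Stand-in writer: a real one would send the group to Cassandra asynchronously.
        writeAll(points, group -> CompletableFuture.runAsync(() ->
                System.out.println(group.get(0).partitionKey + ": " + group.size() + " mutation(s)")
        )).join();
    }
}
```

In this sketch the "cpu" points end up in one group and the "mem" point in another, so no group ever spans partitions. Whether each group is then sent as a single-partition unlogged batch or as individual async inserts is exactly the trade-off discussed in the thread.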