[Hawkular-dev] C* schema design

John Sanda jsanda at redhat.com
Thu Jun 18 12:29:50 EDT 2015


I had been looking at the alerts schema briefly and wanted to share some thoughts on schema design in general. My understanding is that there can be many alerts for a single trigger. And based on my understanding and experience with RHQ, it is possible for there to be a large number of alerts. Here is the schema for the alerts table:

CREATE TABLE alerts (
    tenantid text,
    alertid text,
    payload text,
    PRIMARY KEY (tenantid, alertid)
)

Simple enough. Here is a short test driver[1] I wrote to insert 100 million alerts into a 3 node cluster. Note that only a single tenant is used. As the test ran, I periodically ran nodetool status to check the load on the cluster. You can see the output of that here[2]. You will see that only one of the nodes owns virtually all of the load: 127.0.0.1 has over 2 GB of data while the other two have less than 500 KB each.

All of the alerts for a single tenant are stored in a single partition. In other words, all alerts are stored in one physical row on disk. This does not scale. Regardless of the number of nodes, only one node owns all of the data for a given tenant. It will also likely have an increasingly negative impact on read performance as the partition grows.
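To make the hot spot concrete, here is a small illustrative sketch (the data is made up, not from the test driver). With PRIMARY KEY (tenantid, alertid), the partition key is tenantid alone, so every alert written for a tenant carries the same partition key and therefore hashes to the same token, i.e., the same node:

```python
# With PRIMARY KEY (tenantid, alertid), the partition key is just tenantid.
# Every alert for "tenant-1" has an identical partition key, so every row
# lands in the same partition on the same node, regardless of cluster size.
alerts = [("tenant-1", f"alert-{i}") for i in range(100)]

# Collect the distinct partition keys across all inserted rows.
partition_keys = {tenantid for tenantid, _alertid in alerts}

print(len(partition_keys))  # 1 -- one partition for all 100 alerts
```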

The first change I might consider is to partition by trigger, but that alone is probably not sufficient. What if a single trigger generates thousands of alerts per day? That will still lead to hot spots (i.e., the partition becoming too large). I would also consider partitioning by some time slice, something like this:

CREATE TABLE alerts (
    tenantid text,
    triggerid text,
    date timestamp,
    alertid text,
    payload text,
    PRIMARY KEY ((tenantid, triggerid, date), alertid)
)

Now data is partitioned by tenant, trigger, and date. The date column might be rounded down to the week, day, hour, etc. It really depends on the frequency of writes as well as the read patterns. With this design the data will be spread across nodes in a more scalable way, and when you have to fetch data across multiple time slices, you can execute the queries in parallel.
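A rough sketch of what that looks like on the client side, assuming a day-granularity bucket (fetch_bucket is a hypothetical stand-in for the actual driver call; the real query would be something like SELECT * FROM alerts WHERE tenantid = ? AND triggerid = ? AND date = ?):

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone

def day_bucket(ts: datetime) -> datetime:
    # Round a timestamp down to midnight UTC so that all alerts from the
    # same day for a trigger share one partition. The granularity (week,
    # day, hour) is a tuning choice driven by write volume.
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)

def fetch_bucket(tenantid: str, triggerid: str, bucket: datetime) -> list:
    # Hypothetical placeholder: a real implementation would run the
    # per-partition CQL query and return the alerts in that partition.
    return []

# Reading three days of alerts means querying three partitions; since each
# is an independent partition, the queries can run in parallel.
buckets = [day_bucket(datetime(2015, 6, d, 12, 0, tzinfo=timezone.utc))
           for d in (16, 17, 18)]

with ThreadPoolExecutor() as pool:
    results = list(pool.map(
        lambda b: fetch_bucket("tenant-1", "trigger-1", b), buckets))

print(day_bucket(datetime(2015, 6, 18, 12, 29, 50, tzinfo=timezone.utc)))
# 2015-06-18 00:00:00+00:00
```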

[1] https://gist.github.com/jsanda/ce07905fa9f16f13c661
[2] https://gist.github.com/jsanda/6d2ee6cd5bcab07331d2