Lukas,
That's excellent news. Multiple backends are not something we really
want to deal with. Also, it might be nice to see a short presentation
on the "best practices" for Tx handling. But then again, that Tx stuff
is handled at the Gremlin level? So, perhaps not relevant to direct C*
consumers like Alerts.
On 2/23/2016 12:43 PM, Lukas Krejci wrote:
Hi all,
lately I've become really dissatisfied with how Inventory performed and
semi-publicly blamed Titan for that (because that was what looked like
the cause of all the world's problems in my then-uneducated eyes ;) ).
I decided to do some performance comparisons. Because we didn't want
Hawkular to ship with 2 different NoSQL backends (C* for metrics and
whatever else for Inventory), I chose an RDBMS as a good conservative
alternative (because people, IMHO, are still more comfortable dealing
with an RDBMS than with NoSQL databases).
Currently, inventory is written against the graph DSL called Gremlin
(from Tinkerpop 2.6.0). Fortunately, there exists a "toy" SQL backend
for Tinkerpop 2 that we could try and see if it performed at all well
(which would frankly be surprising, given the fact it stores the graph
data rather naively). With some luck, no code would have to be changed
on our side to see the results.
We had no such luck.
Making the inventory run with the SQL backend was literally a day's worth
of work (if that) and the first preliminary tests showed that Inventory
with Postgres backend performed much, much better than Titan with
embedded Cassandra. But the tests also uncovered some problems with the
way Inventory code handled transactions.
Fast forward 3 weeks and see large parts of Hawkular inventory updated
to correctly handle transactions. Now a single call to Inventory really
results in at most 1 transaction in the backend.
So, I went and re-ran the tests. Also, I refrained from using embedded
Cassandra and instead used a locally running 2-node cluster.
The results caught me by surprise. Not so much that the naive SQL
backend didn't perform particularly well, but the difference between the
performance of Titan before and after the transaction handling fixes.
To not keep you waiting any longer for the results: Titan + C* is the
winner.
For nice charts that include comparison to the old misbehaving impl, see:
https://dashboards.ly/ua-tALzrY9rEoRBXvsLXbZJHT
Cheers,