[Hawkular-dev] [Inventory] Performance of Tinkerpop3 backends

Thu Aug 18 02:22:16 EDT 2016

So, the biggest issue with Titan is that the project seems dead (and the
codebase is complex).
Introducing a different database is also a source of issues (schema
changes in particular).

Do we have a third option that keeps Cassandra for the inventory but doesn't
rely on Titan, and that is achievable in a reasonable timeframe?

Thomas

On Thu, Jul 21, 2016 at 2:08 PM, Lukas Krejci <lkrejci at redhat.com> wrote:

> Hi all,
>
> to move inventory forward, we need to port it to Tinkerpop3 - a new(ish)
> and actively maintained version of the Tinkerpop graph API.
>
> Apart from the huge improvement in the API expressiveness and capabilities,
> the important thing is that it comes with a variety of backends, 2 of which
> are of particular interest to us ATM: the Titan backend (with Titan in
> version 1.0) and the SQL backend (using the sqlg library).
>
> The SQL backend is a much improved (yet still unfinished in terms of
> optimizations and some corner case features) version of the toy SQL backend
> for Tinkerpop2.
>
> Back in March I ran performance comparisons for SQL/postgres and Titan
> (0.5.4) on Tinkerpop2 and concluded that Titan was the best choice then.
>
> After completing a simplistic port of inventory to Tinkerpop3 (not taking
> advantage of any new features or opportunities to simplify the inventory
> codebase), I've run the performance tests again for the 2 new backends -
> Titan 1.0 and Sqlg (on postgres).
>
> This time the results are not as clear as last time.
> From the charts [1] you can see that Postgres is actually quite a bit
> faster on reads and can better handle concurrent read access, while Titan
> shines in writes (arguably thanks to Cassandra as its storage).
>
> Of course, I can imagine that the read performance advantage of Postgres
> would decrease with the growing amount of data stored (the tests ran with
> an inventory size of ~10k entities), but I am quite positive we'd get
> competitive read performance from both solutions up to the sizes of
> inventory we anticipate (100k-1M entities).
>
> Now the question is whether the insert performance in Postgres is
> something we should be worried about too much. IMHO, there should be some
> room for improvement in Sqlg, and our move to /sync for agent
> synchronization would also make this less of a problem (because there
> would not be that many initial imports creating vast amounts of
> entities).
>
> Nevertheless, I currently cannot say who the "winner" is here. Each
> backend has its pros and cons:
>
> Titan:
> Pros:
> - high write throughput
> - backed by Cassandra
>
> Cons:
> - slower reads
> - project virtually dead
> - complex codebase (self-made fixes unlikely)
>
> Sqlg:
> Pros:
> - small codebase
> - everybody knows SQL
> - faster reads
> - faster concurrent reads
>
> Cons:
> - slow writes
> - another backend needed (Postgres)
>
> Therefore my intention here is to go forward with a "proper" port to
> Tinkerpop3 with Titan still enabled, but to focus primarily on Sqlg to see
> if we can do anything about the write performance.
>
> IMHO, any choice we make is "workable" as it is even today, but we need to
> weigh in the productization requirements. For those, Sqlg with its small
> dependency footprint and Postgres backend seems preferable to the huge
> dependency mess of Titan.
>
> [1] https://dashboards.ly/ua-TtqrpCXcQ3fnjezP5phKhc
>
> --
> Lukas Krejci