[Hawkular-dev] Data store migration

Wed Jan 6 05:30:20 EST 2016

In this specific case, it's related to HAWKULAR-881 (Join org workflow 
without email): I'd need new "tables" and a new "column" in an existing 
table, possibly with a default value.

The details of this specific case aren't that important, as it's more of 
a general question. As far as I know, we don't have a uniform approach 
to this problem across the components.

- Juca.

On 05.01.2016 19:15, John Doyle wrote:
> Can we back up and describe what we're migrating from and to, and the
> event that triggered the migration? Maybe that's implicit in the
> question for everyone else, but not for me.
>
> thx
> ~jd
>
> On Mon, Jan 4, 2016 at 10:35 AM, John Sanda <jsanda at redhat.com> wrote:
>
>     On Jan 4, 2016, at 5:43 AM, Juraci Paixão Kröhling
>     <jpkroehling at redhat.com> wrote:
>     >
>     > Team,
>     >
>     > What's the recommended approach for handling data migrations? Is there a
>     > library similar to Liquibase?
>     >
>     > - Juca.
>     >
>
>     Liquibase is designed specifically for the RDBMS. When RHQ started
>     moving to Cassandra, I started working on a patch for Liquibase to
>     add support for Cassandra. After some discussion on the Liquibase
>     dev list, I eventually decided to abandon the effort because of the
>     number of changes involved and because it became clear that
>     Liquibase, being very RDBMS-centric, was not a good fit.
>     We decided to implement our own solution in RHQ to address our
>     immediate needs. It has been a while since I have looked to see what
>     other solutions might be out there. I have come across something for
>     Rails applications, and I think someone may have tried to add
>     Cassandra support in Flyway.
>
>     There are some things that need to be taken into consideration. I
>     will briefly discuss some of those now.
>
>     * Should the migrations be done at installation/deployment time or
>     at runtime?
>     This is probably the most important consideration because everything
>     else in large part stems from it. Some changes, like adding/removing
>     a column or adding/removing a table, are fast and efficient in
>     Cassandra. I therefore think it is acceptable to do these types of
>     changes at deployment time. Other changes, like adding data or
>     moving/transforming data, could be long-running operations.
>     Although it increases application code complexity, these changes
>     should generally be done at runtime.
>
>     * How should migrations be implemented?
>     With the RDBMS, we can easily manipulate, transform, and move data
>     with SQL. That is not the case with CQL. We have to resort to
>     writing code on top of the driver to make the changes. In some
>     situations a better approach might be to generate new SSTables and
>     stream those into Cassandra. For larger data migrations this is
>     likely to be faster, as you completely bypass the whole CQL layer.
>     Ultimately, I think both approaches need to be an option.
>
>     * Where should migration metadata be stored?
>     We need to keep track of migrations that have been applied. There
>     might be migrations that are specific to a particular environment,
>     e.g., dev vs. prod. Since we are trying to avoid additional data
>     stores, I think it makes sense to store migration metadata in
>     Cassandra. Maybe we could have a migrations keyspace that tracks the
>     migrations for each of the Hawkular keyspaces.
>
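To make the three considerations above a bit more concrete, here is a minimal
sketch in Python of how they could fit together. It is not what RHQ or Hawkular
actually does. Every name in it (Migration, MigrationLog, run_pending, the
"deploy"/"runtime" phase labels) is hypothetical, and the in-memory dict only
stands in for a tracking table that would live in a dedicated migrations
keyspace in Cassandra. The apply callable is where driver code or an SSTable
load would go.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Migration:
    """One migration step. All names here are hypothetical."""
    id: str                    # unique identifier; lexical order defines run order
    keyspace: str              # keyspace the step applies to
    phase: str                 # "deploy" (fast schema change) or "runtime" (long data move)
    apply: Callable[[], None]  # in practice: driver calls or an SSTable load


@dataclass
class MigrationLog:
    """In-memory stand-in for a tracking table in a migrations keyspace."""
    applied: Dict[str, Set[str]] = field(default_factory=dict)

    def is_applied(self, m: Migration) -> bool:
        return m.id in self.applied.get(m.keyspace, set())

    def record(self, m: Migration) -> None:
        self.applied.setdefault(m.keyspace, set()).add(m.id)


def run_pending(migrations: List[Migration], log: MigrationLog, phase: str) -> List[str]:
    """Apply, in id order, the not-yet-applied migrations of the given phase."""
    ran = []
    for m in sorted(migrations, key=lambda m: m.id):
        if m.phase != phase or log.is_applied(m):
            continue
        m.apply()
        log.record(m)  # recorded only after apply() returns without raising
        ran.append(m.id)
    return ran
```

The installer would call run_pending(..., phase="deploy") and the running
application would call it with phase="runtime". Calling it a second time is a
no-op, because the log already lists the applied ids per keyspace.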