The way we handle upgrades from older versions (version X) to newer ones
that have breaking schema changes (version Y) is as follows:
1) export all data from version X - the result will be a (potentially
large) JSON file - "version X" will be included in the meta-data for
this export file
2) install version Y (new instances, new+empty DB instance)
3) import the file from step #1 into version Y
4) profit!
The important point here is that step #3 will automatically detect that
it is importing an older file and will "upgrade" the data to the latest
version, if necessary.
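The detect-and-upgrade step can be sketched roughly like this (a minimal
Python sketch - the payload shape, version strings, and migration
functions here are all hypothetical, not apiman's actual export format):

```python
# Hypothetical migration registry: maps a schema version to a function that
# upgrades an export payload to the NEXT version. The import step reads the
# version from the export's metadata and walks the chain until it reaches
# the running version. Version strings and fields are illustrative only.
MIGRATIONS = {
    "1.2.0": lambda data: {**data, "plans": data.get("plans", [])},
}

VERSION_ORDER = ["1.2.0", "1.3.0"]

def upgrade_export(export: dict, target_version: str) -> dict:
    """Upgrade an export payload one version at a time until it matches
    the target (the version of the newly installed instance)."""
    meta = export["metadata"]
    version = meta["apimanVersion"]
    data = export["data"]
    while version != target_version:
        data = MIGRATIONS[version](data)
        version = VERSION_ORDER[VERSION_ORDER.index(version) + 1]
    return {"metadata": {**meta, "apimanVersion": version}, "data": data}
```

The point is just that the importer owns the upgrade logic, so the export
file from version X never needs to know anything about version Y.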
This approach allows us to support the following:
- Upgrade from an arbitrary older version of apiman to an arbitrary newer
version
- Upgrade apiman from one storage mechanism to another (e.g. switch from
mysql to postgresql, or even switch from mysql to elasticsearch)
- Automatic, periodic backup of all relevant data for later restoration
It's worth noting that metrics information is currently not included in
this process - we'll need a solution to that in the future. The current
assumption is that the metrics data is stored in elasticsearch in a
format that will not change soon...
Of course, this isn't a good story for a cluster of nodes that require
rolling updates. Clearly to support that we would need to prevent any
breaking DB schema changes so...
...to answer your other question - I don't anticipate any *breaking*
changes to the schema in the foreseeable future. But as you say, I
can't guarantee it. :) I would expect any schema changes to be
additive (new tables or columns).
That said, we will not likely be providing DDL patch scripts in
community - so any in-place DB upgrades will be more manual than you
might like (e.g. diff the full DDLs to create your own patch script).
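To make the "diff the full DDLs" idea concrete, here's a minimal sketch
(plain text diffing with Python's difflib - the output is a starting
point for a hand-written patch script, not runnable SQL):

```python
import difflib

def ddl_diff(old_ddl: str, new_ddl: str) -> str:
    """Produce a unified diff of two full DDL dumps. Additive changes
    (new tables or columns) show up as '+' lines you can translate into
    ALTER/CREATE statements by hand."""
    return "".join(difflib.unified_diff(
        old_ddl.splitlines(keepends=True),
        new_ddl.splitlines(keepends=True),
        fromfile="old.ddl", tofile="new.ddl",
    ))
```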
-Eric
On 6/7/2016 1:44 PM, Christopher Stolte wrote:
Hi Eric,
Thanks for your response. In the event of a schema change, would your
team be providing anything in the way of migration/validation scripts or
tools? I assume it would be mentioned in the release notes, right? I
know you can't predict the future, but could you characterize the
frequency of such changes at this point? (i.e. known schema changes
coming soon and often / probably something in the next 6 months /
unlikely to rare...)
Thanks,
Chris
------------------------------------------------------------------------
*From:* Eric Wittmann <eric.wittmann(a)redhat.com>
*Sent:* Monday, June 6, 2016 3:02:48 PM
*To:* Christopher Stolte; apiman-user(a)lists.jboss.org
*Subject:* Re: [Apiman-user] APIMAN, clustering, and versioning
Hi Chris. Thanks for the question.
Some responses below:
On 6/3/2016 1:02 PM, Christopher Stolte wrote:
> Our team is hoping to use Apiman in a production environment that
> includes a load-balanced cluster of Apiman gateway instances. Those
> instances would share Elasticsearch storage for metrics etc., as per
> this architecture summary:
Note that if you simply have multiple instances of the gateway running,
all pointed to the same ES storage, you do not need those gateway
instances to be "clustered". It's a subtle point, but it might be
important. Note that this point may change in the future as we create
better implementations of certain components (like rate limiting).
> As we imagine this scenario, some questions pop up that we don't have
> answers to. For example, what assumptions can we make about data
> integrity when Apiman releases a new version? Are major releases the
> only ones that might break a schema (and thus not coexist happily in a
> cluster with other instances of a different version) ? I apologize if
> this is documented somewhere - I took a look around but didn't find
> anything related to versioning semantics. Basically we'd like to
> understand the upgrade path for a given instance within a cluster.
This is a great question, and something that isn't yet very mature
within apiman. The short answer is that we don't make any guarantees
(currently) in community with regard to when the schema may change.
Rolling upgrades are not yet something we've explicitly planned for. The
current (easy) approach is to stand up a new environment for the new
version of apiman, then migrate your data from old to new. Once done,
you can do a cutover. If you have a large number of instances this may
not be possible.
> Apiman team: what have you tested or consciously targeted as far as
> clustered environments? Has anyone out there in the community tried to
> use a cluster of gateways? Has there been any work done related to this
> ticket?
We have just begun our internal clustering testing, so we'll be getting
this sort of experience very soon. We'll be testing for HA and
scalability first and foremost. That said, the software is designed to
easily run in a "cluster" (quotes used because as previously mentioned,
the nodes do not actually need to be running in a proper cluster).
-Eric