On Tue, Mar 14, 2017 at 1:08 PM, Sanne Grinovero <sanne@infinispan.org> wrote:
Just throwing out an idea I just had while thinking of Hibernate OGM
user needs for data migration.

For people using databases & related frameworks, it's common to have a
staging database which contains not just the staging "schema" but also
data. When legally possible, it's often preferable to have a snapshot
of production data.

For example last time I worked with a SQL database, each week I'd take
a production backup and restore it on both our staging environment and
on the developer's instances so that everyone could run tests on it -
without needing access to the real production.

Interestingly, while we don't have an easy tool to get a fully consistent
snapshot from a live Infinispan grid, "replicating" should be a
familiar concept here?

We do have a way to "copy" data from a grid, which is part of the Rolling Upgrade process [1], which is
supposed to migrate data from one cluster to another with no downtime for clients. Both clusters
are independent and can run incompatible Infinispan versions, like 8.x and 9.x.

Re-using some parts of the Rolling Upgrade from [1], it'd be possible to extract a tool that simply grabs data from a
running cluster and saves it to another without the clients accessing the source cluster being aware of it. Although
not technically a snapshot, it fits the use case of pre-production environment provisioning.
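
To make the "copy, not snapshot" distinction concrete, here is a minimal sketch in plain Java. It assumes nothing about the Rolling Upgrade code itself: both "clusters" are ordinary in-memory maps, and the class and method names (GridCopySketch, copy) are hypothetical, made up for illustration only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the extracted tool. Both "clusters" are plain
// in-memory maps here, not Infinispan caches.
final class GridCopySketch {

    // Copies every entry visible in the source into the target and returns
    // how many entries were written. The source is only read, so clients
    // using it are never blocked.
    static <K, V> int copy(Map<K, V> source, Map<K, V> target) {
        int copied = 0;
        for (Map.Entry<K, V> e : source.entrySet()) {
            target.put(e.getKey(), e.getValue());
            copied++;
        }
        return copied;
    }

    public static void main(String[] args) {
        Map<String, String> production = new ConcurrentHashMap<>();
        production.put("user:1", "alice");
        production.put("user:2", "bob");

        Map<String, String> staging = new ConcurrentHashMap<>();
        int copied = copy(production, staging);

        // A write arriving after the copy is not reflected in staging,
        // which is why this is a copy and not a consistent snapshot.
        production.put("user:3", "carol");

        System.out.println("copied=" + copied
                + " staging=" + staging.size()
                + " production=" + production.size());
    }
}
```

Iterating a ConcurrentHashMap is weakly consistent, so writes that race with the copy may or may not end up in the target. That is the same kind of inconsistency a live copy between real clusters would have to tolerate.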

[1] http://infinispan.org/docs/dev/user_guide/user_guide.html#rolling_upgrades_for_infinispan_servers
 

Infinispan could have a "mitosis" feature, like cell reproduction, in
which the user connects a pristine grid instance of N nodes and these
N nodes automatically become non-primary owners for the full set of
segments - this could happen via a custom hash which makes them all
backup replicas (no main owners) and then Infinispan would be able to
tell when state transfer is completed, and initiate some
coordination to sever the link without triggering re-hash on the
original "production" grid.
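
A toy model of the ownership idea, in plain Java, might look like the sketch below. It is not Infinispan's consistent hash API: the class MitosisSketch and its methods are hypothetical, and it assumes (as a simplification) that segments are spread round-robin over the child nodes, so that together they cover the full set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy ownership table; hypothetical, not an Infinispan class.
final class MitosisSketch {
    // segment -> owners; index 0 is the primary owner, the rest are backups.
    private final Map<Integer, List<String>> owners = new LinkedHashMap<>();

    MitosisSketch(int segments, List<String> productionNodes) {
        for (int s = 0; s < segments; s++) {
            List<String> list = new ArrayList<>();
            list.add(productionNodes.get(s % productionNodes.size()));
            owners.put(s, list);
        }
    }

    // Attach the child grid: every segment gains one child node as a
    // backup owner (round-robin, an assumption). Primaries are untouched.
    void attach(List<String> childNodes) {
        for (Map.Entry<Integer, List<String>> e : owners.entrySet()) {
            String child = childNodes.get(e.getKey() % childNodes.size());
            if (!e.getValue().contains(child)) {
                e.getValue().add(child);
            }
        }
    }

    // Sever the link: drop the child nodes from every owner list, leaving
    // the production ownership exactly as it was before attach().
    void detach(List<String> childNodes) {
        for (List<String> list : owners.values()) {
            list.removeAll(childNodes);
        }
    }

    String primaryOf(int segment) {
        return owners.get(segment).get(0);
    }

    List<String> ownersOf(int segment) {
        return new ArrayList<>(owners.get(segment));
    }

    public static void main(String[] args) {
        MitosisSketch grid = new MitosisSketch(4, Arrays.asList("p1", "p2"));
        grid.attach(Arrays.asList("c1", "c2"));
        System.out.println("owners of segment 0 while attached: " + grid.ownersOf(0));
        grid.detach(Arrays.asList("c1", "c2"));
        System.out.println("owners of segment 0 after detach: " + grid.ownersOf(0));
    }
}
```

The point of the model is only that the primary of each segment is the same before attach, while attached, and after detach, which is the "no re-hash on production" property described above. State transfer and the coordination step are left out entirely.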

Incidentally, while this happens the child datagrid would be kept up
to date with in-flight changes, so we could envision either a very
short lock on changes to guarantee a fully consistent snapshot, or not
have any lock at all but minimise the inconsistencies to those which
might happen during the link decoupling.

Maybe this would also satisfy the people who've been asking for the
snapshot feature? I don't think people want a snapshot for the sake of
it, but to replicate the grid.

I realise it's not one day of work and the idea is not fully fleshed
out, but I think this would be a very well received feature.


Thanks,
Sanne