Cache Store Marshalling
by Ryan Emerson
Hi All,
Currently the CacheWriterInterceptor utilises the internal marshaller for marshalling entries before they are sent to the configured cache stores. This causes several problems [1], most notably that changes to the internal marshaller make stored data incompatible across Infinispan versions.
I propose that we decouple the internal and store marshallers. To allow 9.x versions to remain compatible, we should default to the internal marshaller (until 10.x at least), but optionally allow users to specify a custom StreamingMarshaller implementation as part of their PersistenceConfiguration. As we already have the protostuff and kryo bridges, users would specify an enum value for the marshaller they want to use, plus an optional class name string if a custom implementation is required. For example:
enum StoreMarshaller {
    CUSTOM,
    INTERNAL,
    KRYO,
    PROTOSTUFF;
}

new ConfigurationBuilder()
    .persistence()
    .marshaller(StoreMarshaller.CUSTOM)
    .marshallerClass("org.example.Marshaller")
    ...
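
To make the proposal a little more concrete, here is a minimal sketch of how the enum plus the optional class string might be resolved to a marshaller instance. Everything in it is an assumption for illustration: the Marshaller interface is only a stand-in for StreamingMarshaller, and resolve() and ExampleMarshaller are hypothetical names, not existing Infinispan API.

```java
class StoreMarshallerSketch {

    // Hypothetical stand-in for Infinispan's StreamingMarshaller; only the shape matters here.
    interface Marshaller {
        String name();
    }

    enum StoreMarshaller { CUSTOM, INTERNAL, KRYO, PROTOSTUFF }

    // Hypothetical user-supplied implementation, used to show the reflective path.
    public static class ExampleMarshaller implements Marshaller {
        public ExampleMarshaller() { }
        @Override public String name() { return "example"; }
    }

    // Picks a built-in marshaller for the enum value, or loads the named class for CUSTOM.
    static Marshaller resolve(StoreMarshaller type, String marshallerClass) {
        switch (type) {
            case INTERNAL:   return () -> "internal";
            case KRYO:       return () -> "kryo";
            case PROTOSTUFF: return () -> "protostuff";
            case CUSTOM:
                if (marshallerClass == null || marshallerClass.isEmpty()) {
                    throw new IllegalArgumentException("CUSTOM requires a marshaller class name");
                }
                try {
                    // Requires a no-arg constructor on the user's class (an assumption of this sketch).
                    return Class.forName(marshallerClass)
                            .asSubclass(Marshaller.class)
                            .getDeclaredConstructor()
                            .newInstance();
                } catch (ReflectiveOperationException | ClassCastException e) {
                    throw new IllegalArgumentException("Cannot instantiate " + marshallerClass, e);
                }
            default:
                throw new AssertionError("Unhandled marshaller type: " + type);
        }
    }
}
```

In this sketch the class string is simply ignored for the non-CUSTOM values; rejecting it at configuration validation time would be an equally reasonable choice.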
Finally, Gustavo brought FlatBuffers [2] to my attention, which could be a good option to offer users as one of the default StoreMarshaller implementations.
WDYT?
Cheers
Ryan
[1] https://docs.google.com/document/d/1PR0eYgjhqXUR5w03npS7TdOs2KDZjfdMh0au8...
[2] https://google.github.io/flatbuffers/
7 years, 9 months
Infinispan "mitosis"
by Sanne Grinovero
Just throwing out an idea I just had while thinking of Hibernate OGM
user needs for data migration.
For people using databases & related frameworks, it's common to have a
staging database which contains not just the staging "schema" but also
data. When legally possible, it's often preferable to have a snapshot
of production data.
For example last time I worked with a SQL database, each week I'd take
a production backup and restore it on both our staging environment and
on the developer's instances so that everyone could run tests on it -
without needing access to the real production.
Interestingly, while we don't have an easy tool to get a fully consistent
snapshot from a live Infinispan grid, "replicating" should be a
familiar concept here?
Infinispan could have a "mitosis" feature, like cell reproduction, in
which the user connects a pristine grid instance of N nodes and these
N nodes automatically become non-primary owners for the full set of
segments - this could happen via a custom hash which makes them all
backup replicas (no main owners) and then Infinispan would be able to
tell when state transfer is completed, and initiate some
coordination to sever the link without triggering re-hash on the
original "production" grid.
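
As a rough illustration of the ownership idea, the sketch below models each segment's owner list (index 0 being the primary) and shows the child nodes being appended as backups and later removed. It is only a toy model under stated assumptions: the class and method names are hypothetical, it assumes every child node backs up every segment, and it says nothing about how state transfer or the coordination step would actually work.

```java
import java.util.ArrayList;
import java.util.List;

class MitosisOwnershipSketch {

    // owners.get(s) is the owner list of segment s; index 0 is the primary owner.
    // Returns a new mapping in which every child node is a backup owner of every segment.
    static List<List<String>> attachChildGrid(List<List<String>> owners, List<String> childNodes) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> segmentOwners : owners) {
            List<String> extended = new ArrayList<>(segmentOwners);
            extended.addAll(childNodes); // appended at the end, so primaries are unchanged
            result.add(extended);
        }
        return result;
    }

    // Severs the link: drops the child nodes, leaving the production owner lists as they were.
    static List<List<String>> detachChildGrid(List<List<String>> owners, List<String> childNodes) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> segmentOwners : owners) {
            List<String> remaining = new ArrayList<>(segmentOwners);
            remaining.removeAll(childNodes);
            result.add(remaining);
        }
        return result;
    }
}
```

Because the child nodes are only ever added after the existing owners, the production grid's primary and backup assignments are the same before attaching and after detaching, which is the property that would let the link be severed without a re-hash.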
Incidentally, while this happens the child datagrid would be kept up
to date with in-flight changes, so we could envision either a very
short lock on changes to guarantee a fully consistent snapshot, or not
have any lock at all but minimise the inconsistencies to those which
might happen during the link decoupling.
Maybe this would also satisfy the people who've been asking for the
snapshot feature? I don't think people want a snapshot for the sake of
it, but to replicate the grid.
I realise it's not one day of work and the idea is not fully fleshed
out, but I think this would be a very well received feature.
Thanks,
Sanne
7 years, 9 months
Calling getCache with a template and defined configuration
by William Burns
While working on another project that uses Infinispan, I came across code
that used templates in an interesting way, one I don't think our template
configuration handling was expecting.
Essentially the code defined a template for a distributed cache as well as
some named caches. Then whenever a cache is retrieved it would pass the
given name and always the distributed cache template. Unfortunately with
the way templates work they essentially redefine a cache first so the
actual cache configuration was wiped out. In this example I was able to
get the code to change to using a default cache instead, which is the
behavior that is needed.
The issue at hand, though, is whether we should allow a user to call getCache
in such a way. My initial thought is to have it throw some sort of
configuration exception when this is invoked. But there are some possible
options.
1. Throw a configuration exception not allowing a user to use a template
with an already defined cache. This has a slight disconnect between
configuration and runtime, since if a user adds a new definition it could
cause runtime issues.
2. Log an error/warning message when this occurs. Is this enough though?
Still could have runtime issues that are possibly undetected.
3. Merge the configurations together applying the template first. This
would be akin to how default cache works currently, but you would get to
define your default template configuration at runtime. This sounded like
the best option to me, but the problem is what happens if someone calls
getCache using the same cache name but a different template. This could get
hairy as well.
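
To compare options 1 and 3 side by side, here is a small sketch that models configurations as flat attribute maps. This is a simplification for illustration only: real Configuration objects are much richer, and the class and method names below (TemplateMergeSketch, resolveStrict, resolveMerged) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TemplateMergeSketch {

    // Named definitions (templates and caches alike), each modelled as attribute -> value.
    private final Map<String, Map<String, String>> definitions = new LinkedHashMap<>();

    void define(String name, Map<String, String> config) {
        definitions.put(name, new LinkedHashMap<>(config));
    }

    private Map<String, String> template(String templateName) {
        Map<String, String> template = definitions.get(templateName);
        if (template == null) {
            throw new IllegalArgumentException("Unknown template: " + templateName);
        }
        return template;
    }

    // Option 1: refuse to apply a template to a cache that already has a definition.
    Map<String, String> resolveStrict(String cacheName, String templateName) {
        if (definitions.containsKey(cacheName)) {
            throw new IllegalStateException("Cache '" + cacheName + "' is already defined");
        }
        return new LinkedHashMap<>(template(templateName));
    }

    // Option 3: apply the template first, then let the cache's own definition override it.
    Map<String, String> resolveMerged(String cacheName, String templateName) {
        Map<String, String> merged = new LinkedHashMap<>(template(templateName));
        Map<String, String> own = definitions.get(cacheName);
        if (own != null) {
            merged.putAll(own);
        }
        return merged;
    }
}
```

The merged variant keeps the named cache's settings instead of wiping them out, but it does nothing about the "same name, different template" case; that would need the first resolved template to be remembered and later calls checked against it.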
Really thinking about the future, disconnecting the cache definition and
retrieval would be the best option, but we can't do that this late in the
game.
What do you guys think?
- Will
7 years, 10 months