Dan Berindei commented on ISPN-6906:
------------------------------------
[~galder.zamarreno] Slight correction: we never needed a separate marshaller for the key
and for the value. We only need a separate marshaller when we change the classloader,
because of the class/instance caching in RiverMarshaller. E.g. we start unmarshalling a
{{CacheRpcCommand}} with the global marshaller, but then {{CacheRpcCommandExternalizer}}
needs to switch to the cache marshaller once we know the cache name. Still, we use the
same underlying byte array, so there's no copying going on.
Reduce dependency on JBoss Marshalling
--------------------------------------
Key: ISPN-6906
URL: https://issues.jboss.org/browse/ISPN-6906
Project: Infinispan
Issue Type: Sub-task
Reporter: Galder Zamarreño
Assignee: Galder Zamarreño
Fix For: 9.0.0.Final
Since its inception, Infinispan has used JBoss Marshalling for all of its marshalling
needs. With some tweaking (e.g. hooking in a custom ObjectTable instance), the JBoss
Marshalling based Infinispan externalizer layer is able to produce tiny binary payloads,
but it has some problems, partly due to JBoss Marshalling itself and partly due to our own
implementation details:
JBoss Marshalling's objective has always been to produce a binary format that conforms to
the Java serialization specification, but this is not a requirement for Infinispan. In
fact, to reduce the payload size, Infinispan hooks in at the ObjectTable level.
On top of the mismatch problems mentioned above, JBoss Marshalling’s programming model is
based around creating a marshaller, writing to it, and then finishing using it by
discarding its context (same applies to unmarshalling). The problem here is two-fold:
* Both marshaller and unmarshaller are quite heavy objects that keep context information,
such as references to instances appearing multiple times, so constantly creating them is
costly. To avoid wasting resources, we ended up adding thread locals that keep a number of
marshaller/unmarshaller instances per thread (see ISPN-1815). These thread locals can
potentially affect memory usage (see user dev post).
* The second problem is the need to support reentrant marshalling calls when storing data
in binary format. The need for reentrancy appears in situations like this: imagine you
have to marshall a PutKV command, so you start a marshaller and write some data. Then you
have to store the key and value, but these are kept in binary format, so a second
marshaller has to be created, the key/value information written to it, and that marshaller
finished before the resulting bytes are written into the command itself. So, there needs
to be a way to start a second marshaller without having finished the first one. This is
the reason why the changes added in ISPN-1815 resulted in the thread local keeping a
number of marshaller/unmarshaller instances rather than a single one.
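The reentrancy described above can be illustrated with a minimal sketch. It uses plain
java.io streams as stand-ins for marshallers, and the class name, method names and command
id are hypothetical; it is not how Infinispan or JBoss Marshalling implement this. The
point is only that the inner "marshaller" is created and finished while the outer one is
still open.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: DataOutputStream plays the role of a marshaller.
class ReentrantMarshallingSketch {

    // Inner marshalling call: turns a key or value into its own binary form.
    static byte[] marshallToBytes(String keyOrValue) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream inner = new DataOutputStream(bytes); // second "marshaller"
            inner.writeUTF(keyOrValue);
            inner.flush(); // finish the inner marshaller
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Outer marshalling call: writes the command, embedding pre-marshalled key/value bytes.
    static byte[] marshallPut(String key, String value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream outer = new DataOutputStream(bytes); // first "marshaller"
            outer.writeByte(1); // hypothetical command id for a put

            // Reentrant step: the outer marshaller has not finished yet.
            byte[] keyBytes = marshallToBytes(key);
            outer.writeInt(keyBytes.length);
            outer.write(keyBytes);

            byte[] valueBytes = marshallToBytes(value);
            outer.writeInt(valueBytes.length);
            outer.write(valueBytes);

            outer.flush(); // only now is the outer marshaller finished
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(marshallPut("k", "v").length + " bytes written");
    }
}
```

With a single cached marshaller per thread, the inner call would have to reuse an instance
that is still in the middle of writing the command, which is why more than one instance is
needed.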
Finally, for inter-node cluster communication and for storing data in the persistence
layer, Infinispan uses JBoss Marshalling both for the types it knows about, e.g. internal
data types, and for the types it does not know about, e.g. key and value types. This means
that even if the marshaller is configurable, it’s not easy to switch to a different
marshaller (see here for an example where we try to use a different marshaller). This
problem is not present in Hot Rod Java clients, since there JBoss Marshalling is used
purely to marshall keys and values, so it’s very easy to test out a different marshaller.
With all this in mind, the following change recommendations can be made:
* For those types that we know about, marshall them manually in the most compact way
possible. The JBoss Marshalling codebase already does a lot of this when encoding basic
types (e.g. Strings, numbers, etc.), so we should be able to reuse that code.
* Only rely on 3rd party marshalling libraries for types we don’t know about, e.g. key
and value types. If these key/value types happen to be primitives, or primitive
derivations (e.g. arrays), we should be able to optimise those too, so we only rely on
3rd party marshalling libraries for custom unknown types. The benefit here is that we
decouple Infinispan from using JBoss Marshalling all over the place, making it easier to
try different marshalling mechanisms.
* With JBoss Marshalling only used for unknown custom types, the JBoss Marshalling based
marshaller implementation can still use thread locals if it wants to, but we effectively
get rid of them everywhere else. We can also switch to, or try out, different 3rd party
marshallers which might be better suited.
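The recommendations above can be sketched as follows. This is only an illustration under
stated assumptions: the UserMarshaller interface, the type tags and the class names are
hypothetical and not part of Infinispan or JBoss Marshalling. Known types are written by
hand in a compact form (a variable-length int encoding is used here as one possible
choice), while anything else is handed to a pluggable marshaller, for which JDK
serialization serves as a stand-in.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical plug-in point for types the core does not know about.
interface UserMarshaller {
    byte[] toBytes(Object o);
}

// Stand-in 3rd party marshaller based on JDK serialization.
class JdkUserMarshaller implements UserMarshaller {
    @Override
    public byte[] toBytes(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}

class HybridMarshallerSketch {
    // Hypothetical one-byte type tags.
    static final int TYPE_INT = 0, TYPE_STRING = 1, TYPE_BYTES = 2, TYPE_CUSTOM = 3;

    private final UserMarshaller userMarshaller;

    HybridMarshallerSketch(UserMarshaller userMarshaller) {
        this.userMarshaller = userMarshaller;
    }

    // Writes an int using 7 bits per byte; small non-negative values take one byte.
    static void writeVarInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    private static void writeLengthPrefixed(ByteArrayOutputStream out, int tag, byte[] body) {
        out.write(tag);
        writeVarInt(out, body.length);
        out.write(body, 0, body.length);
    }

    byte[] marshall(Object o) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (o instanceof Integer) {
            out.write(TYPE_INT);
            writeVarInt(out, (Integer) o);
        } else if (o instanceof String) {
            writeLengthPrefixed(out, TYPE_STRING, ((String) o).getBytes(StandardCharsets.UTF_8));
        } else if (o instanceof byte[]) {
            writeLengthPrefixed(out, TYPE_BYTES, (byte[]) o);
        } else {
            // Unknown custom type: the only place a 3rd party marshaller is involved.
            writeLengthPrefixed(out, TYPE_CUSTOM, userMarshaller.toBytes(o));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        HybridMarshallerSketch m = new HybridMarshallerSketch(new JdkUserMarshaller());
        System.out.println("int: " + m.marshall(300).length + " bytes");
        System.out.println("custom: " + m.marshall(new java.util.Date()).length + " bytes");
    }
}
```

Swapping JdkUserMarshaller for another implementation of the hypothetical UserMarshaller
interface changes only how custom types are encoded; the hand-written paths for known
types are unaffected.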