[infinispan-dev] Compressing Marshaller Wrapper
Galder Zamarreno
galder at redhat.com
Fri Feb 26 05:16:33 EST 2010
On Thu, 25 Feb 2010 12:02:34 +0100, philippe van dyck <pvdyck at gmail.com>
wrote:
> Hi All,
>
> Currently, I compress all data before sending it to the cache. Once
> compressed, I gain 95% of the JSonized qi4j objects.
>
> I did some profiling during the load tests and compression is taking
> roughly 80% of the cpu time.
> So I would like to compress only the data sent to the store, not in
> memory.
>
> Looks like the Marshaller is my friend here, and I plan to write a
> compressing wrapper around it.
>
> Now, when I look at it, I see two ways to wrap the compression process.
>
> One way is with the ObjectInput / ObjectOutput but I am bothered by the
> reentrant flag.
As a side note, the reentrant flag tells the marshaller whether several
ObjectOutput/ObjectInput instances are open at once without a close, i.e.
--
marshaller.startObjectOutput(x, false)
marshaller.startObjectOutput(x, true) -> is reentrant, so mark it as such
--
marshaller.startObjectOutput(x, false)
marshaller.finishObjectOutput()
marshaller.startObjectOutput(x, false) -> not reentrant
marshaller.finishObjectOutput()
--
Why do we use this? To let marshaller implementations return a
different ObjectOutput when the call is reentrant. If you look at
org.infinispan.marshall.jboss.JBossMarshaller, you'll see that the
ObjectOutput (or org.jboss.marshalling.Marshaller) is held in a
ThreadLocal, but JBossMarshaller does not allow the same
org.jboss.marshalling.Marshaller to be opened twice. So, by using the
reentrant flag, we can make sure that the second time startObjectOutput
is called, a different one is provided.
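To make that concrete, here is a minimal sketch of the idea using only java.io. It is not the JBossMarshaller code: ReentrantSketch and SingleUseMarshaller are hypothetical names, and SingleUseMarshaller merely stands in for a marshaller that refuses to be opened twice. Only the startObjectOutput/finishObjectOutput names and the reentrant flag come from the discussion above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.OutputStream;

// Hypothetical sketch, not Infinispan's implementation.
final class ReentrantSketch {

    // Stand-in for a marshaller that cannot be opened twice.
    static final class SingleUseMarshaller {
        private ObjectOutputStream current;

        ObjectOutput start(OutputStream os) throws IOException {
            if (current != null) {
                throw new IllegalStateException("marshaller already open");
            }
            current = new ObjectOutputStream(os);
            return current;
        }

        boolean owns(ObjectOutput oo) {
            return current == oo;
        }

        void release() {
            current = null;
        }
    }

    // One marshaller per thread instead of a hand-maintained pool.
    private static final ThreadLocal<SingleUseMarshaller> PER_THREAD =
            ThreadLocal.withInitial(SingleUseMarshaller::new);

    static ObjectOutput startObjectOutput(OutputStream os, boolean isReentrant)
            throws IOException {
        if (isReentrant) {
            // The thread-local instance is busy, so hand out a separate stream.
            return new ObjectOutputStream(os);
        }
        return PER_THREAD.get().start(os);
    }

    static void finishObjectOutput(ObjectOutput oo) throws IOException {
        oo.flush();
        SingleUseMarshaller m = PER_THREAD.get();
        if (m.owns(oo)) {
            // Only the non-reentrant stream frees the thread-local marshaller.
            m.release();
        }
    }

    public static void main(String[] args) throws IOException {
        ObjectOutput outer = startObjectOutput(new ByteArrayOutputStream(), false);
        ObjectOutput inner = startObjectOutput(new ByteArrayOutputStream(), true);
        System.out.println("distinct outputs: " + (outer != inner));
        finishObjectOutput(inner);
        finishObjectOutput(outer);
    }
}
```

In this sketch, a second non-reentrant start on the same thread, made before the first is finished, fails with IllegalStateException. That is the situation the flag is there to avoid.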
For an example of reentrancy, see the javadoc:
* <p>On the other hand, when a call is reentrant, i.e.
* startObjectOutput/startObjectOutput(reentrant)...finishObjectOutput/finishObjectOutput,
* the Marshaller implementation might treat it differently. An example
* of reentrancy would be marshalling of {@link MarshalledValue}.
* When sending or storing a MarshalledValue, a call to
* startObjectOutput() would occur so that the stream is open and,
* following that, a 2nd call could occur so that MarshalledValue's raw
* byte array version is calculated and sent across.
* This enables lazy deserialization on the receiver side, which is a
* performance gain. The Marshaller implementation could decide
* that it needs a separate ObjectOutput or similar for the 2nd call
* since its aim is only to get the raw byte array version
* and then finish with it.</p>
The second, reentrant call is the one that creates the MarshalledValue
form of the in-memory data. The first call is the stream opened to send
the put, get, or whichever operation you're sending around.
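The nesting described above can be sketched with plain java.io streams. This is an illustration under assumptions, not how MarshalledValue is implemented: LazyValueSketch, writePut, toRawBytes and the "PUT" wire layout are all made up for the example. The outer stream carries the operation, the nested stream computes the raw byte array of the value while the outer one is still open, and the receiver can hold on to the raw bytes until it needs the object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch of the outer call plus nested (reentrant) call.
final class LazyValueSketch {

    // Nested call: serialize only the value into its raw byte array form.
    static byte[] toRawBytes(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream inner = new ObjectOutputStream(bytes)) {
            inner.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Outer call: the stream opened to send the operation around.
    static byte[] writePut(String key, Serializable value) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try (ObjectOutputStream outer = new ObjectOutputStream(wire)) {
            outer.writeUTF("PUT");
            outer.writeUTF(key);
            // Second stream opened while the outer one is still open.
            byte[] raw = toRawBytes(value);
            outer.writeInt(raw.length);
            outer.write(raw);
        }
        return wire.toByteArray();
    }

    // Receiver side: read the operation but keep the value as raw bytes.
    static byte[] readRawValue(byte[] wire) throws IOException {
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(wire))) {
            in.readUTF(); // operation name
            in.readUTF(); // key
            byte[] raw = new byte[in.readInt()];
            in.readFully(raw);
            return raw;
        }
    }

    // Deserialize lazily, only when the object is actually needed.
    static Object fromRawBytes(byte[] raw)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] wire = writePut("k1", "some value");
        byte[] raw = readRawValue(wire);
        System.out.println(fromRawBytes(raw));
    }
}
```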
As a side note, using a ThreadLocal is a much cleaner solution than having
to maintain a pool of org.jboss.marshalling.Marshaller instances.
Hope this clarifies further what the reentrant stuff does.
Cheers,
> The other is the ByteBuffer stuff, no concurrency problem here, but it
> looks like more work.
>
> WDYT ?
>
> Cheers,
>
> phil
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache