On Thu, 25 Feb 2010 12:02:34 +0100, philippe van dyck <pvdyck(a)gmail.com>
wrote:
Hi All,
Currently, I compress all data before sending it to the cache. Once
compressed, the JSONized qi4j objects are about 95% smaller.
I did some profiling during the load tests and compression is taking
roughly 80% of the cpu time.
So I would like to compress only the data sent to the store, not in
memory.
Looks like the Marshaller is my friend here, and I plan to write a
compressing wrapper around it.
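
A minimal sketch of what such a compressing wrapper could look like, assuming a simple byte[]-based contract. The ByteMarshaller interface and the StringMarshaller delegate are hypothetical stand-ins made up for illustration; they are not Infinispan's Marshaller API. Only the java.util.zip calls are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical minimal contract; NOT the real Infinispan Marshaller interface.
interface ByteMarshaller {
    byte[] toBytes(Object o) throws IOException;
    Object fromBytes(byte[] bytes) throws IOException;
}

// Hypothetical delegate standing in for the real marshaller: Strings as UTF-8.
final class StringMarshaller implements ByteMarshaller {
    public byte[] toBytes(Object o) {
        return ((String) o).getBytes(StandardCharsets.UTF_8);
    }
    public Object fromBytes(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

// Wrapper that gzips whatever the delegate produces, and gunzips before delegating back.
final class CompressingMarshaller implements ByteMarshaller {
    private final ByteMarshaller delegate;

    CompressingMarshaller(ByteMarshaller delegate) {
        this.delegate = delegate;
    }

    public byte[] toBytes(Object o) throws IOException {
        byte[] raw = delegate.toBytes(o);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so close before reading bos.
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        }
        return bos.toByteArray();
    }

    public Object fromBytes(byte[] bytes) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        }
        return delegate.fromBytes(bos.toByteArray());
    }
}
```

With this shape, only the code path that writes to the store would use the CompressingMarshaller, while in-memory entries keep using the plain delegate.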
Now, when I look at it, I see two ways to wrap the compression process.
One way is with the ObjectInput / ObjectOutput but I am bothered by the
reentrant flag.
As a side note, the reentrant flag tells the marshaller whether several
ObjectOutput/ObjectInput instances are open without a close in between, e.g.
--
marshaller.startObjectOutput(x, false)
marshaller.startObjectOutput(x, true) -> is reentrant, so mark it as such
--
marshaller.startObjectOutput(x, false)
marshaller.finishObjectOutput()
marshaller.startObjectOutput(x, false) -> not reentrant
marshaller.finishObjectOutput()
--
Why do we use this? To enable marshaller implementations to return a
different ObjectOutput if the call is reentrant. If you look at
org.infinispan.marshall.jboss.JBossMarshaller you see that the
ObjectOutput (or org.jboss.marshalling.Marshaller) is a ThreadLocal, but
JBossMarshaller does not allow for the same
org.jboss.marshalling.Marshaller to be opened twice. So, by using the
reentrant flag, we can make sure that the 2nd time that startObjectOutput
is called, a different one is provided.
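
To make that concrete, here is a toy sketch of the idea. It is not the JBossMarshaller code: ToyMarshaller and its Stream class are hypothetical names, and Stream only imitates a resource that refuses to be opened twice. The non-reentrant call reuses the per-thread instance, while the reentrant call gets a fresh one.

```java
final class ToyMarshaller {

    // Hypothetical stand-in for a stream/marshaller that cannot be opened twice.
    static final class Stream {
        private boolean open;

        void start() {
            if (open) {
                throw new IllegalStateException("already open");
            }
            open = true;
        }

        void finish() {
            open = false;
        }
    }

    // One reusable instance per thread, instead of a hand-maintained pool.
    private final ThreadLocal<Stream> perThread = ThreadLocal.withInitial(Stream::new);

    // Reentrant calls get a separate instance so the outer one is left untouched.
    Stream startObjectOutput(boolean isReentrant) {
        Stream s = isReentrant ? new Stream() : perThread.get();
        s.start();
        return s;
    }

    void finishObjectOutput(Stream s) {
        s.finish();
    }
}
```

In this sketch, a nested startObjectOutput(false) on the same thread fails with IllegalStateException, which is the situation the reentrant flag is there to avoid.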
For an example of reentrancy, see the javadoc:
* <p>On the other hand, when a call is reentrant, i.e.
startObjectOutput/startObjectOutput(reentrant)...finishObjectOutput/finishObjectOutput,
* the Marshaller implementation might treat it differently. An example
of reentrancy would be marshalling of {@link MarshalledValue}.
* When sending or storing a MarshalledValue, a call to
startObjectOutput() would occur so that the stream is open and
* following, a 2nd call could occur so that MarshalledValue's raw byte
array version is calculated and sent across.
* This enables lazy deserialization on the receiver side which is
a performance gain. The Marshaller implementation could decide
* that it needs a separate ObjectOutput or similar for the 2nd call
since its aim is only to get the raw byte array version
* and the close finish with it.</p>
The second reentrant call is the one to create the MarshalledValue form of
the in memory data. The first call would be the stream opened to send the
put or get or whichever op you're sending around.
As a side note, using ThreadLocal is a much cleaner solution than having to
maintain a pool of org.jboss.marshalling.Marshaller instances.
Hope this clarifies further what the reentrant stuff does.
Cheers,
The other is the ByteBuffer stuff, no concurrency problem here, but it
looks like more work.
WDYT ?
Cheers,
phil
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache