[infinispan-dev] Compressing Marshaller Wrapper
philippe van dyck
pvdyck at gmail.com
Fri Feb 26 06:36:57 EST 2010
Thanks for the reentrant scenario Galder.
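For anyone reading along, here is how I understand the reentrant flag described in the quoted reply below, as a minimal sketch. It is only a model: ReentrantFlagSketch, Stream, start and finish are hypothetical names, not Infinispan's Marshaller interface or JBoss Marshalling classes. A non-reentrant call reuses the per-thread instance, and a reentrant call gets a fresh one because the per-thread instance is still open.

```java
// Hypothetical model of the reentrant flag; not Infinispan's actual API.
final class ReentrantFlagSketch {

    /** Stand-in for a marshalling stream that must not be opened twice. */
    static final class Stream {
        private boolean open;

        void open() {
            if (open) {
                throw new IllegalStateException("stream already open");
            }
            open = true;
        }

        void close() {
            open = false;
        }

        boolean isOpen() {
            return open;
        }
    }

    // One cached stream per thread, reused by every non-reentrant call.
    private final ThreadLocal<Stream> perThread = ThreadLocal.withInitial(Stream::new);

    /** Returns the per-thread stream, or a fresh one when the call is reentrant. */
    Stream start(boolean reentrant) {
        Stream stream = reentrant ? new Stream() : perThread.get();
        stream.open();
        return stream;
    }

    void finish(Stream stream) {
        stream.close();
    }
}
```

Calling start(false) twice without a finish in between fails in this model, which is the case the flag exists to avoid.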
https://jira.jboss.org/jira/browse/ISPN-357 is now closed.
If the Marshaller is used for something other than storing cache entries, I don't think it is a good idea to implement compression at this level.
Compression is CPU-intensive, so it may be a good idea to "prepare" entries in memory (with a low-priority thread), for example by adding a "compressed" flag to a cache entry.
That way they are ready for storage or transfer and consume less memory, but they cost much more to use (decompression time).
In fact, it is a very old tradeoff and IMO if compression is to be integrated into Infinispan, it belongs at a higher level, and that is another discussion.
From my point of view, S3 entries are now compressed and cost less to transfer and store, which was my initial goal.
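To illustrate compressing only what goes to the store, here is a minimal sketch using the JDK's gzip streams. StoreCompression is a hypothetical helper, not something in Infinispan; it assumes the entry has already been marshalled to a byte array, and where exactly it would hook into a cache store is left open.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip already-marshalled bytes just before they are
// written to the store, and restore them after they are read back.
final class StoreCompression {

    private StoreCompression() {
    }

    static byte[] compress(byte[] marshalled) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so close before reading the sink.
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(marshalled);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    static byte[] decompress(byte[] stored) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(stored))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

In-memory entries stay uncompressed with this approach, so the CPU cost is paid only on the store path.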
cheers,
phil
On 26 Feb 2010, at 11:16, Galder Zamarreno wrote:
> On Thu, 25 Feb 2010 12:02:34 +0100, philippe van dyck <pvdyck at gmail.com>
> wrote:
>
>> Hi All,
>>
>> Currently, I compress all data before sending it to the cache. Once
>> compressed, I save 95% of the size of the JSON-serialized qi4j objects.
>>
>> I did some profiling during the load tests and compression is taking
>> roughly 80% of the CPU time.
>> So I would like to compress only the data sent to the store, not in
>> memory.
>>
>> Looks like the Marshaller is my friend here, and I plan to write a
>> compressing wrapper around it.
>>
>> Now, when I look at it, I see two ways to wrap the compression process.
>>
>> One way is with the ObjectInput / ObjectOutput but I am bothered by the
>> reentrant flag.
>
> As a side note, the reentrant flag is used to signal to the marshaller
> whether several ObjectOutput/ObjectInput are open without a close, i.e.
> --
> marshaller.startObjectOutput(x, false)
> marshaller.startObjectOutput(x, true) -> is reentrant, so mark it as such
> --
> marshaller.startObjectOutput(x, false)
> marshaller.finishObjectOutput()
> marshaller.startObjectOutput(x, false) -> not reentrant
> marshaller.finishObjectOutput()
> --
>
> Why do we use this? To enable marshaller implementations to return a
> different ObjectOutput if the call is reentrant. If you look at
> org.infinispan.marshall.jboss.JBossMarshaller you see that the
> ObjectOutput (or org.jboss.marshalling.Marshaller) is a ThreadLocal, but
> JBossMarshaller does not allow for the same
> org.jboss.marshalling.Marshaller to be opened twice. So, by using the
> reentrant flag, we can make sure that the 2nd time that startObjectOutput
> is called, a different one is provided.
>
> For an example of reentrancy, see the javadoc:
>
> * <p>On the other hand, when a call is reentrant, i.e.
> startObjectOutput/startObjectOutput(reentrant)...finishObjectOutput/finishObjectOutput,
> * the Marshaller implementation might treat it differently. An example
> of reentrancy would be marshalling of {@link MarshalledValue}.
> * When sending or storing a MarshalledValue, a call to
> startObjectOutput() would occur so that the stream is open and
> * following, a 2nd call could occur so that MarshalledValue's raw byte
> array version is calculated and sent across.
> * This enables lazy deserialization on the receiver side which is
> a performance gain. The Marshaller implementation could decide
> * that it needs a separate ObjectOutput or similar for the 2nd call
> since its aim is only to get the raw byte array version
> * and then close it.</p>
>
> The second reentrant call is the one to create the MarshalledValue form of
> the in memory data. The first call would be the stream opened to send the
> put or get or whichever op you're sending around.
>
> As a side note, using ThreadLocal is a much cleaner solution than having to
> maintain a pool of org.jboss.marshalling.Marshaller instances.
>
> Hope this clarifies further what the reentrant stuff does.
>
> Cheers,
>
>> The other is the ByteBuffer stuff, no concurrency problem here, but it
>> looks like more work.
>>
>> WDYT ?
>>
>> Cheers,
>>
>> phil
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
> --
> Galder Zamarreño
> Sr. Software Engineer
> Infinispan, JBoss Cache