[infinispan-dev] storing in memory data in binary format

Mircea Markus mircea.markus at jboss.com
Fri Oct 2 10:57:51 EDT 2009


On Oct 2, 2009, at 5:13 PM, Manik Surtani wrote:

> LazyMarshalling (use of marshalled values internally) does precisely
> this.  If MarshalledValues are used, then calculating the size is
> quite easy, except that it may temporarily spike to 2x the byte[]
> size when both forms (serialized and deserialized) are held. This is
> only a spike, though: one form is always cleared out at the end of
> every invocation.
MarshalledValues work differently, as they keep either the serialized
form or the Object.
The approach I am talking about is to always keep the serialized form
only, and to deserialize on each read. This is needed in order to
accurately determine the size of the cached objects.
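To make the distinction concrete, here is a minimal Java sketch of the
binary-only approach described above (the class name BinaryEntry and its
methods are hypothetical, not Infinispan API): only the byte[] is kept,
so the in-memory size is exact, and every read pays a deserialization.

```java
import java.io.*;

// Hypothetical sketch: an entry that keeps ONLY the serialized form
// of its value, so its memory size is always just valueBytes.length.
public class BinaryEntry {
    private final byte[] valueBytes;   // the only stored form

    public BinaryEntry(Serializable value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
        }
        this.valueBytes = bos.toByteArray();
    }

    // Deserialize on every get -- there is no cached Object form.
    public Object get() throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(valueBytes))) {
            return ois.readObject();
        }
    }

    // The accurately measurable size of this entry.
    public int sizeInBytes() {
        return valueBytes.length;
    }
}
```

Contrast with a MarshalledValue, which may hold the Object form instead,
at which point the byte count is no longer known without re-serializing.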
>
>
>
> On 2 Oct 2009, at 12:44, Mircea Markus wrote:
>
>> Hi,
>>
>> While working on the coherence config converter, I've seen that they
>> are able to specify how much memory a cache should use at a time (we
>> also had a thought about this in past but abandoned the idea mainly
>> due to performance). E.g. :
>> <backing-map-scheme>
>>  <local-scheme>
>>    <high-units>100m</high-units>
>>    <unit-calculator>BINARY</unit-calculator>
>>    <eviction-policy>LRU</eviction-policy>
>>  </local-scheme>
>> </backing-map-scheme>
>> When 100 MB is reached, data will start to be evicted. I know we
>> support marshalled values, but it's not exactly the same thing.
>> I've been wondering how they achieve this, and I've found this:
>> http://coherence.oracle.com/display/COH35UG/Storage+and+Backing+Map
>> The way it works (my understanding!) is by keeping the keys and
>> values in serialized form within the map. So, if you have both the
>> key and the value as a byte[], you can easily measure the memory
>> footprint of the cached data.
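The point above can be sketched in a few lines of Java (SerializedStore
and its method names are made up for illustration, not Coherence or
Infinispan API): because both key and value are byte[], the footprint is
simply the sum of the array lengths.

```java
import java.io.*;
import java.util.*;

// Hypothetical sketch: a store that keeps keys and values as byte[],
// so its memory footprint is exactly the sum of the array lengths.
public class SerializedStore {
    // list of {keyBytes, valueBytes} pairs keeps the sketch simple
    private final List<byte[][]> entries = new ArrayList<>();
    private long totalBytes;

    public static byte[] toBytes(Serializable o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    public void put(Serializable key, Serializable value) throws IOException {
        byte[] k = toBytes(key);
        byte[] v = toBytes(value);
        entries.add(new byte[][] { k, v });
        // exact accounting, since both forms are already byte[]
        totalBytes += k.length + v.length;
    }

    public long footprintInBytes() {
        return totalBytes;
    }
}
```

(Object headers and list overhead are ignored here; the point is that the
dominant cost, the payload, becomes exactly measurable.)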
>> Now what about keeping data in maps in serialized form?
>> Pros:
>> - we would be able to support memory-based eviction triggers.
>> - in DIST mode, when doing a put we won't need to deserialize the
>> data at the other end. This deserialization might be redundant: if
>> another node asks for this data, we'll have to serialize it back,
>> etc.
>> - sync puts would be faster, as the data only gets serialized (and
>> doesn't get deserialized at the other end).
>> - ???
>> Cons:
>> - data would be deserialized on each get request, adding latency.
>> This is partially compensated by faster puts (see pros) and can be
>> mitigated by using L1 caches (near caches).
>> - ???
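The first pro, a memory-based eviction trigger, could look roughly like
this sketch (an assumed design, not Infinispan code; the class name and
the highUnits parameter, echoing Coherence's <high-units>, are
hypothetical): values are stored as byte[], and an access-ordered
LinkedHashMap evicts LRU entries until the byte total drops under the
configured budget.

```java
import java.util.*;

// Hypothetical sketch of memory-based LRU eviction over binary values.
public class MemoryBoundedCache<K> {
    private final long highUnits;   // byte budget, like Coherence's <high-units>
    private long usedBytes;
    // accessOrder=true makes iteration order least-recently-used first
    private final LinkedHashMap<K, byte[]> map =
            new LinkedHashMap<>(16, 0.75f, true);

    public MemoryBoundedCache(long highUnits) {
        this.highUnits = highUnits;
    }

    public void put(K key, byte[] valueBytes) {
        byte[] old = map.put(key, valueBytes);
        if (old != null) usedBytes -= old.length;
        usedBytes += valueBytes.length;
        // evict least-recently-used entries until under budget
        Iterator<Map.Entry<K, byte[]>> it = map.entrySet().iterator();
        while (usedBytes > highUnits && it.hasNext()) {
            usedBytes -= it.next().getValue().length;
            it.remove();
        }
    }

    public byte[] get(K key) {
        return map.get(key);   // a real cache would deserialize here
    }

    public long usedBytes() { return usedBytes; }
}
```

The key enabler is exactly the point of this thread: because values are
byte[], usedBytes is an accurate number, not a heuristic.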
>>
>> Well, I'm not even sure that this fits with our current
>> architecture; I just brought it up for brainstorming :)
>>
>> Cheers,
>> Mircea
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> --
> Manik Surtani
> manik at jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
> http://www.infinispan.org
> http://www.jbosscache.org
>
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
