[infinispan-dev] storing in memory data in binary format

Mircea Markus mircea.markus at jboss.com
Fri Oct 2 07:44:41 EDT 2009


Hi,

While working on the Coherence config converter, I've seen that they
are able to specify how much memory a cache should use at a time (we
also thought about this in the past but abandoned the idea, mainly
due to performance). E.g.:
<backing-map-scheme>
   <local-scheme>
     <high-units>100m</high-units>
     <unit-calculator>BINARY</unit-calculator>
     <eviction-policy>LRU</eviction-policy>
   </local-scheme>
</backing-map-scheme>
When 100 MB is reached, data will start to be evicted. I know we
support marshalled values, but it's not exactly the same thing.
I've been wondering how they achieve this, and I've found this:
http://coherence.oracle.com/display/COH35UG/Storage+and+Backing+Map
The way it works (my understanding!) is by keeping the keys and values
in serialized form within the map. So, if you have both the key and
value as a byte[], you can easily measure the memory footprint of the
cached data.
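
To make the idea concrete, here is a minimal sketch of measuring an
entry once key and value are both byte[]. BinarySizer is a hypothetical
helper (not Infinispan or Coherence code), and plain Java serialization
is only an assumption standing in for whatever marshaller the cache
would really use. It also ignores per-entry map overhead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only.
final class BinarySizer {

    // Turn an object into its serialized form.
    // Assumption: plain Java serialization instead of a real marshaller.
    static byte[] toBytes(Serializable o) {
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
            oos.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Payload size of one entry: just the two array lengths.
    static long entrySize(byte[] key, byte[] value) {
        return (long) key.length + value.length;
    }
}
```

The point is that the size becomes a cheap length lookup instead of a
walk over an arbitrary object graph.
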
Now what about keeping data in maps in serialized form?
Pros:
- we would be able to support memory based eviction triggers.
- in DIST mode, when doing a put we won't need to deserialize the data
at the other end. This deserialization might be redundant, because if
another node asks for this data, we'll have to serialize it again.
- the sync puts would be faster, as the data is only serialized (and
doesn't get deserialized at the other end).
- ???
Cons:
- data would be deserialized for each get request, adding latency.
Partially compensated by faster puts (see pros) and can be mitigated
by using L1 caches (near caches)
- ???
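
A rough sketch of what a memory-based eviction trigger over serialized
entries might look like. BinaryLruStore and its methods are hypothetical
names, not an existing Infinispan API; it assumes non-null byte[] keys
and values, counts only payload bytes, and uses a synchronized
access-ordered LinkedHashMap purely for brevity.

```java
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical store: evicts least-recently-used entries once the
// total size of serialized keys + values exceeds maxBytes.
final class BinaryLruStore {
    private final long maxBytes;
    private long usedBytes;
    // accessOrder = true: iteration goes from least to most recently used.
    // Keys are wrapped in ByteBuffer because byte[] has identity equals/hashCode.
    private final LinkedHashMap<ByteBuffer, byte[]> map =
            new LinkedHashMap<>(16, 0.75f, true);

    BinaryLruStore(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    synchronized void put(byte[] key, byte[] value) {
        byte[] old = map.put(ByteBuffer.wrap(key.clone()), value);
        if (old != null) {
            usedBytes -= key.length + old.length; // replace: drop old entry's size
        }
        usedBytes += key.length + value.length;
        evictIfNeeded();
    }

    // Returns the serialized value; the caller pays the deserialization cost.
    synchronized byte[] get(byte[] key) {
        return map.get(ByteBuffer.wrap(key));
    }

    synchronized long usedBytes() {
        return usedBytes;
    }

    synchronized int size() {
        return map.size();
    }

    private void evictIfNeeded() {
        Iterator<Map.Entry<ByteBuffer, byte[]>> it = map.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            Map.Entry<ByteBuffer, byte[]> e = it.next();
            usedBytes -= e.getKey().remaining() + e.getValue().length;
            it.remove();
        }
    }
}
```

Note that get() hands back the byte[] as-is, so the deserialization
cost listed under the cons lands on every caller.
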

Well, I'm not even sure that this fits with our current architecture;
I just brought it up for brainstorming :)

Cheers,
Mircea


