[jbosscache-dev] Critical CacheMarshaller issue

Jason T. Greene jason.greene at redhat.com
Mon Nov 12 12:21:18 EST 2007


Jason T. Greene wrote:
> Manik Surtani wrote:
>>
>> On 8 Nov 2007, at 03:00, Jason T. Greene wrote:
>>
>>> Manik Surtani wrote:
>>>> A nasty bug, spotted by someone in the user forum (initially as a CCE)
>>>>    http://jira.jboss.org/jira/browse/JBCACHE-1211
>>>> Copying from the JIRA:
>>>> "This is a nasty. What started life as an optimisation for certain 
>>>> types of objects in a marshalled stream (Fqn, GlobalTransactio, 
>>>> String and Serializable) has become a major limitation in that a 
>>>> single stream can only hold up to 32767 different (not equal()) 
>>>> instances of such objects.
>>>> Basically the optimisation was, for example, instead of writing 
>>>> "hello" to a stream twice, just write it once and use a reference 
>>>> for all subsequent times. Unfortunately this reference was encoded 
>>>> as a short, hence the limitation of 32767.
>>>> Fixing this will definitely break wire compatibility with JBoss 
>>>> Cache 2.0.0, although JBC does allow backward compatibility by 
>>>> specifying replication version in your configuration, thanks to the 
>>>> VersionAwareMarshaller. "
>>>> So I guess this mandates the need for a CacheMarshaller210.  The 
>>>> question is how do we fix this.  The obvious thing is to replace the 
>>>> short references with integers.  The 2 ^ 31 - 1 number of references 
>>>> this would allow should be plenty!  The drawback though, is larger 
>>>> streams.  4-byte refs instead of 2-byte refs can be an unnecessary 
>>>> overhead especially if objects aren't repeated much.
>>>
>>> I wouldn't worry too much about the extra bytes. However, you could 
>>> maintain backwards compatibility, and save the 2 bytes, by stealing 
>>> the sign bit on the short. If byte1 & 0x80 then read 3 more bytes, 
>>> else read only 1 more.
>>>
>>
>> Still wouldn't help if you needed a million Strings in a collection.  
>> :-)  
> 
> Sure it would, since you get the full positive rang of a signed int 
> (2^31 - 1). The only difference is that if its <= 32767 you write only 
> two bytes, and when it's greater you write an encoded int that can be 
> detected (only 4 bytes).
> 
> // Writing code
> if (n > 32767) {
>    n = n | 0x80000000;
>    writeInt(n);
> } else {
>    writeShort(n);
> }
> 
> // Reading code
> int n = readShort();
> if ((n & 0x80000000) != 0) {
>   // sign bit rolls off
>   n = n << 16  & readShort();
> }
> 

Correction (Forgot to account for java's retarded auto integer upconvert):

short s = in.readShort();
int n = s;
if ((s & 0x8000) != 0) {
    n = (s & 0x7FFF) << 16 | (in.readShort() & 0xFFFF);
}


-- 
Jason T. Greene
JBoss, a division of Red Hat



More information about the jbosscache-dev mailing list