On Mon, May 23, 2011 at 8:05 PM, Sanne Grinovero
<sanne.grinovero(a)gmail.com> wrote:
2011/5/23 "이희승 (Trustin Lee)" <trustin(a)gmail.com>:
> On 05/23/2011 07:40 PM, Sanne Grinovero wrote:
>> 2011/5/23 Dan Berindei<dan.berindei(a)gmail.com>:
>>> On Mon, May 23, 2011 at 7:04 AM, "이희승 (Trustin Lee)"<trustin(a)gmail.com> wrote:
>>>> On 05/20/2011 03:54 PM, Manik Surtani wrote:
>>>>> Is spanning rows the only real solution? As you say it would mandate
>>>>> using transactions to keep multiple rows coherent, and I'm not sure if
>>>>> everyone would want to enable transactions for this.
>>>>
>>>> There are more hidden overheads. To update a value, the cache store
>>>> must determine how many chunks already exist in the cache store and
>>>> selectively delete and update them. To simplify aggressively, we could
>>>> delete all chunks and insert new chunks. Both come at the cost of great
>>>> overhead.
>>
>> I see no alternative to deleting all values for each key, as we don't
>> know which part of the byte array is dirty.
>> Which overhead are you referring to? We would still store the same
>> amount of data, split or not split, but yes, multiple statements might
>> require clever batching.
>>
>>>>
>>>> Even MySQL supports a BLOB up to 4GiB, so I think it's better to update
>>>> the schema?
>>
>> You mean by accommodating the column size only, or adding the chunk_id?
>> I'm just asking, but all of your and Dan's feedback has already
>> persuaded me that my initial idea of providing chunking should be
>> avoided.
>
> I mean the user updating the column type of the schema.
>
>>>
>>> +1
>>>
>>> BLOBs are only stored in external storage if the actual data can't fit
>>> in a normal table row, so the only penalty in using a LONGBLOB
>>> compared to a VARBINARY(255) is 3 extra bytes for the length.
>>>
>>> If the user really wants to use a data type with a smaller max length,
>>> we can just report an error when the data column size is too small. We
>>> will need to check the length and throw an exception ourselves though,
>>> with MySQL we can't be sure that it is configured to raise errors when
>>> a value is truncated.
>>
>> +1
>> it might be better to just check for the maximum size of stored values
>> to fit in "something"; I'm not sure if we can guess the proper size
>> from database metadata: not only the column maximum size is involved,
>> but MySQL (to keep it as reference example, but might apply to others)
>> also has a default maximum packet size for the connections which is
>> not very big, when using it with Infinispan I always had to
>> reconfigure the database server.
>>
>> Also, as BLOBs make very poor primary keys, people might want to use a
>> limited and well-known byte size for their keys.
>>
>> So, shall we just add a method that checks we have not surpassed a
>> user-defined threshold, checking both key and value but against different
>> configurable sizes? Should an exception be raised in that case?
>
> An exception will be raised by the JDBC driver if the key doesn't fit into
> the key column, so we could simply wrap it?
If that always happens, then I wouldn't wrap it. Entering the business
of wrapping driver-specific exceptions is very tricky ;)
I was more concerned about the fact that some databases might not raise
any exception? Not sure if that's the case, and possibly not our
problem.
By default MySQL only gives a warning if the value is truncated. We
could throw an exception every time we get a warning from the DB, but
the wrong value has already been inserted in the DB and if the key was
truncated then we don't even have enough information to delete it.
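
To make the "check ourselves" option concrete, here is a minimal sketch of a
guard that runs before the INSERT/UPDATE is issued, so nothing truncated ever
reaches the table. SizeGuard and its two limits are hypothetical names for
illustration only, not existing Infinispan configuration, and the limits are
assumed to be set by the user to match their column definitions.

```java
// Hypothetical pre-write guard: rejects oversized keys/values before any
// statement is sent, instead of relying on the database to complain.
final class SizeGuard {
    private final int maxKeyBytes;   // assumed user-configured key column size
    private final int maxValueBytes; // assumed user-configured data column size

    SizeGuard(int maxKeyBytes, int maxValueBytes) {
        if (maxKeyBytes <= 0 || maxValueBytes <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.maxKeyBytes = maxKeyBytes;
        this.maxValueBytes = maxValueBytes;
    }

    // Throws if either the serialized key or the serialized value is too large.
    void check(byte[] key, byte[] value) {
        if (key.length > maxKeyBytes) {
            throw new IllegalArgumentException("key is " + key.length
                    + " bytes, limit is " + maxKeyBytes);
        }
        if (value.length > maxValueBytes) {
            throw new IllegalArgumentException("value is " + value.length
                    + " bytes, limit is " + maxValueBytes);
        }
    }
}
```

Because the check happens on the serialized bytes before the write, a failure
leaves the table untouched, which sidesteps the "can't delete a truncated key"
problem described above.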
A better option to avoid checking ourselves may be to check on startup
if STRICT_ALL_TABLES
(http://dev.mysql.com/doc/refman/5.5/en/server-sql-mode.html#sqlmode_stric...)
is enabled with SELECT @@SESSION.sql_mode in the MySQL implementation
and refuse to start if it's not. There is another STRICT_TRANS_TABLES
mode, but I don't know how to find out if a table is transactional or
not...
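
A rough sketch of that startup check, assuming plain JDBC access to a
connection. StrictModeCheck and its method names are made up for illustration;
only the SELECT @@SESSION.sql_mode query and the STRICT_ALL_TABLES mode name
come from MySQL. It deliberately ignores STRICT_TRANS_TABLES, since deciding
whether that mode is sufficient would require knowing the table's engine.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;

// Hypothetical startup check for a MySQL-backed store.
final class StrictModeCheck {

    // sql_mode is a comma-separated list of mode names.
    static boolean isStrictAllTables(String sqlMode) {
        if (sqlMode == null) {
            return false;
        }
        return Arrays.stream(sqlMode.split(","))
                .map(String::trim)
                .anyMatch("STRICT_ALL_TABLES"::equalsIgnoreCase);
    }

    // Refuses to continue if the session would silently truncate values.
    static void requireStrictMode(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement();
             ResultSet rs = statement.executeQuery("SELECT @@SESSION.sql_mode")) {
            String mode = rs.next() ? rs.getString(1) : null;
            if (!isStrictAllTables(mode)) {
                throw new IllegalStateException(
                        "STRICT_ALL_TABLES is not enabled, sql_mode=" + mode);
            }
        }
    }
}
```

The parsing is split from the JDBC call so the mode test can be exercised
without a running server.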
Cheers
Dan
Sanne
>
>>
>> Cheers,
>> Sanne
>>
>>>
>>> Cheers
>>> Dan
>>>
>>>
>>>>> On 19 May 2011, at 19:06, Sanne Grinovero wrote:
>>>>>
>>>>>> As mentioned on the user forum [1], people setting up a JDBC
>>>>>> cacheloader need to be able to define the size of columns to be used.
>>>>>> The Lucene Directory has a feature to autonomously chunk the segment
>>>>>> contents at a configurable specified byte number, and so has the
>>>>>> GridFS; still there are other metadata objects which Lucene currently
>>>>>> doesn't chunk as it's "fairly small" (but undefined and possibly
>>>>>> growing), and in a more general sense anybody using the JDBC
>>>>>> cacheloader would face the same problem: what's the dimension I need
>>>>>> to use?
>>>>>>
>>>>>> While in most cases the maximum size can be estimated, this is still
>>>>>> not good enough, as when you're wrong the byte array might get
>>>>>> truncated, so I think the CacheLoader should take care of this.
>>>>>>
>>>>>> What would you think of:
>>>>>> - adding a max_chunk_size option to JdbcStringBasedCacheStoreConfig
>>>>>> and JdbcBinaryCacheStore
>>>>>> - have them store in multiple rows the values which would be bigger
>>>>>> than max_chunk_size
>>>>>> - this will need transactions, which are currently not being used by
>>>>>> the cacheloaders
>>>>>>
>>>>>> It looks to me like only the JDBC cacheloader has these issues,
>>>>>> as the other stores I'm aware of are more "blob oriented". Could it be
>>>>>> worth building this abstraction at a higher level instead of in the
>>>>>> JDBC cacheloader?
>>>>>>
>>>>>> Cheers,
>>>>>> Sanne
>>>>>>
>>>>>> [1] - http://community.jboss.org/thread/166760
>>>>>> _______________________________________________
>>>>>> infinispan-dev mailing list
>>>>>> infinispan-dev(a)lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>>
>>>>> --
>>>>> Manik Surtani
>>>>> manik(a)jboss.org
>>>>> twitter.com/maniksurtani
>>>>>
>>>>> Lead, Infinispan
>>>>> http://www.infinispan.org
>>>>>
>>>>
>>>>
>>>> --
>>>> Trustin Lee, http://gleamynode.net/
>>>>
>>>
>>
>
>
> --
> Trustin Lee, http://gleamynode.net/