How to properly handle decoder chain in netty?

Frederic Bregier fredbregier at free.fr
Wed Jun 3 14:13:44 EDT 2009


Jiang,

Happy to help someone there... ;-)

Just to clarify what I have in mind:
If you do have such an MD5, do you compute it on the full file or per chunk?
If per chunk, then only the failing chunk has to be resent.
In that case, though, you will probably end up with a protocol like this:
- client sends 1 chunk with its MD5 and its rank
- server acquires the chunk, verifies the rank and the MD5 of the chunk
- if something goes wrong (bad rank, bad MD5), ask the client to resend the
last chunk
- if ok, then just ack the chunk
The problem is that your send then becomes somewhat "synchronous", since each
chunk has to be validated by the server. The good part is that resending the
last bad chunk is easy. The bad part is that throughput depends on network
latency (send a chunk, then wait for the server's acknowledgement before
sending the next one). But you will have a very secure file transfer
then...
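
To make the chunk-by-chunk check concrete, here is a minimal Java sketch of
the server side, assuming each chunk arrives as a (rank, payload, MD5)
triple. The names ChunkReceiver, Reply and onChunk are made up for
illustration and are not part of Netty; in a real pipeline this logic would
probably sit in a handler behind the frame decoder.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical server-side check for one chunk: rank and MD5 must both match. */
final class ChunkReceiver {
    enum Reply { ACK, RESEND }

    // Rank of the next chunk the server is willing to accept.
    private long expectedRank = 0;

    /** Computes the MD5 digest of a chunk payload. */
    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Verifies one chunk and tells the caller what to answer to the client. */
    Reply onChunk(long rank, byte[] payload, byte[] claimedMd5) {
        boolean rankOk = rank == expectedRank;
        boolean md5Ok = MessageDigest.isEqual(md5(payload), claimedMd5);
        if (!rankOk || !md5Ok) {
            // Nothing is stored; the client must resend chunk number expectedRank.
            return Reply.RESEND;
        }
        expectedRank++;
        return Reply.ACK;
    }

    long expectedRank() {
        return expectedRank;
    }
}
```

The client would wait for ACK before sending the next chunk, which is
exactly where the latency cost comes from.
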
To send fewer acknowledgements (depending on the chunk size), you could
acknowledge chunks in batches (for instance every 10 chunks, except when
one goes wrong, in which case you immediately send the acknowledgement for
the good ones and a negative acknowledgement for the bad one, restarting the
transfer from that chunk).
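
The batched variant could look roughly like the following Java sketch. It
only decides which acknowledgements to send; the per-chunk checksum result
is passed in as a boolean. The class name BatchAcker and the "ACK n" /
"NACK n" message strings are assumptions for illustration, not an existing
wire format.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical receiver-side logic: one ACK per batch, immediate NACK on error. */
final class BatchAcker {
    private final int batchSize;
    private long nextRank = 0;   // rank of the next chunk expected
    private long lastAcked = -1; // highest rank already acknowledged

    BatchAcker(int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
    }

    /** Returns the messages to send back after receiving chunk {@code rank}. */
    List<String> onChunk(long rank, boolean checksumOk) {
        List<String> out = new ArrayList<>();
        if (rank != nextRank || !checksumOk) {
            // Flush the ACK for the good chunks received so far, if any are pending.
            if (nextRank - 1 > lastAcked) {
                out.add("ACK " + (nextRank - 1));
                lastAcked = nextRank - 1;
            }
            // Ask the client to restart the transfer from the first missing chunk.
            out.add("NACK " + nextRank);
            return out;
        }
        nextRank++;
        if (rank - lastAcked >= batchSize) {
            out.add("ACK " + rank);
            lastAcked = rank;
        }
        return out;
    }
}
```

A real protocol would also need a final flush at end of file, so that a last
partial batch still gets acknowledged.
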

In fact, I'm full of ideas on this since I'm currently writing such a file
transfer monitor for production in "secure" IT... In my case, efficiency is
not the priority; security and restarting without resending the full file
matter most.

HTH,
Frederic

Jiang Bian wrote:
> 
> Frederic -
> 
> Always appreciate your quick response.
> 
> Yes, I do have those features implemented already: the MD5 checksum (it
> might be overkill, but I just don't want to worry about it), the index
> number of each chunk, etc.
> 
> The protocol works, and the file did go through eventually (of course the
> server will request to send the corrupted data again). I just feel annoyed
> by the corruption. Now, it seems that I have to live with it.
> 
> Thanks again for your help!
> 
> Jiang
> 
> 
> Frederic Bregier wrote:
>> 
>> Hi Jiang,
>> 
>> I think that corrupted data on huge transfers is probably normal.
>> 
>> My view is based on the following.
>> When you want to download a huge file (say an ISO file of your favorite
>> Linux distribution, about 4 GB), you are often given three links:
>> - an HTTP download link with a "warning" that says "you might prefer to
>> use the FTP protocol since HTTP downloads can corrupt data"
>> - an FTP download link
>> - a link to an MD5 file with a "warning" saying "for such a huge file, you
>> should verify the correctness of your download with the following MD5
>> key"
>> 
>> So based on this example, my guess is that sending a huge file (by chunk
>> or not) can lead to transfer errors that TCP/IP (and even more so UDP)
>> cannot catch.
>> That's why most "financial" file transfer software integrates a checksum
>> during the transfer to validate each file (or even each chunk).
>> 
>> My suggestion would be the following (but take it only as a suggestion):
>> you could integrate something like a checksum into your protocol, in one
>> of two ways:
>> - either at the end of the transfer, you send the checksum (MD5 for
>> instance) and compare it on the target host (each side probably computing
>> its checksum during the transfer, to avoid reading the file a second
>> time)
>> - or for each chunk, you also send a checksum of that chunk, to be
>> compared on the remote host
>> 
>> You can even include a counter that orders the chunks (for instance to
>> detect a transmission error where chunk number 1024 arrives after chunk
>> number 1021, meaning chunks 1022 and 1023 are missing, and to enable a
>> restart from chunk 1022).
>> 
>> But don't go too fast, because including such behaviour in your protocol
>> will greatly increase the computation (checksum, chunk ordering, retry,
>> ...) and so decrease the efficiency. With such problems, it is always a
>> trade-off between efficiency and security, depending on your final
>> goals...
>> 
>> HTH,
>> Frederic
>> 
>> 
> 
> 


-----
Hardware/Software Architect
-- 
View this message in context: http://n2.nabble.com/How-to-properly-handle-decoder-chain-in-netty--tp3015408p3020051.html
Sent from the Netty User Group mailing list archive at Nabble.com.



