How to properly handle decoder chain in netty?

Christian Migowski chrismfwrd at gmail.com
Wed Jun 3 15:02:42 EDT 2009


just a short remark:

tcp is all about guaranteed delivery, so although it is theoretically
possible for the content of tcp packets to be changed during
transmission, it is very unlikely to happen in practice because each
tcp segment is sequence-numbered and acknowledged ON THE TCP level. It
can be turned off, but normally tcp packets are also checksummed, so
undetected corruption is even more unlikely.
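
For reference, here is a minimal Java sketch of the 16-bit one's-complement checksum (RFC 1071) that tcp is built on. The class and method names are made up for illustration. The real tcp checksum also covers a pseudo-header and the tcp header itself, which this sketch leaves out.

```java
public class InternetChecksum {
    /** 16-bit one's-complement checksum over the given bytes (RFC 1071 style). */
    public static int checksum(byte[] data) {
        long sum = 0;
        int i = 0;
        // Add the data up as big-endian 16-bit words.
        while (i + 1 < data.length) {
            sum += ((data[i] & 0xFF) << 8) | (data[i + 1] & 0xFF);
            i += 2;
        }
        // An odd trailing byte is padded with a zero byte on the right.
        if (i < data.length) {
            sum += (data[i] & 0xFF) << 8;
        }
        // Fold the carries back into the low 16 bits.
        while ((sum >> 16) != 0) {
            sum = (sum & 0xFFFF) + (sum >> 16);
        }
        // The checksum is the one's complement of the folded sum.
        return (int) (~sum & 0xFFFF);
    }
}
```

Because this is only a 16-bit sum, some changes leave it unchanged (for example two 16-bit words swapping places). It makes undetected corruption unlikely rather than impossible.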

Frederic, your assumption about http and ftp transfers is not correct.
ftp is also based on tcp and does not checksum the data it transfers,
so this is easy to see. The difference
between http and ftp transfer of huge files is that most ftp server
implementations support continuation of aborted transfers, i.e. if the
connection aborts you don't have to start the whole transfer over
again, whereas most http servers do not support transfer continuation
and just start over again.

If you use tcp you can be reasonably sure that the data you transfer
is correct and in the same sequence the remote peer sent it. Except
for things where higher security is mandatory (like doing financial
transactions) you normally don't have to do extra checksum checks with
tcp (of course you are free to do so).
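
If you do want the extra check, a per-chunk verification could look roughly like the sketch below. It uses the JDK's `MessageDigest`. The class name `ChunkCheck`, the `accept` method, and the idea of passing an expected index are assumptions for illustration, not Jiang's actual protocol.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ChunkCheck {
    /** MD5 digest of one chunk payload. */
    public static byte[] md5(byte[] payload) {
        try {
            return MessageDigest.getInstance("MD5").digest(payload);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /**
     * Accepts a chunk only if it carries the index the receiver expects
     * next and its payload matches the digest the sender claimed.
     */
    public static boolean accept(int expectedIndex, int index,
                                 byte[] payload, byte[] claimedMd5) {
        return index == expectedIndex
                && MessageDigest.isEqual(md5(payload), claimedMd5);
    }
}
```

A receiver would ask for a resend of the chunk whenever `accept` returns false and advance its expected index otherwise.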

I still think there is something wrong with Jiang's decoder
when more chunks are sent. You can do a network trace on the server
and see what was received; if it looks correct, there is a problem with
the server.
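
One common cause of "corruption" that only shows up with many chunks (an assumption here, I have not seen Jiang's code) is a decoder that treats each read as exactly one chunk. tcp delivers a byte stream, so a chunk may arrive split across reads, or several chunks may arrive in one read. The sketch below is plain Java, not Netty API, and assumes a hypothetical wire format of a 4-byte big-endian length followed by the payload. It buffers bytes until a whole chunk is available.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class ChunkAccumulator {
    // Bytes received so far that do not yet form a complete chunk.
    private byte[] pending = new byte[0];

    /** Feed the bytes of one read; returns every chunk that is now complete. */
    public List<byte[]> feed(byte[] received) {
        // Join the leftover bytes with the newly received ones.
        byte[] joined = new byte[pending.length + received.length];
        System.arraycopy(pending, 0, joined, 0, pending.length);
        System.arraycopy(received, 0, joined, pending.length, received.length);

        ByteBuffer buf = ByteBuffer.wrap(joined); // big-endian by default
        List<byte[]> chunks = new ArrayList<>();
        while (buf.remaining() >= 4) {
            buf.mark();
            int length = buf.getInt();
            if (length < 0) {
                throw new IllegalStateException("negative chunk length");
            }
            if (buf.remaining() < length) {
                buf.reset(); // payload incomplete: rewind to the length field
                break;
            }
            byte[] chunk = new byte[length];
            buf.get(chunk);
            chunks.add(chunk);
        }
        // Keep whatever is left for the next read.
        pending = new byte[buf.remaining()];
        buf.get(pending);
        return chunks;
    }
}
```

If a network trace shows the right bytes on the wire but the decoded chunks are wrong, comparing the decoder against this kind of accumulate-then-decode logic is a reasonable first check.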


regards,
christian


On Wed, Jun 3, 2009 at 8:13 PM, Frederic Bregier <fredbregier at free.fr> wrote:
>
> Jiang,
>
> Happy to help someone there... ;-)
>
> Just to clarify what I have in mind:
> If you do have such an MD5, do you use it on the full file or only per chunk?
> If it is per chunk, then only the failing chunk has to be resent.
> However, then, you will perhaps have a protocol that will do something like:
> - client sends 1 chunk with its MD5 and its rank
> - server acquires the chunk, verifies the rank and the md5 of the chunk
> - if something goes wrong (bad rank, bad md5), ask the client to resend the
> last chunk
> - if ok, then just ack the chunk
> The problem is that your send will then be somewhat "synchronous", since each
> chunk has to be validated by the server. The good part is that resending the
> last bad chunk is easy. The bad part is that you will depend on network
> latency (send a chunk, then wait for the acknowledgement from the server
> before sending the next chunk). But you will have a very secure file
> transfer then...
> To have fewer acknowledgements (depending on the chunk size), you could
> perhaps acknowledge chunks in batches (for instance every 10 chunks, except if
> one goes wrong, in which case you immediately send the acknowledgement for the
> good ones and a negative acknowledgement for the bad one, restarting the
> transfer from this chunk).
>
> In fact, I'm full of ideas on this since I'm currently writing such a file
> transfer monitor for production on "secure" IT... In my case, efficiency is
> not the most important thing; security and restarting without resending the
> full file are what matter most.
>
> HTH,
> Frederic
>
> Jiang Bian wrote:
>>
>> Frederic -
>>
>> Always appreciate your quick response.
>>
>> Yes, I do have those features implemented already: the md5 checksum (it
>> might be overkill, but I just don't want to worry about it), the index
>> number of each chunk, etc.
>>
>> The protocol works, and the file did go through eventually (of course the
>> server will request to send the corrupted data again). I just feel annoyed
>> by the corruption. Now, it seems that I have to live with it.
>>
>> Thanks again for your help!
>>
>> Jiang
>>
>>
>> Frederic Bregier wrote:
>>>
>>> Hi Jiang,
>>>
>>> I think that corrupted data on a huge transfer is probably normal.
>>>
>>> My view is based on the following.
>>> When you want to download a huge file (say an ISO file of your favorite
>>> linux distribution, about 4 GB), you often have three links:
>>> - one HTTP download link with a "warning" that says "you might
>>> prefer to use the FTP protocol since HTTP download can corrupt data"
>>> - one FTP download link
>>> - one link to an MD5 file with a "warning" saying "for such a huge file, you
>>> should verify the correctness of your download with the following MD5
>>> key"
>>>
>>> So based on this example, my guess is that sending a huge file (by chunk or
>>> not) can lead to some bad transfers that TCP/IP (or even more so UDP) cannot
>>> address.
>>> That's why most "financial" file transfer software integrates a
>>> checksum during the transfer to validate each file (or even each chunk).
>>>
>>> My suggestion could be the following (but take it only as a suggestion):
>>> you could perhaps integrate something like a checksum in your protocol,
>>> in one of two ways:
>>> - either at the end of the transfer, you send the checksum (MD5 for
>>> instance) and compare it on the target host (each checksum probably being
>>> computed on both sides during the transfer, to avoid reading the file
>>> again)
>>> - or for each chunk, you also send a checksum of this chunk to be
>>> compared on the remote host
>>>
>>> You can even include a count number that orders the chunks (for instance
>>> to see if there is an error in the transmission where a chunk number 1024
>>> occurs after a chunk number 1021, meaning the chunks with numbers 1022
>>> and 1023 are missing and to enable the restart from chunk 1022).
>>>
>>> But don't go too fast, because including such behaviour in your protocol
>>> will greatly increase the computation (checksum, chunk ordering, retries,
>>> ...) and so decrease the efficiency. With such a problem, it is always a
>>> trade-off between efficiency and security, depending on your final goals...
>>>
>>> HTH,
>>> Frederic
>>>
>>>
>>
>>
>
>
> -----
> Hardware/Software Architect