[Repost] Re: A question about your HttpClient example and beyond

Frederic Bregier fredbregier at free.fr
Fri Sep 11 03:30:45 EDT 2009


Hi Jason,

Again, I can't answer everything, but I will start by answering some
points...

One of the main interests of the NIO model is the asynchronous part.
In your example, if I understand it correctly, you do something like this:
For all host/port
    connect
For all connections
    wait for the connection to finish, then send the request
For all connected channels
    wait for the answer to each request, in order
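In plain JDK terms, that lock-step flow looks roughly like the sketch below. It uses only java.util.concurrent; connect() and request() are hypothetical placeholders standing in for the real network calls, not Netty API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Stdlib-only sketch of the lock-step flow described above.
// connect()/request() are placeholders, not real network calls.
public class LockStepSketch {

    static Future<String> connect(ExecutorService pool, String host) {
        return pool.submit(() -> "conn:" + host);      // stand-in for a connect
    }

    static Future<String> request(ExecutorService pool, String conn) {
        return pool.submit(() -> conn + ":answer");    // stand-in for a request
    }

    static List<String> runAll(List<String> hosts) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // 1) start all connections
        List<Future<String>> connects = new ArrayList<>();
        for (String h : hosts) connects.add(connect(pool, h));

        // 2) wait for each connection to finish (in order), then send its request
        List<Future<String>> requests = new ArrayList<>();
        for (Future<String> c : connects) requests.add(request(pool, c.get()));

        // 3) wait for each answer, again in submission order
        List<String> answers = new ArrayList<>();
        for (Future<String> r : requests) answers.add(r.get());

        pool.shutdown();
        return answers;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(List.of("a", "b", "c")));
    }
}
```

Note how each loop blocks on the futures in order: the requests only start after the connects are collected, and answers are drained in submission order rather than completion order.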

So you are implementing something halfway between synchronous and
asynchronous.
I would suggest the following instead (using the full ChannelFuture
capability of Netty):

For all host/port
    connect and add a ChannelFutureListener that fires when the connection is
done

In the ChannelFutureListener
    send the request => each answer will be added (not necessarily in order)
to your ArrayList

Wait for the list to be full (n requests => n answers, or use a
CountDownLatch from the java.util.concurrent package)

Then connecting, sending the requests and receiving the answers all overlap
across the several requests.
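The steps above can be sketched with the plain JDK, using CompletableFuture as a stand-in for Netty's ChannelFuture with a listener attached (connectAsync/requestAsync are hypothetical placeholders, not Netty calls):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of the callback-driven flow: each "connect" triggers its "request"
// from a completion callback, and a single CountDownLatch is the only
// synchronisation point. connectAsync/requestAsync are placeholders.
public class CallbackSketch {

    static CompletableFuture<String> connectAsync(ExecutorService pool, String host) {
        return CompletableFuture.supplyAsync(() -> "conn:" + host, pool);
    }

    static CompletableFuture<String> requestAsync(ExecutorService pool, String conn) {
        return CompletableFuture.supplyAsync(() -> conn + ":answer", pool);
    }

    static List<String> runAll(List<String> hosts) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Queue<String> answers = new ConcurrentLinkedQueue<>();  // answers arrive in any order
        CountDownLatch done = new CountDownLatch(hosts.size());

        for (String h : hosts) {
            // the "ChannelFutureListener": runs when the connect completes,
            // immediately chaining into the request
            connectAsync(pool, h)
                .thenCompose(conn -> requestAsync(pool, conn))
                .thenAccept(answer -> { answers.add(answer); done.countDown(); });
        }

        done.await();   // one global wait, bounded only by the slowest remote host
        pool.shutdown();
        return new ArrayList<>(answers);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(List.of("a", "b", "c")));
    }
}
```

The important difference from the lock-step version: no thread ever waits for connection i before starting request j; the only barrier is the final latch.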

What you have done tends to be synchronous. Not completely, since the
connections do overlap with each other, but then you wait for all of them to
finish...
You can see it with this "picture":
you create n tasks (connect)
you wait until all n tasks are done (connected), so a global
synchronisation of all threads
then you create n tasks (request)
you wait until all n tasks are done (answered), so again a global
synchronisation

What I suggest is:
you create n tasks (connect); each one continues by sending its request
(no synchronisation)
you wait until all n tasks are done (connected and answered, in any
order), so only one global synchronisation, bounded by the slowest answer
from a remote host.

Of course, if you can avoid waiting for all answers and instead process
each answer as it arrives, then you can even avoid that global wait on
the slowest answer. But that depends on your business logic there...
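When the business logic does allow per-answer processing, an ExecutorCompletionService from the JDK illustrates the idea: answers are consumed in completion order, so no task ever waits on the slowest host. fetch() is a hypothetical placeholder for connect-plus-request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: handle answers in the order they finish instead of waiting for
// all of them. fetch() is a placeholder for the real connect + request.
public class CompletionOrderSketch {

    static String fetch(String host) throws InterruptedException {
        Thread.sleep(host.length() * 10L);   // simulate slower hosts
        return host + ":answer";
    }

    static List<String> runAll(List<String> hosts) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CompletionService<String> cs = new ExecutorCompletionService<>(pool);
        for (String h : hosts) cs.submit(() -> fetch(h));

        List<String> handled = new ArrayList<>();
        for (int i = 0; i < hosts.size(); i++) {
            String answer = cs.take().get();   // next *finished* answer, any order
            handled.add(answer);               // your per-answer business logic goes here
        }
        pool.shutdown();
        return handled;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(List.of("aaa", "b", "cc")));
    }
}
```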

Reusing connections is not really supported in Netty yet, but there are
handlers/code that allow reconnection (Trustin posted an example on the
mailing list a few days ago).

Now for the chunk part: yes, chunked transfer should be supported by any
HTTP server. The reason is that when a request is bigger than 8 KB, it is
supposed to be chunked.
However, Netty provides a handler (HttpChunkAggregator) that gives you
the full body (only the body is affected by chunking) in one
ChannelBuffer. This handler accumulates all chunks up to the last
one and passes the complete message on to the next handler.
It is obviously the simplest option for a standard program.
However, be careful about one thing: if you have 100 requests and each one
returns a 1 MB body, then you will have (at least) 100 MB in memory, since
all bodies are kept in memory until their chunks are fully decoded.
In my own work, I use the HTTP codec chunk by chunk, since my business
model lets me handle the data chunk by chunk, keeping memory usage as low as
possible.
But if that is not your case, just use the HttpChunkAggregator handler; it
works perfectly, and then you can ignore whether the answer was chunked or
not. The snoop example shows how to use it.
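As a rough, stdlib-only illustration of what the aggregator does (this is not Netty's actual HttpChunkAggregator code, just the accumulate-until-last-chunk idea, which also makes the memory cost above obvious):

```java
import java.io.ByteArrayOutputStream;
import java.util.List;

// Toy model of chunk aggregation: buffer every chunk's bytes in memory and
// hand the full body onward only once the last chunk has arrived. The
// buffer grows with the body size, which is the memory cost discussed above.
public class ChunkAggregatorSketch {
    private final ByteArrayOutputStream body = new ByteArrayOutputStream();

    /** Feed one chunk; returns the full body on the last chunk, else null. */
    public byte[] offer(byte[] chunk, boolean isLast) {
        body.write(chunk, 0, chunk.length);
        return isLast ? body.toByteArray() : null;
    }

    public static void main(String[] args) {
        ChunkAggregatorSketch agg = new ChunkAggregatorSketch();
        List<byte[]> chunks = List.of("Hel".getBytes(), "lo!".getBytes());
        byte[] full = null;
        for (int i = 0; i < chunks.size(); i++) {
            full = agg.offer(chunks.get(i), i == chunks.size() - 1);
        }
        System.out.println(new String(full));   // prints "Hello!"
    }
}
```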

HTH,
Frederic


J. Mi wrote:
> 
> Thanks to Frederic for the overview. It's very helpful for me.
> 
> I have come up with an approach to replace my multi-thread model with
> Netty's HttpClient. It's pretty much based on the snoop example. I
> just added 3 loops to achieve the concurrency (multiple http requests at
> the
> same time). The first loop was around the call to bootstrap.connect(new
> InetSocketAddress(host, port)). The second loop was waiting for each
> connection attempt to succeed and then send the request. The third loop
> was
> using the handler to retrieve each HTTP response via a
> LinkedBlockingQueue. I used ArrayLists to maintain a list of ChannelFutures,
> a list of Channels and a list of HttpResponseHandlers across these 3 loops.
> 
> Everything worked well for me with this approach. However, my test results
> didn't show this approach out-performing my multi-thread model, i.e.
> one thread (java.util.concurrent) per HTTP request, done with
> Apache Commons HttpClient (a synchronous model). Performance was measured
> by timing the total time spent making n HTTP requests and retrieving
> these n HTTP responses end-to-end.
> 
> With fewer than 50 requests, the multi-thread model performed a little
> better. I was hoping Netty's approach would catch up with better scaling,
> because I was concerned the current multi-thread model may not scale well
> when getting hundreds of requests at the same time. But I still failed to
> observe any performance gain relative to the multi-thread model when
> serving 50, 100, 200...800 concurrent requests.
> 
> One thing I need to understand better (Frederic already touched on some
> basics here) is connection management. Apache Commons HttpClient seems to
> manage connections with possible reuse. I'm not exactly sure how Netty
> does that.
> 
> One more question about Netty's HttpClient. In its
> HttpResponseHandler.java, the messageReceived() method only receives a
> portion of the response at a time and depends on the server responding
> with a "chunked" Transfer-Encoding header and content to detect the end of
> the response. This raises 2 questions: (1) is there a way to receive the
> response in one shot, like Apache's HttpClient; and (2) are all HTTP
> servers required to respond with "chunked" content? In my case, I need to
> retrieve online responses from different web sites.
> 
> Cheers,
> Jason
> 
> 
> 
> On Thu, Sep 10, 2009 at 6:45 AM, Frederic Bregier
> <fredbregier at free.fr> wrote:
> 
>>
>> Hi,
>>
>> I will not talk about the specific HTTP part of Netty but about its main
>> interest, the NIO core of Netty.
>> Of course, Trustin or others can be more precise than I am. This is just
>> my general understanding (I'm neither a NIO expert nor a Netty expert, so
>> it is just my comprehension as an end user).
>>
>> Compared to standard blocking IO, Netty uses fewer threads to achieve the
>> same behaviour.
>> For instance, if you think about Apache or Tomcat, one connection will be
>> handled by at least one thread through the full life of the connection.
>> So if you have 1000 connections, you will have at least 1000 threads.
>> In Netty, a thread becomes active when data arrives at the server (the
>> general idea is greatly simplified here, don't take it as the exact
>> reality). For instance, of those 1000 connections, maybe at most 100 are
>> really sending something to the server at the same time, so around 100
>> threads will be used. Netty effectively reuses threads, regardless of
>> which connection the data belongs to.
>>
>> Another point, of course, is the non-blocking style. Once you send
>> something, you can continue working without waiting for the data to be
>> actually sent (of course, you have to take care of this, for instance
>> before closing the channel). So you can overlap sending data with other
>> computations (for instance, preparing the next packet to be sent).
>> With blocking IO, by contrast, you wait for the data to be really
>> sent (or at least buffered).
>>
>> So on many points, the Netty approach should out-perform blocking IO. I
>> say "should" since there are some counter-examples where blocking IO is
>> faster, because NIO introduces some extra computation compared to
>> blocking IO. Most of the time, this extra cost is masked by Netty's
>> implementation and NIO ends up quicker than blocking IO, but I do recall
>> some counter-examples.
>>
>> Also, Netty supports different kinds of transport (nio, oio, ...), so the
>> behaviour can differ depending on which low-level network transport
>> framework is used.
>>
>> This is not the full picture of Netty, but it is a start of an answer to
>> your question.
>> For more information, other people can continue this thread (or correct
>> me where I am wrong), and of course you can read the examples that come
>> with Netty (even those not about HTTP) and the Netty documentation.
>>
>> HTH,
>> Cheers,
>> Frederic
>>
>> J. Mi wrote:
>> >
>> > Hi all,
>> >
>> > I guess my fundamental question here is whether, in theory at least,
>> > Netty provides a better asynchronous mechanism than the
>> > java.util.concurrent.* package in terms of performance. Does Netty
>> > internally use multi-threading, java.nio, both, or neither?
>> >
>> > If Netty does perform better than java.util.concurrent.*, is there any
>> > example or tutorial that can guide me a little in replacing my current
>> > multi-threading process, which I described in that previous email?
>> >
>> > Many thanks to you for sharing your expertise,
>> > Jason
>> >
>> > On Wed, Sep 2, 2009 at 12:11 PM, J. Mi <jmi258 at gmail.com> wrote:
>> >
>> >> Hi folks,
>> >>
>> >> Currently, my application's process flow logic is like this:
>> >>
>> >> -> A controlling process receives one request for data which will be
>> >> fetched from multiple online sources.
>> >> -> The controlling process spawns multiple threads. Each of these
>> >> threads will (1) use the synchronous Apache Commons HttpClient to fetch
>> >> the data; (2) parse the data; and (3) return the data to the
>> >> controlling process.
>> >> -> The controlling process joins all threads and returns the combined
>> >> data to the requestor.
>> >>
>> >> So basically, each thread uses a synchronous HttpClient to fetch the
>> >> data and then parses it.
>> >>
>> >> In reading the org.jboss.netty.example.http.snoop package, I have the
>> >> following question:
>> >> If I just replace Apache's synchronous HttpClient with Netty's
>> >> org.jboss.netty.handler.codec.http.* as the example does, will I
>> >> benefit performance-wise? I heard that blocking I/O hurts
>> >> multi-threading. If so, would Netty's package work better for me?
>> >>
>> >> Or should I actually get rid of the existing multi-threading by using
>> >> Netty's framework? If so, which of your examples would be the best
>> >> reference for my purpose?
>> >>
>> >> Thanks in advance,
>> >> Jason
>> >>
>> >>
>> >>
>> >
>> > _______________________________________________
>> > netty-dev mailing list
>> > netty-dev at lists.jboss.org
>> > https://lists.jboss.org/mailman/listinfo/netty-dev
>> >
>> >
>>
>>
>> -----
>> Hardware/Software Architect
>> --
>> View this message in context:
>> http://n2.nabble.com/A-question-about-your-HttpClient-example-and-beyond-tp3568879p3617420.html
>> Sent from the Netty Developer Group mailing list archive at Nabble.com.
> 


-----
Hardware/Software Architect
-- 
View this message in context: http://n2.nabble.com/A-question-about-your-HttpClient-example-and-beyond-tp3568879p3624150.html
Sent from the Netty Developer Group mailing list archive at Nabble.com.

