<div>Hi Frederic,</div>
<div> </div>
<div>Your suggestions were all correct for my application. Thanks so much for the help.</div>
<div>Following your suggestion, I was able to maximize concurrency by adding a ChannelFutureListener to the connect operation. In its operationComplete(), the channel sends an HTTP request and counts down a CountDownLatch, since I do need to 'join' all responses.</div>
<div> </div>
<div>So now I use 3 loops (vs. 4 previously), plus more concurrency, like this:</div>
<div> </div>
<div>- Setup the first CountDownLatch</div>
<div>- Loop 1: connect n times, with a listener for each connect. In the listener's operationComplete(), the request is sent, the handler is added to the handler list, and the CountDownLatch is counted down.</div>
<div>- wait on the first CountDownLatch</div>
<div>- Loop 2: use the handler list to retrieve each response.</div>
<div>
<div>- Setup the second CountDownLatch</div>
<div>- Loop 3: call channel.getCloseFuture().addListener() on each channel. In the listener's operationComplete(), simply count down the second CountDownLatch</div>
<div>
<div>- wait on the second CountDownLatch</div>
<div>- call bootstrap.releaseExternalResources()</div>
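<div>For reference, here is a minimal, self-contained sketch of that coordination pattern using plain java.util.concurrent. An ExecutorService stands in for Netty's I/O worker threads, and names like LatchFlowSketch and the fake "response-N" payloads are made up for illustration; this is not the actual Netty API, just the latch flow:</div>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class LatchFlowSketch {

    public static List<String> run(int n) throws InterruptedException {
        // Stands in for Netty's worker thread pool.
        ExecutorService ioThreads = Executors.newFixedThreadPool(4);

        // First latch: counted down by each "connect" listener.
        CountDownLatch connectLatch = new CountDownLatch(n);
        List<BlockingQueue<String>> handlers =
                Collections.synchronizedList(new ArrayList<BlockingQueue<String>>());

        // Loop 1: each "connect" completes asynchronously; its callback
        // "sends the request", registers a handler, and counts down.
        for (int i = 0; i < n; i++) {
            final int id = i;
            ioThreads.submit(() -> {
                BlockingQueue<String> handler = new LinkedBlockingQueue<>();
                handler.add("response-" + id); // pretend the response arrives here
                handlers.add(handler);         // note: not necessarily in order
                connectLatch.countDown();      // like operationComplete()
            });
        }
        connectLatch.await(); // wait on the first latch

        // Loop 2: retrieve each response from its handler.
        List<String> responses = new ArrayList<>();
        for (BlockingQueue<String> handler : handlers) {
            responses.add(handler.take());
        }

        // Loop 3: "close" listeners count down a second latch.
        CountDownLatch closeLatch = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            ioThreads.submit(closeLatch::countDown);
        }
        closeLatch.await();   // wait on the second latch
        ioThreads.shutdown(); // stands in for bootstrap.releaseExternalResources()
        return responses;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(5).size()); // prints 5
    }
}
```

<div>The point of the shape is that nothing blocks until the single await() per phase, so connects and requests overlap freely, which is the concurrency gain over the older 4-loop version.</div>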
<div> </div>
<div>I think I'm now observing better performance than before, but only beyond a threshold. In my case, when serving fewer than 100 requests, multi-threading plus the synchronous Apache HttpClient does better. At 100 requests, they break even. At 200 and 300 requests, Netty does better. My VMware Workstation 6 with CentOS 5.3 and 5 GB of memory on a 4-core desktop cannot handle more than 300 requests in my simple test application.</div>
<div> </div>
<div>Next, I'm going to work on HttpChunkAggregator as you suggested. I'm not sure what else I need to do other than uncommenting that line in the snoop example. I'll look into it.</div>
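<div>In case it helps, this is roughly what I understand the client pipeline to look like with the aggregator line uncommented (a configuration sketch against the Netty 3.x API as used in the snoop example; the 1 MB limit is just an illustrative choice, and HttpResponseHandler is the example's own handler class):</div>

```java
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.ChannelPipelineFactory;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.handler.codec.http.HttpChunkAggregator;
import org.jboss.netty.handler.codec.http.HttpRequestEncoder;
import org.jboss.netty.handler.codec.http.HttpResponseDecoder;

public class HttpClientPipelineFactory implements ChannelPipelineFactory {
    public ChannelPipeline getPipeline() {
        ChannelPipeline pipeline = Channels.pipeline();
        pipeline.addLast("decoder", new HttpResponseDecoder());
        // Accumulates HttpChunks and hands one complete HttpResponse (with the
        // full body) to the next handler; the argument caps the body size in bytes.
        pipeline.addLast("aggregator", new HttpChunkAggregator(1048576));
        pipeline.addLast("encoder", new HttpRequestEncoder());
        pipeline.addLast("handler", new HttpResponseHandler());
        return pipeline;
    }
}
```

<div>With the aggregator in place, messageReceived() should then see a single HttpResponse whether or not the server chunked the transfer.</div>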
<div> </div>
<div>Again, thanks so much for guiding me, a newbie, through this framework.</div>
<div> </div>
<div>Jason</div></div>
<div> </div></div>
<div class="gmail_quote">On Fri, Sep 11, 2009 at 12:30 AM, Frederic Bregier <span dir="ltr"><<a href="mailto:fredbregier@free.fr">fredbregier@free.fr</a>></span> wrote:<br>
<blockquote style="BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex; PADDING-LEFT: 1ex" class="gmail_quote"><br>Hi Jason,<br><br>Again, I don't feel able to answer everything, but I will start by answering<br>some...<br>
<br>One of the interests of the NIO model is the asynchronous part.<br>In your example, if I get it correctly, you do something like this:<br>For all host/port<br> connect<br>For all connect<br> wait for the connection to finish and send the request<br>
For all connected<br> wait for the answer to each request, in order<br><br>Then you are implementing something in between synchronous and<br>asynchronous.<br>I would suggest the following idea (using the full ChannelFuture capability of<br>
Netty):<br><br>For all host/port<br> connect and add a ChannelFutureListener on connection completion<br><br>In the ChannelFutureListener<br> send the request => each request will add (not necessarily in order) the<br>
result in your ArrayList<br><br>Wait for the list to be full (n requests => n answers, or use a countdown<br>from the concurrent package)<br><br>Then you can have connecting/sending/receiving all<br>overlapping across several requests.<br>
<br>What you have done tends to be synchronous, though not completely, since you<br>overlap connections between all channels, but you are still waiting until all<br>are done...<br>You can picture it like this:<br>
you create n tasks (connection)<br>you wait until all n tasks are done (connected), so a global<br>synchronisation of all threads<br>then you create n tasks (request)<br>you wait until all n tasks are done (answered), so again a global<br>
synchronisation<br><br>What I suggest is:<br>you create n tasks (connection); they will continue by sending the request<br>(no synchronisation)<br>you wait until all n tasks are done (connected and answered, in any<br>
order), so a global synchronisation, but based on the slowest answer from the<br>remote host.<br><br>Of course, if you can avoid waiting for all answers to arrive and can work<br>with each answer one by one, then you can even avoid such a global wait on<br>
the slowest answer. But it depends on your business logic there...<br><br>Reusing connections is not quite possible in Netty, but there are some<br>handlers/code that allow reconnection (Trustin made an example a few days<br>
ago, posted on the ML).<br><br>Now for the chunk part: yes, chunking should be supported by any HTTP server.<br>The reason is that when a request is bigger than 8 KB, it is supposed to be<br>chunked.<br>However, there is in Netty a handler (HttpChunkAggregator) that allows you<br>
to get the full body (only the body is concerned by chunking) in one<br>ChannelBuffer. This handler accumulates all chunks up to the last<br>one and passes the result to the next handler when it is complete.<br>It is obviously simpler for a standard program.<br>
However, take care of one thing: it means that if you have 100 requests and<br>each request sends 1 MB of body, then you will have 100 MB in memory (at<br>least), since all bodies are stored in memory until their chunks are fully<br>decoded.<br>In my work, I use the HTTP codec chunk by chunk, since my<br>business model lets me handle data chunk by chunk, keeping memory usage as low<br>as possible.<br>But if that is not your case, just use the HttpChunkAggregator handler; it<br>
works perfectly, and then you can ignore whether the answer is chunked or not. In<br>the snoop example there is an example of how to use it.<br><br>HTH,<br>Frederic<br>
<div>
<div></div>
<div class="h5"><br><br>J. Mi wrote:<br>><br>> Thanks to Frederic for the overview. It's very helpful for me.<br>><br>> I have come up with an approach to replace my multi-thread model with<br>> Netty's HttpClient. It's pretty much based on the snoop example. I<br>
> just added 3 loops to achieve the concurrency (multiple http requests at<br>> the<br>> same time). The first loop was around the call to bootstrap.connect(new<br>> InetSocketAddress(host, port)). The second loop was waiting for each<br>
> connection attempt to succeed and then send the request. The third loop<br>> was<br>> using the handler to retrieve each http response by using a<br>> LinkedBlockingQueue. I used ArrayList to maintain a list for<br>
> ChannelFuture,<br>> a list for Channel and a list for HttpResponseHandler among these 3 loops.<br>><br>> Everything worked well for me with the approach. However, my test result<br>> didn't seem to show this approach out-perform my multi-thread model, i.e.<br>
> one thread (java.util.concurrent) for each http request which was done by<br>> Apache Commons HttpClient (a synchronous model). My performance was<br>> measured<br>> by timing the total time spent in making n http requests and retrieving<br>
> this<br>> n http responses end-to-end.<br>><br>> With requests below 50, the multi-thread model performed a little better.<br>> I<br>> was hoping Netty's way can catch up for better scaling because I was<br>
> concerned about the current muti-thread model may not scale well when<br>> getting hundreds requests at the same time. But I still failed to observe<br>> any increased performance relative to the multi-thread model beyond<br>
> serving<br>> 50, 100, 200...800 concurrent requests.<br>><br>> One thing I need to understand more (Frederic already touched some basics<br>> here) is about the connection management. I felt that Apache Commons<br>
> HttpClient seemed to manage the connection with possible reuse. Not<br>> exactly<br>> sure about how Netty does that.<br>><br>> One more question about Netty's HttpClient. In its<br>> HttpResponseHandler.java,<br>
> messageReceived() method only receives a portion of response at a time and<br>> has a dependence on server's responding with "chunked' Transfer-Encoding<br>> header and content for an end of response condition. This raised 2<br>
> questions: (1) is there a way to receive response in one shot, like<br>> Apache's<br>> HttpClient; and (2) do all Http server required to respond with "chunked"<br>> content? In my case, I need to retrieve online responses from different<br>
> web<br>> sites.<br>><br>> Cheers,<br>> Jason<br>><br>><br>><br>> On Thu, Sep 10, 2009 at 6:45 AM, Frederic Bregier<br>> <<a href="mailto:fredbregier@free.fr">fredbregier@free.fr</a>>wrote:<br>
><br>>><br>>> Hi,<br>>><br>>> I will not talk about the specific Http part of Netty but about its main<br>>> interest, the NIO of Netty.<br>>> Of course, Trustin or others can be more precised than me. It is just my<br>
>> general comprehension (I'm not a Nio expert neither a Netty expert, so it<br>>> is<br>>> just my comprehension as an end user).<br>>><br>>> To compare to a standard Blocking IO, Netty uses less threads to manage<br>
>> the<br>>> same behaviour.<br>>> For instance, if you think about Apache or Tomcat, one connection will be<br>>> handled by at least one thread through the full life of the connection.<br>>> So<br>
>> if you have 1000 connections, you will have at least 1000 threads.<br>>> In Netty, a thread will be active when data arrives into the server (the<br>>> general idea is greatly simplified here, it is not to take it as the<br>
>> reality). For instance, for those 1000 connections, maybe at most 100 are<br>>> really sending something on the same time to the server, so around 100<br>>> threads will be used. Netty does something like reusing threads, whatever<br>
>> the connection is.<br>>><br>>> Another point of course is the non blocking way. Once you send something,<br>>> you have the choice to continue the job without waiting that the data is<br>>> really sent (of course, you have to take care about it for instance<br>
>> before<br>>> closing the channel). So you can overlap sending data with other<br>>> computations (for instance for next packet to be sent).<br>>> Compares to blocking IO, of course, there you wait for the data to be<br>
>> really<br>>> sent (or at least buffered).<br>>><br>>> So in many points, Netty approach should have more performance than<br>>> blocking<br>>> IO. I said "should" since there exist some counter examples where<br>
>> blocking<br>>> IO are faster, since NIO introduces some extra computing comparing to<br>>> blocking IO. However most of the time, these extra are masked by the<br>>> implementation of Netty and are quicker than blocking IO. But I recall<br>
>> some<br>>> examples however.<br>>><br>>> Also, Netty can have different kind of transport (nio, oio, ...), so the<br>>> behaviour can be different according to one or another low network<br>
>> transport<br>>> framework.<br>>><br>>> This is not the full idea of Netty, but a start of answer to your<br>>> question.<br>>> For more information, either other people can continue this thread (or<br>
>> correct where I a wrong), and of course you can read the examples that<br>>> are<br>>> in Netty (even those not about Http) and the documentation of Netty.<br>>><br>>> HTH,<br>>> Cheers,<br>
>> Frederic<br>>><br>>> J. Mi wrote:<br>>> ><br>>> > Hi all,<br>>> ><br>>> > I guess my fundamental question here is if, in theory at least, Netty<br>>> > provides a better asynchronous mechanism than the concurrent java<br>
>> package<br>>> > from java.util.concurrent.* in terms of performance. Does internally<br>>> Netty<br>>> > use multi-threading, java.nio, or both, or neither?<br>>> ><br>>> > If Netty does better than java.util.concurrent.* for performance, is<br>
>> there<br>>> > any example, tutorial, which can guide me a little for replacing my<br>>> > current<br>>> > multi-threading process which I described in that previous email?<br>>> ><br>
>> > Many thanks to you for sharing your expertise,<br>>> > Jason<br>>> ><br>>> > On Wed, Sep 2, 2009 at 12:11 PM, J. Mi <<a href="mailto:jmi258@gmail.com">jmi258@gmail.com</a>> wrote:<br>
>> ><br>>> >> Hi folks,<br>>> >><br>>> >> Currently, my application's process flow logic is like this:<br>>> >><br>>> >> -> A controlling process receives one request for data which will be<br>
>> >> fetched from multiple online sources.<br>>> >> -> The controlling process spawns multiple threads. Each of these<br>>> threads<br>>> >> will (1) use Apache synchronous commons httpclient to fetch the data;<br>
>> (2)<br>>> >> parse the data; and (3)<br>>> >> return the data to the controlling process.<br>>> >> -> The controlling process joins all threads and return the combined<br>
>> data<br>>> >> to the requestor.<br>>> >><br>>> >> So basically, each thread uses a synchronous httpclient to fetch the<br>>> data<br>>> >> and then parse it.<br>
>> >><br>>> >> In reading org.jboss.netty.example.http.snoop package, I have the<br>>> >> following question:<br>>> >> If I just replace the Apache's synchronous httpclient with Nettty's<br>
>> >> org.jboss.netty.handler.codec.http.* as the example does, will I be<br>>> >> benefited performance-wise? I heard something about blocking I/O hurts<br>>> >> multi-threading. If so, should Netty's package work better for me?<br>
>> >><br>>> >> Or should I actually get ride of the existing multi-threading by using<br>>> >> Netty's framework? If so, which of your examples can be better<br>>> referenced<br>
>> >> for my purpose?<br>>> >><br>>> >> Thanks for your in advance,<br>>> >> Jason<br>>> >><br>>> >><br>>> >><br>>> ><br>>> > _______________________________________________<br>
>> > netty-dev mailing list<br>>> > <a href="mailto:netty-dev@lists.jboss.org">netty-dev@lists.jboss.org</a><br>>> > <a href="https://lists.jboss.org/mailman/listinfo/netty-dev" target="_blank">https://lists.jboss.org/mailman/listinfo/netty-dev</a><br>
>> ><br>>> ><br>>><br>>><br>>> -----<br>>> Hardware/Software Architect<br>>> --<br>>> View this message in context:<br>>> <a href="http://n2.nabble.com/A-question-about-your-HttpClient-example-and-beyond-tp3568879p3617420.html" target="_blank">http://n2.nabble.com/A-question-about-your-HttpClient-example-and-beyond-tp3568879p3617420.html</a><br>
>> Sent from the Netty Developer Group mailing list archive at Nabble.com.<br>>> _______________________________________________<br>>> netty-dev mailing list<br>>> <a href="mailto:netty-dev@lists.jboss.org">netty-dev@lists.jboss.org</a><br>
>> <a href="https://lists.jboss.org/mailman/listinfo/netty-dev" target="_blank">https://lists.jboss.org/mailman/listinfo/netty-dev</a><br>>><br>><br>> _______________________________________________<br>> netty-dev mailing list<br>
> <a href="mailto:netty-dev@lists.jboss.org">netty-dev@lists.jboss.org</a><br>> <a href="https://lists.jboss.org/mailman/listinfo/netty-dev" target="_blank">https://lists.jboss.org/mailman/listinfo/netty-dev</a><br>
><br>><br><br><br>-----<br>Hardware/Software Architect<br>--<br></div></div>View this message in context: <a href="http://n2.nabble.com/A-question-about-your-HttpClient-example-and-beyond-tp3568879p3624150.html" target="_blank">http://n2.nabble.com/A-question-about-your-HttpClient-example-and-beyond-tp3568879p3624150.html</a><br>
<div>
<div></div>
<div class="h5">Sent from the Netty Developer Group mailing list archive at Nabble.com.<br>_______________________________________________<br>netty-dev mailing list<br><a href="mailto:netty-dev@lists.jboss.org">netty-dev@lists.jboss.org</a><br>
<a href="https://lists.jboss.org/mailman/listinfo/netty-dev" target="_blank">https://lists.jboss.org/mailman/listinfo/netty-dev</a><br></div></div></blockquote></div><br>