Question about per-socket concurrency

Yang Zhang yanghatespam at gmail.com
Thu Aug 13 04:27:52 EDT 2009


Yang Zhang wrote:
> Hi, we have a Netty-based server that is running into a bottleneck that 
> we suspect may be due to the way (socket-wise) concurrency control works 
> in the Netty reactor.  While we're exploring the code, we're wondering 
> if anyone has any insight into this so as to expedite our performance 
> debugging.
> 
> The system is a simple topic-based publish-subscribe messaging system, 
> and the workload we're issuing is a handful of publishers publishing 
> messages to the server, all on separate hosts on a GigE LAN.  Each of 
> the publishers tops out at just 1000 messages per second, where each 
> message is 1KB in size.  However, we can keep piling on clients, and the 
> throughput scales up linearly.
> 
> From this info, one simple explanation would be that the culprit is a 
> bottleneck in the client.  Yet the strange thing is that the CPU 
> utilization of each client is just ~5%.  On the server, CPU utilization 
> hovers at ~20% when presented with a single publisher, and grows another 
> ~20% for each additional publisher.  The bottom line is that we're being 
> held back well before full CPU or network saturation.
> 
> Is there any synchronization in the reactor core of Netty that could be 
> causing this per-socket bottleneck?  Thanks in advance for any hints.
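
To put a rough number on the network side of the workload quoted above, here is a back-of-the-envelope sketch.  It assumes 1KB means 1024 bytes and treats GigE as a nominal 1e9 bits per second, ignoring protocol overhead; the class and method names are just for illustration.

```java
// Hypothetical helper, only for estimating how much of the link one publisher uses.
class LinkUtilizationSketch {
    // Fraction of the link consumed by payload bytes alone (no TCP/IP or framing overhead).
    static double utilization(long messagesPerSecond, long bytesPerMessage, double linkBitsPerSecond) {
        return messagesPerSecond * bytesPerMessage * 8 / linkBitsPerSecond;
    }

    public static void main(String[] args) {
        // One publisher: 1000 messages/s at 1KB each (assumed 1024 bytes) on a nominal 1 Gbit/s link.
        double fraction = utilization(1000, 1024, 1e9);
        System.out.printf("%.2f%% of the link%n", 100 * fraction);
    }
}
```

Under those assumptions a single publisher moves about 8 Mbit/s of payload, i.e. under 1% of the link, which is consistent with the network not being the limit.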

Along these lines, is there any documentation on what the 
threading/concurrency architecture of Netty looks like?  It has a pool 
of NioWorkers that it splays onto an executor thread pool, but beyond 
that things are murky.  Here's what we've learned so far from the source:

After calling ServerBootstrap.bind(), Netty starts a boss thread that 
just accepts new connections and registers them with one of the workers 
from the worker pool in round-robin fashion (pool size defaults to CPU 
count).  Registration just pushes a new register task for a selector 
loop to handle.  All workers, and the boss, are executing via the 
executor thread pool; hence, the executor must support at least two 
simultaneous threads.
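
To make sure I'm describing the hand-off correctly, here is a minimal sketch of the round-robin registration as I understand it.  This is not Netty's code: RoundRobinSketch, Worker, and assign() are names I made up, and the "register task" is just a Runnable placed on a per-worker queue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; these names are illustrative and are not Netty classes.
class RoundRobinSketch {
    // Stand-in for a worker: here it only owns a queue of pending register tasks.
    static final class Worker {
        final Queue<Runnable> registerTasks = new ConcurrentLinkedQueue<>();
    }

    private final List<Worker> workers = new ArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(int workerCount) {
        for (int i = 0; i < workerCount; i++) {
            workers.add(new Worker());
        }
    }

    // What the boss would do per accepted connection:
    // pick the next worker in turn and enqueue a register task for it.
    Worker assign(Runnable registerTask) {
        int index = Math.floorMod(next.getAndIncrement(), workers.size());
        Worker worker = workers.get(index);
        worker.registerTasks.add(registerTask);
        return worker;
    }

    // Assigns the given number of no-op register tasks and reports how many each worker got.
    static int[] demo(int workerCount, int connections) {
        RoundRobinSketch boss = new RoundRobinSketch(workerCount);
        for (int i = 0; i < connections; i++) {
            boss.assign(() -> { });
        }
        int[] counts = new int[workerCount];
        for (int i = 0; i < workerCount; i++) {
            counts[i] = boss.workers.get(i).registerTasks.size();
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo(3, 7))); // prints [3, 2, 2]
    }
}
```

If this matches what Netty does, each connection is pinned to whichever worker it was assigned at accept time.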

The workers take turns running the select loop, which also handles other 
tasks, like register tasks (for these, the selector is properly woken 
up).  As far as I can tell, a worker keeps running its loop as long as 
there are interested fds/keys (i.e. forever).
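
The loop shape I have in mind looks roughly like the sketch below, written against plain java.nio rather than copied from Netty.  SelectLoopSketch, submit(), and shutdown() are hypothetical names, and the 500 ms select timeout is an arbitrary choice of mine.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a selector loop that also drains a task queue; not Netty's NioWorker.
class SelectLoopSketch implements Runnable {
    private final Selector selector;
    private final Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
    private volatile boolean running = true;

    SelectLoopSketch() throws IOException {
        selector = Selector.open();
    }

    // Called from other threads (e.g. the boss): enqueue the task, then wake the selector
    // so a blocked select() returns and the loop can run the task.
    void submit(Runnable task) {
        tasks.add(task);
        selector.wakeup();
    }

    void shutdown() {
        running = false;
        selector.wakeup();
    }

    @Override
    public void run() {
        try {
            while (running) {
                selector.select(500);
                // Run queued tasks (such as register tasks) on the loop thread.
                Runnable task;
                while ((task = tasks.poll()) != null) {
                    task.run();
                }
                // Ready keys would be dispatched to read/write handling here, on this same thread.
                selector.selectedKeys().clear();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            try {
                selector.close();
            } catch (IOException ignored) {
                // Nothing useful to do if closing fails in this sketch.
            }
        }
    }

    // Starts a loop thread, submits one task, and reports whether it ran and the loop stopped.
    static boolean demo() {
        try {
            SelectLoopSketch loop = new SelectLoopSketch();
            Thread thread = new Thread(loop, "sketch-worker");
            thread.start();
            CountDownLatch ran = new CountDownLatch(1);
            loop.submit(ran::countDown);
            boolean taskRan = ran.await(5, TimeUnit.SECONDS);
            loop.shutdown();
            thread.join(5000);
            return taskRan && !thread.isAlive();
        } catch (IOException | InterruptedException e) {
            return false;
        }
    }
}
```

In this shape, both the queued tasks and the I/O for every key registered with the selector run on the one thread that owns the loop.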

Furthermore, events seem to be handled in the same thread, via 
processSelectedKeys() -> read()/write().  This would all suggest that 
everything is running in the same thread - which of course can't be the 
case.  Thanks in advance for any clarification.
-- 
Yang Zhang
http://www.mit.edu/~y_z/


More information about the netty-users mailing list