Question about per-socket concurrency
Yang Zhang
yanghatespam at gmail.com
Thu Aug 13 04:27:52 EDT 2009
Yang Zhang wrote:
> Hi, we have a Netty-based server that is running into a bottleneck that
> we suspect may be due to the way (socket-wise) concurrency control works
> in the Netty reactor. While we're exploring the code, we're wondering
> if anyone has any insight into this so as to expedite our performance
> debugging.
>
> The system is a simple topic-based publish-subscribe messaging system,
> and the workload we're issuing is a handful of publishers publishing
> messages to the server, all on separate hosts on a GigE LAN. Each of
> the publishers tops out at just 1000 messages per second, where each
> message is 1KB in size. However, we can keep piling on clients, and the
> throughput scales up linearly.
>
> From this info, one simple explanation would be that the culprit is a
> bottleneck in the client. Yet the strange thing is that the CPU
> utilization of each client is just ~5%. On the server, CPU utilization
> hovers at ~20% when presented with a single publisher, and grows another
> ~20% for each additional publisher. The bottom line is that we're being
> held back well before full CPU or network saturation.
>
> Is there any synchronization in the reactor core of Netty that could be
> causing this per-socket bottleneck? Thanks in advance for any hints.
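
For reference, here is a back-of-the-envelope check of the numbers quoted above. It is only a sketch: it assumes 1KB means 1024 bytes, takes GigE as a nominal 1 Gbit/s, and ignores protocol overhead. The class and method names are made up for illustration.

```java
public class LoadEstimate {
    // Payload bytes per second offered by some number of publishers.
    static long offeredBytesPerSecond(int publishers, int messagesPerSecond, int messageBytes) {
        return (long) publishers * messagesPerSecond * messageBytes;
    }

    // Fraction of a link's nominal bit rate that the payload would consume.
    static double linkFraction(long bytesPerSecond, long linkBitsPerSecond) {
        return (bytesPerSecond * 8.0) / linkBitsPerSecond;
    }

    public static void main(String[] args) {
        long gigE = 1000000000L; // assumed nominal GigE rate, bits per second
        for (int publishers = 1; publishers <= 4; publishers++) {
            long bytes = offeredBytesPerSecond(publishers, 1000, 1024);
            System.out.printf("%d publisher(s): %d bytes/s, %.2f%% of GigE%n",
                    publishers, bytes, 100.0 * linkFraction(bytes, gigE));
        }
    }
}
```

A single publisher at 1000 messages per second works out to roughly 1 MB/s of payload, under 1% of the nominal link rate, which is why the network itself does not look like the limit.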

Along these lines, is there any documentation describing Netty's
threading/concurrency architecture? It has a pool of NioWorkers that it
spreads across an executor thread pool, but beyond that things are
murky. Here's what we've learned so far from the source:

After calling ServerBootstrap.bind(), Netty starts a boss thread that
only accepts new connections and registers each one with a worker from
the worker pool in round-robin fashion (the pool size defaults to the
CPU count). Registration just queues a new register task for a selector
loop to handle. All workers, and the boss, run on the executor thread
pool; hence, the executor must support at least two simultaneous
threads.
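
To make that description concrete, here is a plain-JDK model of the hand-off as I understand it. This is not Netty code: RoundRobinBoss, Worker, accept() and drain() are hypothetical names, and giving each worker its own task queue is my assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinBoss {
    // Hypothetical stand-in for a worker: a queue of pending register tasks
    // plus the list of connections it has registered so far.
    static final class Worker {
        final int id;
        final ConcurrentLinkedQueue<Runnable> registerTasks = new ConcurrentLinkedQueue<Runnable>();
        final List<String> registered = new ArrayList<String>();

        Worker(int id) { this.id = id; }

        // What the worker's loop would do when it gets around to its queue.
        void drain() {
            Runnable task;
            while ((task = registerTasks.poll()) != null) {
                task.run();
            }
        }
    }

    private final Worker[] workers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBoss(int workerCount) {
        workers = new Worker[workerCount];
        for (int i = 0; i < workerCount; i++) {
            workers[i] = new Worker(i);
        }
    }

    // Called for each accepted connection: pick the next worker in
    // round-robin order and queue a register task for it.
    Worker accept(final String connection) {
        final Worker w = workers[Math.abs(next.getAndIncrement() % workers.length)];
        w.registerTasks.add(new Runnable() {
            public void run() { w.registered.add(connection); }
        });
        return w;
    }

    public static void main(String[] args) {
        RoundRobinBoss boss = new RoundRobinBoss(Runtime.getRuntime().availableProcessors());
        for (int i = 0; i < 6; i++) {
            System.out.println("conn-" + i + " -> worker " + boss.accept("conn-" + i).id);
        }
    }
}
```

In this model the boss never touches the connection after queuing the task; whatever happens next is up to the worker that drains the queue.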

The workers take turns running the select loop, which also handles other
tasks, such as register tasks (for these, the selector is properly woken
up). As far as I can tell, a worker keeps running its loop as long as
there are interested fds/keys (i.e. forever).
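
The loop shape I have in mind looks roughly like the following. It uses only java.nio and is a sketch of the pattern, not Netty's NioWorker; SelectLoopSketch, submit() and demo() are names I made up, and the 500 ms select timeout is arbitrary.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class SelectLoopSketch implements Runnable {
    private final Selector selector;
    private final Queue<Runnable> tasks = new ConcurrentLinkedQueue<Runnable>();
    private volatile boolean running = true;

    SelectLoopSketch() throws IOException {
        selector = Selector.open();
    }

    // Called from another thread: queue the task, then wake the selector
    // so a blocked select() returns and the loop sees the task promptly.
    void submit(Runnable task) {
        tasks.add(task);
        selector.wakeup();
    }

    void shutdown() {
        running = false;
        selector.wakeup();
    }

    public void run() {
        try {
            while (running) {
                selector.select(500); // returns on ready I/O, wakeup(), or timeout
                Runnable task;
                while ((task = tasks.poll()) != null) {
                    task.run();       // e.g. register a channel with this selector
                }
                // A real loop would iterate the selected keys here and do the
                // reads/writes; this sketch has no channels, so it just clears them.
                selector.selectedKeys().clear();
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Starts a loop on its own thread, submits one task from the calling
    // thread, and reports whether the task ran within a second.
    static boolean demo() {
        try {
            SelectLoopSketch loop = new SelectLoopSketch();
            Thread t = new Thread(loop, "worker-0");
            t.start();
            final CountDownLatch ran = new CountDownLatch(1);
            loop.submit(new Runnable() {
                public void run() { ran.countDown(); }
            });
            boolean ok = ran.await(1, TimeUnit.SECONDS);
            loop.shutdown();
            t.join();
            loop.selector.close();
            return ok;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("register task ran: " + demo());
    }
}
```

The wakeup() call is what keeps a register task from sitting in the queue until the next I/O event or timeout.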

Furthermore, events seem to be handled in the same thread as the select
loop, via processSelectedKeys() -> read()/write(). This would all
suggest that everything is running in the same thread - which of course
can't be the case. Thanks in advance for any clarification.
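
One possible reading, and this is only my guess from the code paths above, is that each connection is pinned to a single worker: a given socket's events would then always run on one thread, while different sockets can land on different threads. The toy model below illustrates that reading with plain JDK executors; PinnedWorkers and simulate() are hypothetical names and none of this is Netty code.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PinnedWorkers {
    // Returns, for each connection id, the names of the threads that
    // handled its events.
    static Map<Integer, Set<String>> simulate(int workerCount, int connections, int eventsPerConnection) {
        // One single-threaded executor per worker stands in for one select loop.
        final ExecutorService[] workers = new ExecutorService[workerCount];
        for (int i = 0; i < workerCount; i++) {
            workers[i] = Executors.newSingleThreadExecutor();
        }
        final Map<Integer, Set<String>> seen = new ConcurrentHashMap<Integer, Set<String>>();
        for (int c = 0; c < connections; c++) {
            seen.put(c, Collections.synchronizedSet(new HashSet<String>()));
        }
        for (int e = 0; e < eventsPerConnection; e++) {
            for (int c = 0; c < connections; c++) {
                final int conn = c;
                // Assumed pinning rule: connection c always goes to worker c % workerCount.
                workers[conn % workerCount].execute(new Runnable() {
                    public void run() {
                        seen.get(conn).add(Thread.currentThread().getName());
                    }
                });
            }
        }
        for (ExecutorService w : workers) {
            w.shutdown();
            try {
                w.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> seen = simulate(2, 4, 100);
        for (Map.Entry<Integer, Set<String>> entry : seen.entrySet()) {
            System.out.println("connection " + entry.getKey() + " handled by " + entry.getValue());
        }
    }
}
```

If Netty behaves like this model, a single publisher's socket could never use more than one thread on the server, which would fit a per-socket ceiling that goes away as clients are added.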
--
Yang Zhang
http://www.mit.edu/~y_z/