[undertow-dev] Websocket connections and IO thread affinity

Jason Greene jason.greene at redhat.com
Thu Jun 16 15:59:47 EDT 2016


Are you using WildFly or Undertow standalone? 

If you are using Undertow standalone, you might want to try enabling dispatch to worker (this is the default on WildFly):
webSocketDeploymentInfo.setDispatchToWorkerThread(true)

If your message handlers use significant CPU time or introduce blocking (which a disparity like the one you see could indicate), they can impair the I/O thread's ability to handle connection events efficiently. Dispatching to the worker pool lets long-running tasks execute without interfering with other connections and activity.
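
In a standalone servlet-style deployment, that looks roughly like the following sketch (the class loader, context path, and deployment name are placeholders for whatever your deployment already uses):

import io.undertow.servlet.Servlets;
import io.undertow.servlet.api.DeploymentInfo;
import io.undertow.websockets.jsr.WebSocketDeploymentInfo;

// Dispatch WebSocket frame handling to the worker pool instead of
// running handlers directly on the I/O thread.
WebSocketDeploymentInfo wsInfo = new WebSocketDeploymentInfo()
        .setDispatchToWorkerThread(true);

DeploymentInfo deployment = Servlets.deployment()
        .setClassLoader(MyApp.class.getClassLoader())   // placeholder
        .setContextPath("/myapp")                       // placeholder
        .setDeploymentName("myapp.war")                 // placeholder
        .addServletContextAttribute(
                WebSocketDeploymentInfo.ATTRIBUTE_NAME, wsInfo);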

> On Jun 16, 2016, at 2:36 PM, peter royal <peter.royal at pobox.com> wrote:
> 
> Understood. 
> 
> I'm going to test with increased IO threads, and if that fixes things
> I'm good. Using thread user CPU time might be a good metric; looking
> at it, the imbalance is clear:
> CPU: 2673514 ms
> CPU: 31270 ms
> CPU: 61962 ms
> CPU: 7952561 ms
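> 
> (For reference, per-thread user CPU time can be read with the standard
> ThreadMXBean API, roughly like this:)
> 
> import java.lang.management.ManagementFactory;
> import java.lang.management.ThreadInfo;
> import java.lang.management.ThreadMXBean;
> 
> ThreadMXBean mx = ManagementFactory.getThreadMXBean();
> for (long id : mx.getAllThreadIds()) {
>     ThreadInfo info = mx.getThreadInfo(id);
>     if (info == null) continue;                // thread already exited
>     long userNanos = mx.getThreadUserTime(id); // -1 if unsupported
>     System.out.printf("%s CPU: %d ms%n",
>             info.getThreadName(), userNanos / 1_000_000);
> }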
> 
> As I think through this more, optimal balancing requires pushing a lot
> of application-specific info down low, because a given WS connection
> might be high volume or not. It would be easier to migrate a connection
> that is detected to be high volume to another IO thread, but that'd be a
> hugely invasive change. The optimal strategy for me might just be to
> have 1 thread per connection, as the counts aren't very high.
> 
> Thanks for the help!
> 
> -- 
> (peter.royal|osi)@pobox.com - http://fotap.org/~osi
> 
> On Thu, Jun 16, 2016, at 02:17 PM, Jason Greene wrote:
>> The way our current approach works (which is the same approach as
>> SO_REUSEPORT’s impl) is that address:port is hashed to select the
>> destination; this is mainly so we can transition with no real behavioral
>> surprises. If you have some connections lasting significantly longer than
>> others, then you will eventually go out of balance, because the current
>> allocation state isn’t a factor in the decision. It’s possible to do
>> more advanced algorithms factoring in state, but once you do that you tie
>> yourself to a single-threaded acceptor (although that’s currently the case
>> with our emulated SO_REUSEPORT implementation). For many workloads this
>> won’t matter though, as you need massive connection rates to hit the
>> accept stability limits.
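>> 
>> (A rough sketch of the idea, not the actual XNIO code; the helper name
>> here is made up:)
>> 
>> import java.net.InetSocketAddress;
>> import org.xnio.XnioIoThread;
>> 
>> // The same remote address:port always hashes to the same I/O thread,
>> // so current load never factors into the choice.
>> static XnioIoThread pickThread(InetSocketAddress remote,
>>                                XnioIoThread[] threads) {
>>     int hash = remote.getAddress().hashCode() * 31 + remote.getPort();
>>     return threads[(hash & Integer.MAX_VALUE) % threads.length];
>> }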
>> 
>> Maybe you want to play with modifying QueuedTcpNioServer to compare a few
>> different algorithms? You could try balancing on active connection count as
>> one strategy, and perhaps thread user CPU time as another. For both
>> approaches you probably want to have I/O threads individually updating a
>> volatile statistic field as part of their standard work, and then the
>> accept-queuing thread scanning those values to select the best
>> destination.
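>> 
>> (Something like the following, with an AtomicLongArray standing in for
>> the per-thread volatile fields; all names here are hypothetical:)
>> 
>> import java.util.concurrent.atomic.AtomicLongArray;
>> 
>> // One slot per I/O thread; thread i does active.incrementAndGet(i)
>> // on connection open and active.decrementAndGet(i) on close.
>> final AtomicLongArray active = new AtomicLongArray(ioThreadCount);
>> 
>> // Accept-queuing thread: scan for the least-loaded destination.
>> int best = 0;
>> for (int i = 1; i < active.length(); i++) {
>>     if (active.get(i) < active.get(best)) {
>>         best = i;
>>     }
>> }
>> // ...then queue the accepted connection onto I/O thread `best`.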
>> 
>>> On Jun 16, 2016, at 2:01 PM, peter royal <peter.royal at pobox.com> wrote:
>>> 
>>> Gotcha. I was digging through things and found the change where the new
>>> strategy was introduced. With my current # of IO threads it gives
>>> uneven weightings:
>>> 
>>> thread, connections
>>> 0, 6
>>> 1, 5
>>> 2, 3
>>> 3, 2
>>> 
>>> I'm going to double my IO threads; it will still be less than
>>> optimal, but improved:
>>> 
>>> thread, connections
>>> 0, 2
>>> 1, 1
>>> 2, 1
>>> 3, 1
>>> 4, 4
>>> 5, 4
>>> 6, 2
>>> 7, 1
>>> 
>>> Random is only slightly better, eyeballing things.
>>> 
>>> I'm using Undertow 1.3.22, which uses XNIO 3.3.6. The Linux kernel is
>>> 2.6.32, though.
>>> 
>>> Digging into my problem more, I would probably need to balance on more
>>> than just connection count per IO thread, as some connections are busier
>>> than others. 
>>> 
>>> Can you point me towards any references about the forthcoming access to
>>> the native facility? I'm curious how that will work.
>>> 
>>> -pete
>>> 
>>> -- 
>>> (peter.royal|osi)@pobox.com - http://fotap.org/~osi
>>> 
>>> On Thu, Jun 16, 2016, at 01:41 PM, Jason T. Greene wrote:
>>>> We recently changed XNIO to balance connections by default, using a
>>>> strategy similar to the new SO_REUSEPORT facility in the Linux kernel
>>>> (the change is in XNIO 3.3.3 or later). In the near future, we will
>>>> switch to the native facility when it is accessible in the JDK NIO
>>>> implementation. Older versions had a feature called balancing tokens
>>>> that you could use to balance connections fairly, but it had to be
>>>> explicitly configured.
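>>>> 
>>>> (If you want to try the token scheme on an older version, it was driven
>>>> by server socket options; something like this, though check the exact
>>>> semantics for your XNIO version:)
>>>> 
>>>> import org.xnio.OptionMap;
>>>> import org.xnio.Options;
>>>> 
>>>> // Hand out one balancing token at a time, two connections per token.
>>>> OptionMap serverOptions = OptionMap.builder()
>>>>         .set(Options.BALANCING_TOKENS, 1)
>>>>         .set(Options.BALANCING_CONNECTIONS, 2)
>>>>         .getMap();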
>>>> 
>>>> 
>>>>> On Jun 16, 2016, at 1:00 PM, peter royal <peter.royal at pobox.com> wrote:
>>>>> 
>>>>> (I believe the following is true... please correct me if not!)
>>>>> 
>>>>> I have an application which heavily utilizes web sockets. It is an
>>>>> internal application which uses a small number of connections with
>>>>> reasonable load on each.
>>>>> 
>>>>> When a new connection is received by Undertow, an XNIO I/O thread is
>>>>> assigned to the connection at accept time. By chance, this is causing
>>>>> uneven load on my IO threads.
>>>>> 
>>>>> I'm increasing the number of IO threads as a temporary fix, but it might
>>>>> be useful to be able to either migrate a long-lived connection to
>>>>> another IO thread (harder) or do better load balancing amongst IO
>>>>> threads. For the latter, if Undertow were able to provide a strategy for
>>>>> picking a thread in NioXnioWorker.getIoThread(hashCode), it could try
>>>>> to pick a thread that had fewer connections assigned to it.
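>>>>> 
>>>>> (Purely hypothetical shape for such a hook; nothing like this exists
>>>>> today:)
>>>>> 
>>>>> import org.xnio.XnioIoThread;
>>>>> 
>>>>> // A strategy the worker could consult instead of the fixed hash.
>>>>> public interface IoThreadSelector {
>>>>>     XnioIoThread select(XnioIoThread[] threads);
>>>>> }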
>>>>> 
>>>>> Has anyone else run into this problem? Would a fix be accepted?
>>>>> 
>>>>> -pete
>>>>> 
>>>>> -- 
>>>>> (peter.royal|osi)@pobox.com - http://fotap.org/~osi
>>>>> _______________________________________________
>>>>> undertow-dev mailing list
>>>>> undertow-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/undertow-dev
>> 
>> --
>> Jason T. Greene
>> WildFly Lead / JBoss EAP Platform Architect
>> JBoss, a division of Red Hat
>> 

--
Jason T. Greene
WildFly Lead / JBoss EAP Platform Architect
JBoss, a division of Red Hat
