[undertow-dev] Websocket connections and IO thread affinity

peter royal peter.royal at pobox.com
Thu Jun 16 16:15:54 EDT 2016


Undertow standalone. 

Thanks for the reminder about that - I went back and looked things over,
and realized I was deserializing incoming messages on the IO thread
before dispatching to the worker thread. The outbound path was clean,
though.
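
In case anyone else trips over the same thing, the fix is roughly this
shape - grab the payload on the IO thread, then deserialize on a worker
(a sketch against Undertow's core WebSocket API; deserialize/handle
stand in for my application code):

    import io.undertow.websockets.core.AbstractReceiveListener;
    import io.undertow.websockets.core.BufferedTextMessage;
    import io.undertow.websockets.core.WebSocketChannel;

    class DispatchingListener extends AbstractReceiveListener {
        @Override
        protected void onFullTextMessage(WebSocketChannel channel,
                                         BufferedTextMessage message) {
            // Cheap: pull the raw payload while still on the IO thread.
            final String raw = message.getData();
            // Potentially expensive: deserialize and handle on a worker.
            channel.getWorker().execute(() -> handle(deserialize(raw)));
        }

        private Object deserialize(String raw) { return raw; } // app-specific
        private void handle(Object message) { }                // app-specific
    }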

Thanks again!

-pete

-- 
(peter.royal|osi)@pobox.com - http://fotap.org/~osi

On Thu, Jun 16, 2016, at 02:59 PM, Jason Greene wrote:
> Are you using WildFly or Undertow standalone? 
> 
> If you are using Undertow standalone, you might want to try enabling
> dispatch to worker (this is the default on WildFly):
> webSocketDeploymentInfo.setDispatchToWorkerThread(true)
> 
> If you have message handlers that use significant CPU time or introduce
> blocking (which a disparity like the one you’re seeing could indicate),
> they can degrade the I/O thread’s ability to handle connection events
> efficiently. Dispatching to the worker pool lets long-running tasks
> execute without interfering with other connections/activity.
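> 
> If you’re bootstrapping the servlet container yourself, the wiring is
> roughly this (deployment details are illustrative; the servlet context
> attribute is how the JSR-356 integration picks the setting up):
> 
>     import io.undertow.servlet.api.DeploymentInfo;
>     import io.undertow.websockets.jsr.WebSocketDeploymentInfo;
> 
>     WebSocketDeploymentInfo wsInfo = new WebSocketDeploymentInfo()
>             .setDispatchToWorkerThread(true); // handlers run on the worker pool
> 
>     DeploymentInfo deployment = new DeploymentInfo()
>             .setDeploymentName("app.war")
>             .setContextPath("/app")
>             .setClassLoader(App.class.getClassLoader()) // App: your app class
>             .addServletContextAttribute(WebSocketDeploymentInfo.ATTRIBUTE_NAME, wsInfo);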
> 
> > On Jun 16, 2016, at 2:36 PM, peter royal <peter.royal at pobox.com> wrote:
> > 
> > Understood. 
> > 
> > I'm going to test with increased IO threads, and if that fixes things
> > I'm good. Thread user CPU time might be a good metric; looking at it,
> > the imbalance is clear (gathered as in the snippet below):
> > 
> > CPU: 2673514 ms
> > CPU: 31270 ms
> > CPU: 61962 ms
> > CPU: 7952561 ms
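> > 
> > (For reference, roughly how I'm collecting those per-thread numbers;
> > the "I/O" name filter is just how XNIO's I/O threads happen to be
> > named here:)
> > 
> >     import java.lang.management.ManagementFactory;
> >     import java.lang.management.ThreadInfo;
> >     import java.lang.management.ThreadMXBean;
> > 
> >     ThreadMXBean mx = ManagementFactory.getThreadMXBean();
> >     for (long id : mx.getAllThreadIds()) {
> >         ThreadInfo info = mx.getThreadInfo(id);
> >         if (info != null && info.getThreadName().contains("I/O")) {
> >             // getThreadUserTime reports nanoseconds (-1 if unsupported)
> >             System.out.printf("%s CPU: %d ms%n", info.getThreadName(),
> >                     mx.getThreadUserTime(id) / 1_000_000);
> >         }
> >     }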
> > 
> > As I think through this more, optimal balancing requires pushing a lot
> > of application-specific info down low, because a given WS connection
> > may or may not be high volume. It would be easier to migrate a
> > connection to another IO thread once it is detected to be high volume,
> > but that'd be a hugely invasive change. The optimal strategy for me
> > might just be to have one IO thread per connection, as the counts
> > aren't very high.
> > 
> > Thanks for the help!
> > 
> > -- 
> > (peter.royal|osi)@pobox.com - http://fotap.org/~osi
> > 
> > On Thu, Jun 16, 2016, at 02:17 PM, Jason Greene wrote:
> >> Our current approach works the same way as SO_REUSEPORT’s
> >> implementation: the address:port pair is hashed to select the
> >> destination. This is mainly so we can transition with no real
> >> behavioral surprises. If some connections last significantly longer
> >> than others, you will eventually go out of balance, because the
> >> current allocation state isn’t a factor in the decision. It’s possible
> >> to do more advanced algorithms that factor in state, but once you do
> >> that you tie yourself to a single-threaded acceptor (although that’s
> >> currently the case with our emulated SO_REUSEPORT implementation
> >> anyway). For many workloads this won’t matter, though, as you need
> >> massive connection rates to hit the accept stability limits.
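> >> 
> >> In other words, the pick is roughly this shape (not the literal XNIO
> >> code):
> >> 
> >>     // Stateless hash pick: long-lived connections can cluster on one
> >>     // I/O thread purely by chance, since no allocation state is consulted.
> >>     int pick(java.net.InetSocketAddress remote, int ioThreadCount) {
> >>         int h = remote.getAddress().hashCode() * 31 + remote.getPort();
> >>         return (h & Integer.MAX_VALUE) % ioThreadCount;
> >>     }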
> >> 
> >> Maybe you want to play with modifying QueuedTcpNioServer to compare a
> >> few different algorithms? You could try balancing on active connection
> >> count as one strategy, and perhaps on thread user CPU time as another.
> >> For both approaches you probably want the I/O threads individually
> >> updating a volatile statistic field as part of their standard work,
> >> and the accept-queuing thread scanning those values to select the best
> >> destination, along the lines of the sketch below.
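> >> 
> >> A least-loaded variant could look something like this (names are
> >> hypothetical, not existing XNIO API; the scan is only safe because a
> >> single accept-queuing thread runs it):
> >> 
> >>     final class IoThreadStats {
> >>         volatile int activeConnections; // written by the owning I/O thread
> >>     }
> >> 
> >>     // Called only from the single accept-queuing thread.
> >>     int pickLeastLoaded(IoThreadStats[] stats) {
> >>         int best = 0;
> >>         for (int i = 1; i < stats.length; i++) {
> >>             if (stats[i].activeConnections < stats[best].activeConnections) {
> >>                 best = i;
> >>             }
> >>         }
> >>         return best;
> >>     }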
> >> 
> >>> On Jun 16, 2016, at 2:01 PM, peter royal <peter.royal at pobox.com> wrote:
> >>> 
> >>> Gotcha. I was digging through things and found the change where the new
> >>> strategy was introduced. With my current # of IO threads it is giving
> >>> uneven weightings:
> >>> 
> >>> thread, connections
> >>> 0, 6
> >>> 1, 5
> >>> 2, 3
> >>> 3, 2
> >>> 
> >>> I'm going to double my IO threads; the distribution will still be
> >>> less than optimal, but improved:
> >>> 
> >>> thread, connections
> >>> 0, 2
> >>> 1, 1
> >>> 2, 1
> >>> 3, 1
> >>> 4, 4
> >>> 5, 4
> >>> 6, 2
> >>> 7, 1
> >>> 
> >>> Random is only slightly better, eyeballing things.
> >>> 
> >>> I'm using Undertow 1.3.22, which uses XNIO 3.3.6 - on Linux kernel
> >>> 2.6.32, though.
> >>> 
> >>> Digging into my problem more, I would probably need to balance on more
> >>> than just connection count per IO thread, as some connections are busier
> >>> than others. 
> >>> 
> >>> Can you point me towards any references about the forthcoming access
> >>> to the native facility? I'm curious how that will work.
> >>> 
> >>> -pete
> >>> 
> >>> -- 
> >>> (peter.royal|osi)@pobox.com - http://fotap.org/~osi
> >>> 
> >>> On Thu, Jun 16, 2016, at 01:41 PM, Jason T. Greene wrote:
> >>>> We recently changed XNIO (3.3.3 and later) to balance connections by
> >>>> default, using a strategy similar to the new SO_REUSEPORT facility in
> >>>> the Linux kernel. In the near future, we will switch to the native
> >>>> facility when it is accessible in the JDK NIO implementation. Older
> >>>> versions had a feature called balancing tokens that you could use to
> >>>> balance connections fairly, but it had to be explicitly configured.
> >>>> 
> >>>> 
> >>>>> On Jun 16, 2016, at 1:00 PM, peter royal <peter.royal at pobox.com> wrote:
> >>>>> 
> >>>>> (I believe the following is true... please correct me if not!)
> >>>>> 
> >>>>> I have an application that makes heavy use of WebSockets. It is an
> >>>>> internal application with a small number of connections and
> >>>>> reasonable load on each.
> >>>>> 
> >>>>> When Undertow receives a new connection, an XNIO IO thread is
> >>>>> assigned to it at connection time. Due to chance, this is causing
> >>>>> uneven load across my IO threads.
> >>>>> 
> >>>>> I'm increasing the number of IO threads as a temporary fix, but it
> >>>>> might be useful to be able to either migrate a long-lived connection
> >>>>> to another IO thread (harder) or do better load balancing amongst IO
> >>>>> threads. For the latter, if Undertow were able to accept a strategy
> >>>>> for picking a thread in NioXnioWorker.getIoThread(hashCode), it could
> >>>>> try to pick the thread that has the fewest connections assigned to it.
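> >>>>> 
> >>>>> As a straw man, the hook could be as small as this (the interface is
> >>>>> hypothetical, not existing XNIO API):
> >>>>> 
> >>>>>     // Hypothetical pluggable selector for NioXnioWorker.getIoThread.
> >>>>>     interface IoThreadSelector {
> >>>>>         // Return the index of the IO thread to assign.
> >>>>>         int select(int threadCount, int hashCode);
> >>>>>     }
> >>>>> 
> >>>>>     // Today's behavior, expressed as the default strategy:
> >>>>>     IoThreadSelector hashBased =
> >>>>>             (count, hash) -> (hash & Integer.MAX_VALUE) % count;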
> >>>>> 
> >>>>> Has anyone else run into this problem? Would a fix be accepted?
> >>>>> 
> >>>>> -pete
> >>>>> 
> >>>>> -- 
> >>>>> (peter.royal|osi)@pobox.com - http://fotap.org/~osi
> >>>>> _______________________________________________
> >>>>> undertow-dev mailing list
> >>>>> undertow-dev at lists.jboss.org
> >>>>> https://lists.jboss.org/mailman/listinfo/undertow-dev
> >> 
> >> --
> >> Jason T. Greene
> >> WildFly Lead / JBoss EAP Platform Architect
> >> JBoss, a division of Red Hat
> >> 
> 
> --
> Jason T. Greene
> WildFly Lead / JBoss EAP Platform Architect
> JBoss, a division of Red Hat
> 


