Are you using WildFly or Undertow standalone?
If you are using Undertow standalone, you might want to try enabling dispatch
to the worker pool (this is the default on WildFly):

    webSocketDeploymentInfo.setDispatchToWorkerThread(true)
If your message handlers use significant CPU time or block (which a disparity
like the one you are seeing could indicate), they impair the I/O thread's
ability to handle connection events efficiently. Dispatching to the worker
pool lets long-running tasks execute without interfering with other
connections/activity.
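For an embedded deployment, a minimal sketch of wiring that flag in might look like the following. The deployment plumbing (context path, deployment name, listener) is illustrative, and the setter is written here as setDispatchToWorker, the name I'd expect on WebSocketDeploymentInfo; verify it (and the setDispatchToWorkerThread spelling quoted above) against your Undertow version:

    import io.undertow.Undertow;
    import io.undertow.servlet.Servlets;
    import io.undertow.servlet.api.DeploymentInfo;
    import io.undertow.servlet.api.DeploymentManager;
    import io.undertow.websockets.jsr.WebSocketDeploymentInfo;

    public class WsServer {
        public static void main(String[] args) throws Exception {
            // Run WebSocket message handlers on the worker pool instead of
            // the I/O thread (setter name may differ by version; see above).
            WebSocketDeploymentInfo wsInfo = new WebSocketDeploymentInfo()
                    .setDispatchToWorker(true);
            // wsInfo.addEndpoint(MyEndpoint.class); // register your endpoint here

            DeploymentInfo deployment = Servlets.deployment()
                    .setClassLoader(WsServer.class.getClassLoader())
                    .setContextPath("/ws")
                    .setDeploymentName("ws-app")
                    .addServletContextAttribute(
                            WebSocketDeploymentInfo.ATTRIBUTE_NAME, wsInfo);

            DeploymentManager manager =
                    Servlets.defaultContainer().addDeployment(deployment);
            manager.deploy();

            Undertow.builder()
                    .addHttpListener(8080, "localhost")
                    .setHandler(manager.start())
                    .build()
                    .start();
        }
    }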
On Jun 16, 2016, at 2:36 PM, peter royal <peter.royal(a)pobox.com> wrote:
Understood.
I'm going to test with increased IO threads, and if that fixes things
I'm good. Thread user CPU time might be a good balancing metric; looking
at it, the imbalance is clear:
thread, user CPU
0, 2673514 ms
1, 31270 ms
2, 61962 ms
3, 7952561 ms
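For reference, a minimal sketch of collecting those numbers with the standard JMX thread bean follows; the "I/O" thread-name filter for picking out XNIO I/O threads is an assumption, adjust it to your naming:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class IoThreadCpuTimes {
        public static void main(String[] args) {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            if (!mx.isThreadCpuTimeSupported()) {
                System.err.println("per-thread CPU time not supported on this JVM");
                return;
            }
            for (long id : mx.getAllThreadIds()) {
                ThreadInfo info = mx.getThreadInfo(id);
                // XNIO I/O threads are typically named like "XNIO-1 I/O-2";
                // the filter is an assumption about your thread naming.
                if (info == null || !info.getThreadName().contains("I/O")) {
                    continue;
                }
                long userNanos = mx.getThreadUserTime(id); // -1 if disabled
                if (userNanos >= 0) {
                    System.out.printf("%s: user CPU %d ms%n",
                            info.getThreadName(), userNanos / 1_000_000L);
                }
            }
        }
    }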
As I think through this more, optimal balancing requires pushing a lot
of application-specific info down low, because a given WS connection
might be high volume or not. It would be easier to migrate a connection
that is detected to be high volume to another IO thread, but that'd be a
hugely invasive change. The optimal strategy for me might just be to
have one IO thread per connection, as the counts aren't very high.
Thanks for the help!
--
(peter.royal|osi)(a)pobox.com - http://fotap.org/~osi
On Thu, Jun 16, 2016, at 02:17 PM, Jason Greene wrote:
> The way our current approach works, which is the same approach as
> SO_REUSEPORT's impl, is that address:port is hashed to select the
> destination; this is mainly so we can transition with no real behavioral
> surprises. If some connections last significantly longer than
> others, then you will eventually go out of balance, because the current
> allocation state isn't a factor in the decision. It's possible to do
> more advanced algorithms factoring in state, but once you do that you tie
> yourself to a single-threaded acceptor (although that's currently the case
> with our emulated SO_REUSEPORT implementation). For many workloads this
> won't matter though, as you need massive connection rates to hit the
> accept stability limits.
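As a rough illustration of the scheme described above (an assumed sketch, not actual XNIO code): the peer's address and port are hashed onto the I/O thread array, so selection is stateless and deterministic, and current load never enters the decision:

    import java.net.InetSocketAddress;

    public class HashSelect {
        // Assumed sketch of address:port hashing, not XNIO internals: the
        // same peer always lands on the same I/O thread, which is why
        // long-lived connections can drift out of balance over time.
        static int selectIoThread(InetSocketAddress peer, int ioThreadCount) {
            int hash = peer.getAddress().hashCode() * 31 + peer.getPort();
            return Math.floorMod(hash, ioThreadCount);
        }

        public static void main(String[] args) {
            InetSocketAddress peer = new InetSocketAddress("10.0.0.7", 52431);
            System.out.println("selected thread: " + selectIoThread(peer, 4));
        }
    }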
>
> Maybe you want to play with modifying QueuedTcpNioServer to compare a few
> different algorithms? You could try balancing active connection count as
> one strategy, and perhaps thread user CPU time as another. For both
> approaches you probably want the I/O threads individually updating a
> volatile statistic field as part of their standard work, and the
> accept-queuing thread scanning those values to select the best
> destination.
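A sketch of that statistic-scanning approach (hypothetical code, not QueuedTcpNioServer itself): each I/O thread publishes its connection count, and the single accept-queuing thread scans for the minimum. Swapping the counters for accumulated per-thread user CPU time gives the second strategy:

    import java.util.concurrent.atomic.AtomicIntegerArray;

    public class LeastLoadedSelector {
        // One slot per I/O thread, standing in for the per-thread volatile
        // statistic field described above (hypothetical, not XNIO code).
        private final AtomicIntegerArray connectionCounts;

        public LeastLoadedSelector(int ioThreadCount) {
            this.connectionCounts = new AtomicIntegerArray(ioThreadCount);
        }

        // Called by I/O thread i as part of its standard work.
        public void connectionOpened(int i) { connectionCounts.incrementAndGet(i); }
        public void connectionClosed(int i) { connectionCounts.decrementAndGet(i); }

        // Called by the single accept-queuing thread: scan the published
        // counts and pick the least-loaded thread. Slightly stale reads are
        // fine here; this is a heuristic, not an exact balance.
        public int selectThread() {
            int best = 0;
            int bestCount = connectionCounts.get(0);
            for (int i = 1; i < connectionCounts.length(); i++) {
                int c = connectionCounts.get(i);
                if (c < bestCount) { best = i; bestCount = c; }
            }
            return best;
        }

        public static void main(String[] args) {
            LeastLoadedSelector sel = new LeastLoadedSelector(4);
            sel.connectionOpened(0);
            sel.connectionOpened(0);
            sel.connectionOpened(1);
            System.out.println("next connection -> thread " + sel.selectThread());
        }
    }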
>
>> On Jun 16, 2016, at 2:01 PM, peter royal <peter.royal(a)pobox.com> wrote:
>>
>> Gotcha. I was digging through things and found the change where the new
>> strategy was introduced. With my current number of IO threads it is giving
>> uneven weightings:
>>
>> thread, connections
>> 0, 6
>> 1, 5
>> 2, 3
>> 3, 2
>>
>> I'm going to double my IO threads; the spread will still be less than
>> optimal, but improved:
>>
>> thread, connections
>> 0, 2
>> 1, 1
>> 2, 1
>> 3, 1
>> 4, 4
>> 5, 4
>> 6, 2
>> 7, 1
>>
>> Random is only slightly better, eyeballing things.
>>
>> I'm using Undertow 1.3.22, which uses XNIO 3.3.6, on Linux kernel 2.6.32
>> though.
>>
>> Digging into my problem more, I would probably need to balance on more
>> than just connection count per IO thread, as some connections are busier
>> than others.
>>
>> Can you point me towards any references about the forthcoming access to
>> the native facility? I'm curious how that will work.
>>
>> -pete
>>
>> --
>> (peter.royal|osi)(a)pobox.com - http://fotap.org/~osi
>>
>> On Thu, Jun 16, 2016, at 01:41 PM, Jason T. Greene wrote:
>>> We recently changed XNIO to balance connections by default, using a
>>> strategy similar to the new SO_REUSEPORT facility in the Linux kernel
>>> (XNIO 3.3.3 or later). In the near future, we will switch to the
>>> native facility once it is accessible in the JDK NIO implementation. Older
>>> versions had a feature called balancing tokens that you could use to
>>> balance connections fairly, but it had to be explicitly configured.
>>>
>>>
>>>> On Jun 16, 2016, at 1:00 PM, peter royal <peter.royal(a)pobox.com> wrote:
>>>>
>>>> (I believe the following is true... please correct me if not!)
>>>>
>>>> I have an application which heavily utilizes web sockets. It is an
>>>> internal application which uses a small number of connections with
>>>> reasonable load on each.
>>>>
>>>> When a new connection is received by Undertow, an XNIO IO thread is
>>>> assigned to the connection at connection time. This is causing uneven
>>>> load on my IO threads, due to chance.
>>>>
>>>> I'm increasing the number of IO threads as a temporary fix, but it might
>>>> be useful to be able to either migrate a long-lived connection to
>>>> another IO thread (harder) or do better load balancing amongst IO
>>>> threads. For the latter, if Undertow were able to provide a strategy for
>>>> picking a thread in NioXnioWorker.getIoThread(hashCode), it could try
>>>> to pick a thread that has fewer connections assigned to it.
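For the temporary fix mentioned above (embedded API; under WildFly the I/O thread count is configured on the worker instead), a minimal sketch using the standard Undertow builder, with a placeholder handler:

    import io.undertow.Undertow;
    import io.undertow.util.Headers;

    public class MoreIoThreads {
        public static void main(String[] args) {
            // More I/O threads lower the odds that the hash-based assignment
            // piles several busy connections onto one thread; it does not
            // fix the underlying balancing.
            Undertow server = Undertow.builder()
                    .addHttpListener(8080, "localhost")
                    .setIoThreads(8) // e.g. double the previous count
                    .setHandler(exchange -> {
                        exchange.getResponseHeaders()
                                .put(Headers.CONTENT_TYPE, "text/plain");
                        exchange.getResponseSender().send("ok");
                    })
                    .build();
            server.start();
        }
    }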
>>>>
>>>> Has anyone else run into this problem? Would a fix be accepted?
>>>>
>>>> -pete
>>>>
>>>> --
>>>> (peter.royal|osi)(a)pobox.com - http://fotap.org/~osi
>>>> _______________________________________________
>>>> undertow-dev mailing list
>>>> undertow-dev(a)lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/undertow-dev
>
> --
> Jason T. Greene
> WildFly Lead / JBoss EAP Platform Architect
> JBoss, a division of Red Hat
>
--
Jason T. Greene
WildFly Lead / JBoss EAP Platform Architect
JBoss, a division of Red Hat