[JBoss JIRA] (JGRP-2504) Poor throughput over high latency TCP connection when recv_buf_size is configured
by Bela Ban (Jira)
[ https://issues.redhat.com/browse/JGRP-2504?page=com.atlassian.jira.plugin... ]
Bela Ban commented on JGRP-2504:
--------------------------------
Thanks for the detailed report! I wish every bug report was as succinct and precise as this one! I'll take a look tomorrow.
Cheers,
> Poor throughput over high latency TCP connection when recv_buf_size is configured
> ---------------------------------------------------------------------------------
>
> Key: JGRP-2504
> URL: https://issues.redhat.com/browse/JGRP-2504
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 5.0.0.Final
> Reporter: Andrew Skalski
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 5.1
>
> Attachments: SpeedTest.java
>
>
> I recently finished troubleshooting a unidirectional throughput bottleneck involving a JGroups application (Infinispan) communicating over a high-latency (~45 milliseconds) TCP connection.
> The root cause was JGroups improperly configuring the receive/send buffers on the listening socket. According to the tcp(7) man page:
> {code:java}
> On individual connections, the socket buffer size must be set prior to
> the listen(2) or connect(2) calls in order to have it take effect.
> {code}
> However, JGroups does not set the buffer size on the listening side until after accept().
> The result is poor throughput when sending data from the client (connecting side) to the server (listening side). Because the issue is a too-small TCP receive window, throughput is ultimately latency-bound.
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (JGRP-2504) Poor throughput over high latency TCP connection when recv_buf_size is configured
by Bela Ban (Jira)
[ https://issues.redhat.com/browse/JGRP-2504?page=com.atlassian.jira.plugin... ]
Bela Ban updated JGRP-2504:
---------------------------
Fix Version/s: 5.1
[JBoss JIRA] (JGRP-2504) Poor throughput over high latency TCP connection when recv_buf_size is configured
by Andrew Skalski (Jira)
Andrew Skalski created JGRP-2504:
------------------------------------
Summary: Poor throughput over high latency TCP connection when recv_buf_size is configured
Key: JGRP-2504
URL: https://issues.redhat.com/browse/JGRP-2504
Project: JGroups
Issue Type: Bug
Affects Versions: 5.0.0.Final
Reporter: Andrew Skalski
Assignee: Bela Ban
Attachments: SpeedTest.java
I recently finished troubleshooting a unidirectional throughput bottleneck involving a JGroups application (Infinispan) communicating over a high-latency (~45 milliseconds) TCP connection.
The root cause was JGroups improperly configuring the receive/send buffers on the listening socket. According to the tcp(7) man page:
{code:java}
On individual connections, the socket buffer size must be set prior to
the listen(2) or connect(2) calls in order to have it take effect.
{code}
However, JGroups does not set the buffer size on the listening side until after accept().
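To illustrate the ordering the man page describes, here is a minimal sketch using plain java.net. It is not JGroups code and not necessarily how the fix will be implemented; the class and method names are hypothetical. The Javadoc for ServerSocket.setReceiveBufferSize makes the same point: to allow a receive window larger than 64K, the proposed value must be set on the ServerSocket before it is bound.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ListenBufferOrder {
    /**
     * Opens a listening socket whose receive buffer size is requested
     * before bind(), so that accepted connections inherit it.
     * (Hypothetical helper for illustration only.)
     */
    static ServerSocket listenWithBuffer(int port, int recvBufSize) throws IOException {
        ServerSocket server = new ServerSocket();   // created unbound: no bind/listen yet
        server.setReceiveBufferSize(recvBufSize);   // a hint; the OS may clamp or adjust it
        server.bind(new InetSocketAddress(port));   // bind and listen happen here
        return server;
    }
}
```

Calling setReceiveBufferSize on the Socket returned by accept() still changes the buffer, but by then the connection's window parameters have already been negotiated, which matches the behaviour described here.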
The result is poor throughput when sending data from the client (connecting side) to the server (listening side). Because the issue is a too-small TCP receive window, throughput is ultimately latency-bound.
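As a rough illustration of why a small window makes throughput latency-bound: a sender can have at most one window of data in flight per round trip, so throughput is capped at window / RTT. The sketch below works through that arithmetic. The 64 KiB window and the 1 Gbit/s target are assumed values chosen for illustration, not measurements from this report; only the ~45 ms latency comes from the description above.

```java
public class WindowBound {
    /** Upper bound on throughput (bytes/s) when at most windowBytes are in flight per round trip. */
    static double maxThroughputBytesPerSec(long windowBytes, double rttSeconds) {
        return windowBytes / rttSeconds;
    }

    /** Window (bytes) needed to sustain a target rate over a given RTT (bandwidth-delay product). */
    static double requiredWindowBytes(double bytesPerSec, double rttSeconds) {
        return bytesPerSec * rttSeconds;
    }

    public static void main(String[] args) {
        double rtt = 0.045;                 // ~45 ms, as in the report
        long assumedWindow = 64 * 1024;     // assumption: 64 KiB window, for illustration
        double targetRate = 125_000_000.0;  // assumption: 1 Gbit/s expressed in bytes/s

        System.out.println("max bytes/s with assumed window: "
                + maxThroughputBytesPerSec(assumedWindow, rtt));
        System.out.println("window bytes needed for target rate: "
                + requiredWindowBytes(targetRate, rtt));
    }
}
```

Under these assumptions the cap is roughly 1.46 MB/s regardless of link speed, while filling a 1 Gbit/s link at 45 ms would need a window of about 5.6 MB.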
[JBoss JIRA] (WFLY-13897) infinispan-server instances provisioned by testsuite never shutdown
by Radoslav Husar (Jira)
[ https://issues.redhat.com/browse/WFLY-13897?page=com.atlassian.jira.plugi... ]
Radoslav Husar edited comment on WFLY-13897 at 9/29/20 1:47 PM:
----------------------------------------------------------------
There seem to be a couple of problems at play here:
1. The graceful shutdown doesn't work because the client doesn't authenticate, because there is no such configuration - Jira TBA.
2. When graceful shutdown doesn't succeed, the PID is not obtained on JDK8; this is also caused by ISPN-12366.
3. When the PID is obtained on JDK11, the kill command is incorrect.
4. If that is fixed, the kill command only kills the parent process (sh) and not the child process (java = the server).
was (Author: rhusar):
There seem to be a couple of problems at play here:
1. The graceful shutdown doesn't work because the client doesn't authenticate, because there is no such configuration - Jira TBA.
2. When graceful shutdown doesn't succeed, the PID is not obtained on JDK8; this is also caused by ISPN-12366.
3. When the PID is obtained on JDK11, the kill command is incorrect.
4. If that is fixed, the kill command only kills the parent process and not the child process.
> infinispan-server instances provisioned by testsuite never shutdown
> -------------------------------------------------------------------
>
> Key: WFLY-13897
> URL: https://issues.redhat.com/browse/WFLY-13897
> Project: WildFly
> Issue Type: Bug
> Components: Clustering, Test Suite
> Affects Versions: 21.0.0.Beta1
> Reporter: Paul Ferraro
> Assignee: Radoslav Husar
> Priority: Critical
>
> Running the clustering testsuite locally leaves 9 instances of infinispan-server running. This wreaks havoc on the CI, where the instances remain running until the VM shuts down.
[JBoss JIRA] (WFLY-13897) infinispan-server instances provisioned by testsuite never shutdown
by Radoslav Husar (Jira)
[ https://issues.redhat.com/browse/WFLY-13897?page=com.atlassian.jira.plugi... ]
Radoslav Husar commented on WFLY-13897:
---------------------------------------
There seem to be a couple of problems at play here:
1. The graceful shutdown doesn't work because the client doesn't authenticate, because there is no such configuration - Jira TBA.
2. When graceful shutdown doesn't succeed, the PID is not obtained on JDK8; this is also caused by ISPN-12366.
3. When the PID is obtained on JDK11, the kill command is incorrect.
4. If that is fixed, the kill command only kills the parent process and not the child process.
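For item 4, one possible approach (a sketch only, not what the testsuite or Infinispan currently does) is to signal the whole process tree rather than just the launcher. The example below uses the JDK's ProcessHandle API, which requires Java 9 or later and so would not help with the JDK8 case in item 2. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ProcessTreeKill {
    /**
     * Requests termination of the given process and all of its descendants.
     * Returns the number of process handles that were signalled.
     * (Hypothetical helper for illustration only.)
     */
    static int destroyTree(ProcessHandle root) {
        // Snapshot the descendants first: once the parent exits, its children
        // may be re-parented and no longer reachable from this handle.
        List<ProcessHandle> descendants = root.descendants().collect(Collectors.toList());
        for (ProcessHandle child : descendants) {
            child.destroy();   // normal termination request for each child
        }
        root.destroy();        // finally the parent (e.g. the launcher shell)
        return descendants.size() + 1;
    }
}
```

Given a Process started via ProcessBuilder, this could be invoked as destroyTree(process.toHandle()); destroyForcibly() could be substituted where a normal termination request is ignored.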
[JBoss JIRA] (WFLY-13897) infinispan-server instances provisioned by testsuite never shutdown
by Radoslav Husar (Jira)
[ https://issues.redhat.com/browse/WFLY-13897?page=com.atlassian.jira.plugi... ]
Radoslav Husar commented on WFLY-13897:
---------------------------------------
Other processes are leaking as well. After running the suite, I see:
{noformat}
51983 s002 S 0:00.00 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
52246 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
52953 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
53280 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
53924 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
54540 s002 S 0:00.00 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
54937 s002 S 0:00.00 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
55246 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
55874 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
73893 s002 S 0:00.00 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
74090 s002 S 0:00.00 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
74345 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
77211 s002 S 0:00.00 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
77571 s002 S 0:00.00 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
78067 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
78473 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
83664 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
84044 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
84562 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
85278 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
86133 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
87041 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
96331 s002 S 0:00.00 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
96494 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
96731 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
96924 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
97239 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
97656 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
97932 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
98136 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
98458 s002 S 0:00.01 tail -f /Users/rhusar/git/wildfly/testsuite/integration/clustering/target/infinispan-server-11.0.3.Final/server/log/server.log
{noformat}