[
https://issues.redhat.com/browse/JGRP-2504?page=com.atlassian.jira.plugin...
]
Andrew Skalski commented on JGRP-2504:
--------------------------------------
This change doesn't appear to be working yet. I looked at what it's doing using
strace:
{code:java}
$ strace -f -o strace.out java -Djgroups.tcpping.initial_hosts=jgroups-west[7800],jgroups-east[7800] -cp jgroups-5.1.0.Alpha1-SNAPSHOT.jar:. SpeedTest <<< quit
{code}
The receive buffer is still not being configured ahead of bind/listen:
{code:java}
7241 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 9
7241 setsockopt(9, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
7241 fcntl(9, F_GETFL) = 0x2 (flags O_RDWR)
7241 fcntl(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0
7241 setsockopt(9, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
7241 pread64(4, "\312\376\272\276\0\0\0007\0W\n\0002\0003\t\0\4\0004\v\0005\0006\7\0007\n\08\0"..., 1918, 9994511) = 1918
7241 pread64(4, "\312\376\272\276\0\0\0007\0/\n\0\10\0$\t\0\7\0%\n\0\t\0&\n\0\t\0'\7\0"..., 985, 16509997) = 985
7241 pread64(4, "\312\376\272\276\0\0\0007\1]\n\0Y\0\272\n\0\273\0\274\10\0\275\n\0}\0\276\t\0\30\0"..., 7728, 16686562) = 7728
7241 pread64(4, "\312\376\272\276\0\0\0007\0\33\n\0\3\0\26\7\0\27\7\0\30\1\0\6<init>\1\0"..., 536, 16509461) = 536
7241 bind(9, {sa_family=AF_INET6, sin6_port=htons(7800), inet_pton(AF_INET6, "::ffff:69.164.216.53", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, 28) = 0
7241 listen(9, 50) = 0
{code}
Looking at the code, note [this
section|https://github.com/belaban/JGroups/blob/master/src/org/jgroups/pr...];
the TcpServer constructor (which tries to pass recv_buf_size along to
Util.createServerSocket) is run before recv_buf_size is set.
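For illustration, here is a minimal standalone sketch of the ordering that would show up in strace as setsockopt(SO_RCVBUF) before bind/listen. It uses only java.net.ServerSocket; the class name ListenBufferSketch and the helper openListener are hypothetical and are not part of JGroups, and the 1 MiB buffer is an arbitrary example value.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ListenBufferSketch {
    // Hypothetical helper (not JGroups code): create the socket unbound,
    // apply SO_RCVBUF, and only then bind, which also starts listening.
    static ServerSocket openListener(int port, int recvBufSize) throws IOException {
        ServerSocket srv = new ServerSocket();       // unbound: no bind()/listen() yet
        srv.setReuseAddress(true);
        if (recvBufSize > 0)
            srv.setReceiveBufferSize(recvBufSize);   // requested size; the OS may adjust it
        srv.bind(new InetSocketAddress(port), 50);   // backlog of 50, as in the trace above
        return srv;
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS for any free port; 1 MiB is an example value only.
        try (ServerSocket srv = openListener(0, 1 << 20)) {
            System.out.println("bound=" + srv.isBound()
                    + " rcvbuf=" + srv.getReceiveBufferSize());
        }
    }
}
```

The value reported by getReceiveBufferSize() is whatever the OS actually applied, so it may differ from the requested size.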
As for why the buffer size needs to be configured on the listening socket (rather than
the accepted socket): although I haven't yet read through the kernel sources to fully
understand what's going on, I believe at least part of it has to do with negotiating the
TCP window scaling option, which happens during the connection handshake.
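To make the window scaling point concrete, here is a rough sketch of the arithmetic from the TCP window scale option (RFC 7323): the window field in the TCP header is 16 bits, and each side announces a shift count (at most 14) in its SYN or SYN-ACK that stays fixed for the life of the connection. The class and method names are hypothetical, and this only illustrates the protocol arithmetic; it is not a description of how the Linux kernel actually picks the value.

```java
public class WindowScaleSketch {
    static final int MAX_UNSCALED_WINDOW = 65_535; // largest value of the 16-bit window field
    static final int MAX_SHIFT = 14;               // upper limit on the shift count in RFC 7323

    // Smallest shift count that lets a receiver advertise a window covering
    // bufferBytes. The shift is exchanged once, during the handshake.
    static int shiftFor(long bufferBytes) {
        int shift = 0;
        while (shift < MAX_SHIFT && ((long) MAX_UNSCALED_WINDOW << shift) < bufferBytes)
            shift++;
        return shift;
    }

    public static void main(String[] args) {
        System.out.println("64 KiB - 1 -> shift " + shiftFor(65_535));
        System.out.println("1 MiB      -> shift " + shiftFor(1 << 20));
    }
}
```

If the shift is derived from the buffer size known at handshake time, a buffer enlarged only after accept() cannot raise it, which would be consistent with the behavior described in this issue.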
By the way, if you don't have two geographically separated servers, you can create
artificial latency using the Linux Netem module. (The kernel module name is sch_netem and
is included in the kernel-modules-extra RPM; you will also need the iproute-tc RPM.) The
syntax is arcane, so I will attach a shell script to this ticket.
One other thing: I'm not sure where the best place to report this is, but the "Bug Reports"
page on jgroups.org links to an old Jira URL at jira.jboss.com that no longer works.
Poor throughput over high latency TCP connection when recv_buf_size is configured
---------------------------------------------------------------------------------
Key: JGRP-2504
URL: https://issues.redhat.com/browse/JGRP-2504
Project: JGroups
Issue Type: Bug
Affects Versions: 5.0.0.Final
Reporter: Andrew Skalski
Assignee: Bela Ban
Priority: Minor
Fix For: 5.1
Attachments: SpeedTest.java, bla5.java, bla6.java, bla7.java
I recently finished troubleshooting a unidirectional throughput bottleneck involving a
JGroups application (Infinispan) communicating over a high-latency (~45 milliseconds) TCP
connection.
The root cause was JGroups improperly configuring the receive/send buffers on the
listening socket. According to the tcp(7) man page:
{code:java}
On individual connections, the socket buffer size must be set prior to
the listen(2) or connect(2) calls in order to have it take effect.
{code}
However, JGroups does not set the buffer size on the listening side until after
accept().
The result is poor throughput when sending data from the client (connecting side) to the
server (listening side). Because the issue is a too-small TCP receive window, throughput
is ultimately latency-bound.
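As a rough illustration of why a small receive window makes throughput latency-bound: a sender can have at most one window of unacknowledged data in flight per round trip, so throughput cannot exceed window / RTT. The sketch below assumes the ~45 ms figure is the round-trip time and uses 65,535 bytes (the largest window without scaling) and 4 MiB as example window sizes; those sizes and the class name are assumptions, not measurements from this issue.

```java
public class WindowBoundSketch {
    // Upper bound on throughput when at most windowBytes can be in flight
    // per round trip of rttMillis milliseconds.
    static double maxBytesPerSecond(long windowBytes, double rttMillis) {
        return windowBytes / (rttMillis / 1000.0);
    }

    public static void main(String[] args) {
        double rttMillis = 45.0; // assumed round-trip time
        long[] windows = {65_535L, 4L * 1024 * 1024}; // example window sizes only
        for (long w : windows) {
            double bps = maxBytesPerSecond(w, rttMillis);
            System.out.printf("window=%d bytes -> at most %.2f MB/s%n", w, bps / 1e6);
        }
    }
}
```

Under these assumptions the unscaled window caps a single connection at about 1.46 MB/s, regardless of link bandwidth.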