Bela Ban commented on JGRP-2065:
--------------------------------
Created a bunch of additional bundlers: {{RingBufferBundler}},
{{RingBufferBundlerLockless}} and {{RingBufferBundlerLockless2}}.
This needs to be revisited, but is postponed to 4.1, as I need to get back to
releasing 4.0 (and its API changes).
RoundTrip: latency is high compared to RoundTripTcp/RoundTripServer
-------------------------------------------------------------------
Key: JGRP-2065
URL:
https://issues.jboss.org/browse/JGRP-2065
Project: JGroups
Issue Type: Task
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 4.0
{{RoundTrip}} is a simple test between 2 members that measures round-trip latency. The
sender sends a message, the receiver receives it and sends back a response, and the
sender unblocks when the response has been received. Then the sender sends the next
message. The time for each request is logged at the sender and min/avg/max values are
computed (probably to be changed to histograms later).
{{RoundTrip}} uses a JGroups channel, configured with {{-props}}, e.g. {{-props
~/tcp.xml}}.
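For reference, the core of {{RoundTrip}} boils down to a loop like the one below. This
is only a sketch against the 4.x channel API (class name, cluster name and message count
are made up; the real test ships with JGroups):
{code:java}
import org.jgroups.*;
import java.util.concurrent.SynchronousQueue;

// Illustrative sketch of the ping-pong loop over a channel (JGroups 4.x API).
public class RoundTripSketch extends ReceiverAdapter {
    protected JChannel ch;
    protected boolean  sender;
    protected final SynchronousQueue<Boolean> responses = new SynchronousQueue<>();

    public void start(String props, boolean sender) throws Exception {
        this.sender = sender;
        ch = new JChannel(props);
        ch.setReceiver(this);
        ch.connect("rt");
        if (!sender)
            return;                      // the responder only echoes (in receive())
        while (ch.getView().size() < 2)  // wait until the second member has joined
            Thread.sleep(100);
        Address target = ch.getView().getMembers().stream()
            .filter(a -> !a.equals(ch.getAddress())).findFirst().get();
        final int NUM = 20_000;
        long min = Long.MAX_VALUE, max = 0, total = 0;
        byte[] payload = {0};
        for (int i = 0; i < NUM; i++) {
            long start = System.nanoTime();
            ch.send(new Message(target, payload));
            responses.take();            // unblocks when the echo has arrived
            long rtt = System.nanoTime() - start;
            min = Math.min(min, rtt); max = Math.max(max, rtt); total += rtt;
        }
        System.out.printf("min/avg/max = %d / %d / %d ns%n", min, total / NUM, max);
    }

    @Override public void receive(Message msg) {
        try {
            if (sender)
                responses.put(Boolean.TRUE);                         // echo received
            else
                ch.send(new Message(msg.getSrc(), msg.getBuffer())); // send echo back
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
{code}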
{{RoundTripTcp}} does the same, but uses direct TCP sockets (no JGroups) for
communication.
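For comparison, the bare-socket variant needs little more than the following (minimal
sketch; the port number is arbitrary). {{TCP_NODELAY}} is enabled so Nagle's algorithm
does not distort the latency numbers:
{code:java}
import java.io.*;
import java.net.*;

// Bare-socket ping-pong: one byte out, one byte back.
public class TcpPingPong {
    public static void main(String[] args) throws IOException {
        if (args.length == 0) { // server: echo loop
            try (ServerSocket srv = new ServerSocket(7500);
                 Socket s = srv.accept()) {
                s.setTcpNoDelay(true);
                InputStream in = s.getInputStream();
                OutputStream out = s.getOutputStream();
                int b;
                while ((b = in.read()) != -1)
                    out.write(b);
            }
        } else {                // client: measure round trips
            try (Socket s = new Socket(args[0], 7500)) {
                s.setTcpNoDelay(true);
                InputStream in = s.getInputStream();
                OutputStream out = s.getOutputStream();
                final int NUM = 20_000;
                long start = System.nanoTime();
                for (int i = 0; i < NUM; i++) {
                    out.write(1);
                    in.read();
                }
                long avg = (System.nanoTime() - start) / NUM;
                System.out.println("avg round trip: " + avg / 1000 + " us");
            }
        }
    }
}
{code}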
{{RoundTripServer}} uses the client-server classes of JGroups for communication, but no
channel is used.
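A minimal echo server on top of these classes looks roughly like this (sketch only,
using the {{org.jgroups.blocks.cs}} API; not the actual {{RoundTripServer}} code):
{code:java}
import org.jgroups.Address;
import org.jgroups.blocks.cs.*;
import java.net.InetAddress;

// Echo server built on the client-server classes also used by the TCP transport.
public class CsEchoServer extends ReceiverAdapter {
    protected TcpServer server;

    public void start(InetAddress bind_addr, int port) throws Exception {
        server = new TcpServer(bind_addr, port);
        server.receiver(this); // deliver received data to this instance
        server.start();
    }

    // Echo every request straight back to its sender
    @Override
    public void receive(Address sender, byte[] buf, int offset, int length) {
        try {
            server.send(sender, buf, offset, length);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
{code}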
Round trip times (both processes on the same box, a Mac mini):
* {{RoundTrip}} (with {{tcp.xml}} shipped with JGroups): *110 us*
* {{RoundTripTcp}}: *46 us*
* {{RoundTripServer}}: *49 us*
Note that the client/server classes used by {{RoundTripServer}} are also used by the TCP
transport (configured in {{tcp.xml}}).
{{RoundTripServer}} is ~6% slower than {{RoundTripTcp}}, but that can be attributed to
the additional work the former has to do (e.g. connection reaping). This is something
we can focus on later.
The big outlier is the 110 us for {{RoundTrip}}. The goal is to find out what causes
this and eliminate it. Since {{RoundTrip}} and {{RoundTripServer}} use the same underlying
client/server classes in JGroups, let's compare these two.
Tasks:
* Remove all protocols other than TCP from the running stack (e.g. {{probe.sh
remove-protocol=MFC}}). I already did this and the difference was negligible, but let's
run it again
* Try various bundlers (e.g. {{NoBundler}})
* Reduce the number of threads in the thread pools, possibly disable the (regular and
OOB) thread pools altogether by replacing them with a {{DirectExecutor}} (see the sketch
after this list)
* The default is 1 sender thread, but try with multiple threads
* Measure the time between sending a message (in the bundler) and receiving it (see the
timestamping sketch after this list)
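Regarding the thread pool task: a direct executor simply runs the task on the caller's
thread, removing the handoff (queueing plus context switch) between the transport thread
and the pool, at the cost of blocking the transport thread while the message is
processed. A minimal version, mirroring what {{org.jgroups.util.DirectExecutor}} does:
{code:java}
import java.util.concurrent.Executor;

// Runs the task inline on the calling thread: no queue, no extra thread.
public class DirectExecutor implements Executor {
    @Override public void execute(Runnable command) {
        command.run();
    }
}
{code}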
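For the last task, note that {{System.nanoTime()}} values are only comparable within a
single JVM, so send timestamps must never be diffed across processes. One simple
approach (sketch; the helper class below is made up): stamp the payload at send time,
have the receiver echo it back, and halve the measured round trip as a one-way estimate:
{code:java}
import java.nio.ByteBuffer;

// Helper for estimating one-way latency from an echoed timestamp.
public final class Timestamps {
    // Payload to send: the sender's current nanoTime()
    public static byte[] stamped() {
        return ByteBuffer.allocate(Long.BYTES).putLong(System.nanoTime()).array();
    }

    // Called on the original sender when the echoed payload returns;
    // valid because both timestamps come from the same JVM's clock.
    public static long oneWayEstimateNanos(byte[] echoedPayload) {
        long sent = ByteBuffer.wrap(echoedPayload).getLong();
        return (System.nanoTime() - sent) / 2;
    }
}
{code}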