[jboss-jira] [JBoss JIRA] Updated: (JGRP-1223) STREAMING_STATE_TRANSFER abysmally slow

Kornelius Elstner (JIRA) jira-events at lists.jboss.org
Tue Jun 29 10:59:46 EDT 2010


     [ https://jira.jboss.org/browse/JGRP-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kornelius Elstner updated JGRP-1223:
------------------------------------

    Steps to Reproduce: 
Run two instances of StateTransferTest.java (see attached file) with the included config. After one minute the second instance will (typically) end up with an empty map:

INFO  JChannel - JGroups version: 2.8.0.GA

-------------------------------------------------------------------
GMS: address=WZUR4269297-5387, cluster=state-transfer-test, physical address=165.222.17.200:2599
-------------------------------------------------------------------
INFO  ReplicatedHashMap - state could not be retrieved (first member)
Loading state took: 60135ms
Shared map has 0 entries


With the included patch the state is loaded successfully:

INFO  JChannel - JGroups version: 2.8.0.GA

-------------------------------------------------------------------
GMS: address=WZUR4269297-19455, cluster=state-transfer-test, physical address=165.222.17.200:2654
-------------------------------------------------------------------
INFO  ReplicatedHashMap - state was retrieved successfully, waiting for setState()
INFO  ReplicatedHashMap - setState() was called
Loading state took: 1484ms
Shared map has 10000 entries


  was:
run two instances of:

======================= StateTransferTest.java: ===============================
import java.io.File;
import java.io.IOException;
import java.util.Random;

import org.jgroups.Channel;
import org.jgroups.ChannelException;
import org.jgroups.JChannel;
import org.jgroups.blocks.ReplicatedHashMap;
import org.jgroups.util.UUID;

public class StateTransferTest {
	private final ReplicatedHashMap<String, byte[]> sharedCache;
	private final Channel channel;

	private final Random random = new Random();

	private final boolean isCoordinator;

	private byte[] makeRandom(final int size) {
		final byte[] ret = new byte[size];

		random.nextBytes(ret);
		return ret;
	}

	public StateTransferTest() throws ChannelException {
		channel = new JChannel(new File("jgroups-config.xml"));
		channel.connect("state-transfer-test");
		isCoordinator = channel.getView().getMembers().get(0).equals(channel.getAddress());
		sharedCache = new ReplicatedHashMap<String, byte[]>(channel);
		sharedCache.setBlockingUpdates(true);
		final long startTime = System.currentTimeMillis();
		sharedCache.start(60000);
		final long endTime = System.currentTimeMillis();

		System.out.println("Loading state took: " + (endTime - startTime) + "ms");

		if (isCoordinator) {
			for (int i = 0; i < 10000; i++) {
				final byte[] sub = makeRandom(5000);
				sharedCache.put(UUID.randomUUID().toString(), sub);
			}
			System.out.println("Created shared data");
		}
		System.out.println("Shared map has " + sharedCache.size() + " entries");
	}

	private void stop() {
		channel.disconnect();
	}

	public static void main(final String[] args) throws IOException, ChannelException {
		final StateTransferTest stateTransferTest = new StateTransferTest();

		if (!stateTransferTest.isCoordinator) {
			stateTransferTest.stop();
		}
	}
}
===============================================================================

with the following config:
============================= jgroups-config.xml ==============================
<?xml version="1.0" encoding="UTF-8"?>
<config xmlns="urn:org:jgroups"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-2.8.xsd">
    <UDP
         mcast_port="45000"
         tos="8"
         ucast_recv_buf_size="20M"
         ucast_send_buf_size="640K"
         mcast_recv_buf_size="25M"
         mcast_send_buf_size="640K"
         loopback="false"
         discard_incompatible_packets="true"
         max_bundle_size="64K"
         max_bundle_timeout="30"
         ip_ttl="2"
         enable_bundling="true"
         enable_diagnostics="true"
         thread_naming_pattern="cl"
         timer.num_threads="4"

         thread_pool.enabled="true"
         thread_pool.min_threads="2"
         thread_pool.max_threads="8"
         thread_pool.keep_alive_time="5000"
         thread_pool.queue_enabled="true"
         thread_pool.queue_max_size="10000"
         thread_pool.rejection_policy="discard"

         oob_thread_pool.enabled="true"
         oob_thread_pool.min_threads="1"
         oob_thread_pool.max_threads="8"
         oob_thread_pool.keep_alive_time="5000"
         oob_thread_pool.queue_enabled="false"
         oob_thread_pool.queue_max_size="100"
         oob_thread_pool.rejection_policy="Run"/>

    <PING timeout="2000"
            num_initial_members="3"/>
   	<MERGE2 max_interval="30000" min_interval="10000"/>
   	<FD_SOCK/>
   	<FD timeout="10000" max_tries="5"/>
   	<VERIFY_SUSPECT timeout="1500"/>
   	<BARRIER/>
   	<pbcast.NAKACK use_mcast_xmit="false" gc_lag="0"
   		retransmit_timeout="300,600,1200,2400,4800"
   		discard_delivered_msgs="true" />
   	<UNICAST timeout="300,600,1200" />
   	<pbcast.STABLE stability_delay="1000"
   		desired_avg_gossip="50000" max_bytes="400000" />
  		<VIEW_SYNC avg_send_interval="60000"/>
   	<pbcast.GMS print_local_addr="true" join_timeout="3000"
   		view_bundling="true" />
   	<FC max_credits="20000000" min_threshold="0.10" />
   	<FRAG2 frag_size="60000" />
   	<pbcast.STREAMING_STATE_TRANSFER />
</config>
===============================================================================

After one minute the second instance will (typically) end up with an empty map:

INFO  JChannel - JGroups version: 2.8.0.GA

-------------------------------------------------------------------
GMS: address=WZUR4269297-5387, cluster=state-transfer-test, physical address=165.222.17.200:2599
-------------------------------------------------------------------
INFO  ReplicatedHashMap - state could not be retrieved (first member)
Loading state took: 60135ms
Shared map has 0 entries


After applying the patch:

===============================================================================
--- JGroups-2.8.1.GA.src\src\org\jgroups\protocols\pbcast\STREAMING_STATE_TRANSFER.java	2010-04-30 13:52:36.000000000 +0200
+++ patch\src\org\jgroups\protocols\pbcast\STREAMING_STATE_TRANSFER.java	2010-06-29 16:43:45.753146800 +0200
@@ -724,17 +724,17 @@ public class STREAMING_STATE_TRANSFER ex
         }
 
         public void write(byte[] b, int off, int len) throws IOException {
-            super.write(b, off, len);
+            out.write(b, off, len);
             bytesWrittenCounter += len;
         }
 
         public void write(byte[] b) throws IOException {
-            super.write(b);
+            out.write(b);
             bytesWrittenCounter += b.length;
         }
 
         public void write(int b) throws IOException {
-            super.write(b);
+            out.write(b);
             bytesWrittenCounter += 1;
         }
     }
===============================================================================

The state is loaded successfully:

INFO  JChannel - JGroups version: 2.8.0.GA

-------------------------------------------------------------------
GMS: address=WZUR4269297-19455, cluster=state-transfer-test, physical address=165.222.17.200:2654
-------------------------------------------------------------------
INFO  ReplicatedHashMap - state was retrieved successfully, waiting for setState()
INFO  ReplicatedHashMap - setState() was called
Loading state took: 1484ms
Shared map has 10000 entries




> STREAMING_STATE_TRANSFER abysmally slow
> ---------------------------------------
>
>                 Key: JGRP-1223
>                 URL: https://jira.jboss.org/browse/JGRP-1223
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.8
>         Environment: Windows XP
>            Reporter: Kornelius Elstner
>            Assignee: Bela Ban
>         Attachments: JGRP-1223.zip
>
>
> With a relatively large shared state (anything in excess of a few MBytes) STREAMING_STATE_TRANSFER is very slow, it can take minutes for the state transfer to complete. The underlying issue is that STREAMING_STATE_TRANSFER.StreamingOutputStreamWrapper subclasses java.io.FilterOutputStream which always calls write(int b), even if buffers are to be written to the underlying socket.
> As a consequence each byte to be transferred results in a call to write() to the underlying socket.
> The fix is simple, the byte array variants of StreamingOutputStreamWrapper.write() should delegate to the wrapped socket directly.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


More information about the jboss-jira mailing list