We have ruled out JGroups as the cause of the problem, since the same configuration performs well when run locally (not on Amazon EC2).

Let me outline the testing more specifically:

I have created a very simple socket client and server to communicate with the Infinispan nodes. This provides a mechanism to connect and to send get and insert commands, together with the required data, to the targeted Infinispan nodes. These insertions and retrievals are then timed from the client. As it stands, this system works perfectly in a local environment on my own network. However, as soon as we attempt to test on the Amazon EC2 cloud, which is required for benchmarking against other products, the retrieval times jump from under a millisecond to around 160ms, depending on the value size and the number of nodes in the cluster.

The reason we are testing using this client -> server model is that we are also testing concurrency, to see what happens when we send thousands of requests from different sources.

I have used TCPPING both locally and on the Amazon cloud (as multicasting is not allowed in that environment), and the outcome is the same: perfect numbers locally, bad numbers remotely. This is proving to be quite a mystery.

I have uploaded the base code for my client and server: http://pastebin.org/54960.

Any clues?

On Wed, Nov 18, 2009 at 4:34 PM, Michael Lawson (mshindo) <michael@sphinix.com> wrote:
Are there any official socket clients available?


On Tue, Nov 17, 2009 at 11:40 PM, Manik Surtani <manik@jboss.org> wrote:

On 17 Nov 2009, at 04:54, Michael Lawson (mshindo) wrote:

The benchmarking in question consists of simple insertions and retrievals run via sockets. These benchmarks return better results when run on a local machine; however, the testing in question is being done on the Amazon EC2 cloud. Running on EC2 was a problem in itself, but I followed the instructions on a blog and used an XML file to configure the transport properties.

<config xmlns="urn:org:jgroups" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:org:jgroups file:schema/JGroups-2.8.xsd">
  <TCP bind_port="7800" />
  <TCPPING timeout="3000"
           initial_hosts="${jgroups.tcpping.initial_hosts:10.209.166.79[7800],10.209.198.176[7800],10.208.199.223[7800],10.208.190.224[7800],10.208.70.112[7800]}"
           port_range="1"
           num_initial_members="3"/>
  <MERGE2 max_interval="30000" min_interval="10000"/>
  <FD_SOCK/>
  <FD timeout="10000" max_tries="5" />
  <VERIFY_SUSPECT timeout="1500" />
  <pbcast.NAKACK use_mcast_xmit="false" gc_lag="0"
                 retransmit_timeout="300,600,1200,2400,4800"
                 discard_delivered_msgs="true"/>
  <UNICAST timeout="300,600,1200" />
  <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="400000"/>
  <pbcast.GMS print_local_addr="true" join_timeout="3000" view_bundling="true"/>
  <FC max_credits="2000000" min_threshold="0.10"/>
  <FRAG2 frag_size="60000" />
  <pbcast.STREAMING_STATE_TRANSFER/>
</config>
I have a theory that the introduction of TCPPING in the JGroups file is resulting in some form of polling before the actual get request is processed and returned. Could this be the case?

It could be - JGroups also has an experimental protocol called S3_PING which could help.  


Another approach for discovery in an EC2 environment is to use a GossipRouter, but I'd give S3_PING a try first.
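
For reference, here is a minimal sketch of what swapping TCPPING for S3_PING might look like in the configuration above. The bucket name and the credential placeholders are assumptions for illustration only, and the attribute names should be checked against the JGroups version in use, since S3_PING is still experimental:

```xml
<!-- Hypothetical replacement for the <TCPPING .../> element; all values are placeholders -->
<S3_PING location="my-jgroups-bucket"
         access_key="YOUR_ACCESS_KEY"
         secret_access_key="YOUR_SECRET_KEY"
         timeout="3000"
         num_initial_members="3"/>
```

The rest of the stack (TCP, MERGE2, FD_SOCK and so on) would stay unchanged; only the discovery protocol differs.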

Cheers
Manik




On Tue, Nov 17, 2009 at 12:03 AM, Manik Surtani <manik@jboss.org> wrote:
Hi Michael

Could you please detail your benchmark test a bit more?  We have done some internal benchmarks as well and things do look significantly different.  Could you also tell us which version you have been benchmarking?  We've made some significant changes to DIST between CR1 and CR2 with regard to performance.

FYI, we use the CacheBenchFwk [1] to help benchmark stuff; you may find this useful too.

Cheers
Manik

[1] http://cachebenchfwk.sourceforge.net


On 15 Nov 2009, at 22:00, Michael Lawson (mshindo) wrote:

> Hi,
> I have been performing some benchmark testing on Infinispan running in distributed mode, with some unexpected results.
>
> For an insertion with a key size of 100 bytes and a value size of 100 bytes, the insertion time was 0.13ms and the retrieval time was 128.06ms.
>
> Communication with the Infinispan nodes is being done via a socket interface, using standard Java serialization.
>
> The retrieval time is consistently high in comparison to other systems, and I am wondering whether there are some other benchmark reports floating around that I can compare results with.
>
> --
> Michael Lawson
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Manik Surtani
manik@jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org





--
Michael Lawson (mshindo)



