Hi,
it's best to start by understanding how read and write operations
differ.
For a non-transactional sync read, if the entry happens to have a
copy on the local node, the read is satisfied immediately with no
need for network round-trips.
That implies several things:
- read performance on the order of millions/second (assuming small entries)
- a local hit ratio of 100% for replication
- a local hit ratio of 66.6% for distribution (assuming 2 owners: each
key is stored on 2 nodes out of 3; this ratio improves by setting more
owners and gets worse when adding more nodes)
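
To make the hit ratios above concrete, here is a minimal sketch. It
assumes keys are spread evenly across the cluster and that every node
is equally likely to perform the read; the function name
local_hit_ratio is made up for illustration and is not part of
Infinispan.

```python
from fractions import Fraction

def local_hit_ratio(num_nodes, num_owners):
    """Chance that the reading node already holds a copy of the key.

    Assumption: keys are evenly distributed and each key is stored on
    min(num_owners, num_nodes) distinct nodes.
    """
    copies = min(num_owners, num_nodes)
    return Fraction(copies, num_nodes)

# Replication behaves like num_owners == num_nodes.
for nodes, owners in [(3, 2), (3, 3), (4, 2), (100, 2)]:
    ratio = local_hit_ratio(nodes, owners)
    print(f"{nodes} nodes, {owners} owners: {float(ratio):.1%} local reads")
```

With 3 nodes and 2 owners this gives the 2/3 mentioned above; raising
the owner count pushes it towards 1, adding nodes pushes it towards 0.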
A non-transactional sync write, however, always requires at least one
network RPC, and sometimes two, to confirm safe storage on the other
owner(s). With 3 nodes and 2 owners:
- you have a 33% chance of being the primary owner of a key: needs one
RPC to the secondary owner
- 33% of being the secondary owner: needs one RPC to the primary owner
- 33% (remaining cases): needs 2 RPCs, one to the primary and one to
the secondary
Each of these cases has slightly different expected performance
metrics, but whichever applies, you should expect latency about two
orders of magnitude higher than for a read operation.
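
As a rough illustration, the three write cases above can be folded into
an expected number of RPCs per write. This is only a sketch of the
arithmetic for the 3-node, 2-owner example; the probabilities and RPC
counts are the ones listed above, and the names used are made up.

```python
from fractions import Fraction

# (probability, RPCs needed) for each role the writing node can have,
# taken from the three cases above (3 nodes, 2 owners).
cases = {
    "primary owner": (Fraction(1, 3), 1),
    "secondary owner": (Fraction(1, 3), 1),
    "not an owner": (Fraction(1, 3), 2),
}

def expected_rpcs(cases):
    """Probability-weighted average of RPCs per write."""
    return sum(prob * rpcs for prob, rpcs in cases.values())

print(expected_rpcs(cases))  # 4/3
```

So under these assumptions a write costs a bit more than one RPC on
average, and never zero, whereas two thirds of the reads cost none.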
Now in your scenario, assuming your application does a balanced mix of
reads and writes, and given that write performance doesn't
change when switching from DIST to REPL, it's possible the read
operations are somewhat faster (33% on average), but their latency is
so much lower than write latency that, unless you do a lot
more reads than writes, you may not perceive any
measurable difference.
The odds can change dramatically if you change some parameters:
- if you have 100s of nodes then REPL will still have a local
hit ratio of 100% while DIST's tends to zero, making a read similar to
a write in terms of performance
- if you change the ratio of reads vs writes
- if you set the number of owners to 3 and have 3 nodes, you
essentially get the same behaviour as REPL
Also keep in mind: DIST can store more data in memory and scales
writes linearly, while with REPL each write operation requires a
network RPC for each node in the grid.
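
A back-of-the-envelope sketch of that trade-off follows. It assumes a
REPL write contacts every other node, a DIST write contacts at most
num_owners nodes, and every node can hold the same number of entries;
all function names and the per-node capacity figure are hypothetical.

```python
def repl_write_rpcs(num_nodes):
    """RPCs per REPL write: one for every other node in the grid."""
    return num_nodes - 1

def dist_write_rpcs_max(num_nodes, num_owners):
    """Worst-case RPCs per DIST write: bounded by the owner count."""
    return min(num_owners, num_nodes - 1)

def dist_capacity(num_nodes, entries_per_node, num_owners):
    """Unique entries the grid can hold when each has num_owners copies."""
    return num_nodes * entries_per_node // min(num_owners, num_nodes)

ENTRIES_PER_NODE = 1_000_000  # hypothetical per-node capacity

for nodes in (3, 4, 10, 100):
    print(
        f"{nodes:>3} nodes: REPL {repl_write_rpcs(nodes)} RPCs/write, "
        f"{ENTRIES_PER_NODE} entries; "
        f"DIST <= {dist_write_rpcs_max(nodes, 2)} RPCs/write, "
        f"{dist_capacity(nodes, ENTRIES_PER_NODE, 2)} entries"
    )
```

The REPL write cost grows with the cluster while its capacity stays
flat; for DIST it is the other way around.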
So yes, it's possible that with 3 nodes the difference isn't noticeable,
but picking the right cache mode depends on statistical information
about the behaviour of your application, or more simply on whether you
have latency requirements on reads which require REPL.
--Sanne
On 13 August 2013 05:29, Faseela K <faseela.k(a)ericsson.com> wrote:
Hi,
I am using infinispan 5.2.3.
My configuration is non-transactional, synchronous.
With this configuration, is my replication supposed to perform better than
distribution, for both reads and writes?
My Cluster Size requirement is 4 nodes.
And my application involves both reads and writes heavily.
For better performance, are there any suggestions on the clustering
modes/configurations?
All my tests show replication performing better than distribution for reads
as well as writes, with 4 nodes.
Thanks,
Faseela
-----Original Message-----
From: infinispan-dev-bounces(a)lists.jboss.org
[mailto:infinispan-dev-bounces@lists.jboss.org] On Behalf Of Radim Vansa
Sent: Monday, August 12, 2013 6:13 PM
To: infinispan-dev(a)lists.jboss.org
Subject: Re: [infinispan-dev] Recommended Cluster Size for Replication Mode
Hi,
which version exactly do you use, 5.2.x, 5.3.x or 6.0.x? In 5.2 the replication mode was
implemented separately from distribution mode and depending on your configuration (is it
non-transactional synchronous?) the message control flow could differ. Since 5.3
replication mode is implemented in the same manner as distribution mode, so the results
should be more comparable.
I may be wrong here, but in 5.2.x concurrent writes to a single key in non-transactional
mode could result in entries being out of sync on some nodes (the writes could arrive at
two nodes in different order). I think this cannot happen in >= 5.3 anymore.
Radim
On 08/12/2013 09:04 AM, Faseela K wrote:
> Hi,
>
> With a 3 node cluster, even for "WRITES" my replication performance is
> better than distribution.
> That's why I have this doubt.
> Could somebody please clarify why the behaviour is like this?
>
> Thanks,
> Faseela
>
> -----Original Message-----
> From: infinispan-dev-bounces(a)lists.jboss.org
> [mailto:infinispan-dev-bounces@lists.jboss.org] On Behalf Of Mircea
> Markus
> Sent: Friday, August 09, 2013 7:14 PM
> To: infinispan -Dev List
> Subject: Re: [infinispan-dev] Recommended Cluster Size for Replication
> Mode
>
> On 6 Aug 2013, at 15:19, Faseela K <faseela.k(a)ericsson.com> wrote:
>
>> What is the recommended cluster size for Replication Mode?
>> Given 3 nodes, my replication configuration performs better than my
>> distributed configuration.
>> Just wanted to know, at what cluster size, distribution will perform better
>> than replication.
> There's no straight answer, it depends on the read/write ratio and the amount of
> data you store.
> Replication will always perform better for reads as it won't involve a remote
> call to get the data.
> If you're mostly doing reads and your memory allows (replication is more memory
> consuming) then you should use replication.
> If the amount of data increases or you're doing more writes, distribution is the
> way to go.
>
> Cheers,
> --
> Mircea Markus
> Infinispan lead (www.infinispan.org)
>
>
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev(a)lists.jboss.org
>
https://lists.jboss.org/mailman/listinfo/infinispan-dev