[infinispan-dev] Infinispan xsite perf test demo
Bela Ban
bban at redhat.com
Sat Dec 15 04:54:16 EST 2012
Hi Sanne,
Thanks! This was my first YouTube video, so next time I might venture
into editing it... :-)
The video was initially meant to show the performance impact of adding
2 backup sites, but (because I had to include config info) it
'degenerated' into a tutorial on xsite. I know, bad job on that; we
should come up with a tutorial that covers this in a much more
comprehensive way...
Comments inline
On 12/15/12 2:01 AM, Sanne Grinovero wrote:
> That was very very nice to see.
>
> Assuming you also asked for feedback to improve this as a talk:
>
> 1# you stress several times that reads are going to be local, so very
> fast. I think you meant "local to the site" ? as some ~33% of entries
> will need to be fetched from peers on the same site.
Yes, 'local' always refers to the local site (cluster). The scenario
we're looking at is dist-sync within the site and async xsite repl
between sites. I should also have mentioned that latency within a site
is very small (e.g. 0.05ms) whereas we might have up to 60 *ms* between
sites.
> 2# you aren't actually running this on multiple sites are you?
Correct. This was all within our Boston lab; every node was running on
a different box, though. The ultimate goal is to inject latency into the
system, e.g. using the DELAY protocol in the global cluster.
But as a first step, I wanted to get the base performance for xsite repl.
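The latency injection would amount to adding DELAY to the bridge
(global) stack, along these lines (values are made up; in_delay and
out_delay are in ms, so this would simulate a ~60ms round trip):

    <!-- placed above the transport in the global (bridge) stack -->
    <DELAY in_delay="30" out_delay="30"/>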
> When pointing out the different IP addresses you say something about
> needing them to be different, but I didn't understand if you needed
> them different because they are in different places, or to separate
> otherwise local machines to have them appear as in different places.
The reason to separate mcast_addr/mcast_port was to simulate 3 sites on
the same box. Had I not used different addresses/ports for the 3 sites,
all nodes of the 3 sites would have found each other and formed a
cluster of 9.
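So each site's stack simply used its own address/port pair, something
like (addresses/ports are examples):

    <!-- lon.xml -->
    <UDP mcast_addr="228.8.8.8"  mcast_port="45588" ... />
    <!-- nyc.xml -->
    <UDP mcast_addr="228.8.8.9"  mcast_port="45589" ... />
    <!-- sfo.xml -->
    <UDP mcast_addr="228.8.8.10" mcast_port="45590" ... />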
>
> 3# Since get operations are always local (site), they are as you say
> not meaningful for the benchmark; now since put operations are also
> not meaningful as it's async .. what is the benchmark measuring?
Well, in the meantime I modified the test and now we do support reads;
you can define a read/write ratio.
This scenario mimics how a large prospective xsite customer will use
xsite repl: dist-sync for intra-site and xsite async repl for inter-site
communication. One thing we ran into was a 20% perf degradation *per
backup site* when adding async xsite repl, even if that site was down!
The root cause is that async xsite repl does not mark a site as offline
in Infinispan even when it is down in JGroups. This will get fixed in
Infinispan and should improve async xsite repl performance for down
sites; see [1] for details.
Note that the test can also measure sync xsite replication between
sites; this is just a matter of configuring the cache differently. But
as the scenario this will initially be used for is async xsite repl,
that's what we're focusing on for now.
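The switch is just the strategy attribute on the backup element; a
sketch:

    <!-- async (what we measured): the put returns immediately -->
    <backup site="NYC" strategy="ASYNC"/>
    <!-- sync: the put returns only once NYC has applied the update -->
    <backup site="NYC" strategy="SYNC"/>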
>
> 4# There seems to be some degree of redundancy when explaining
> LON/SFO/NYC setting as the local site vs the backup sites. Wouldn't it
> make more sense to be able to configure all backup sites the same and
> have it automatically ignore the "self" element as a backup site? So
> your script would only need to specify what the local site is. If that
> makes any sense it would even be nice to extend this to the IP
> addresses being defined in the zones area, so that they are applied
> both to the JGroups configuration for the local cluster and to the
> bridge configuration.
Regarding the mcast_addr/mcast_port settings: yes, I could have used
only 1 config file (local.xml) and set these properties as variables.
I've already changed and committed this.
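JGroups substitutes ${property:default} expressions in the XML, so a
single local.xml can be parameterized per site, e.g. (property names
here are examples):

    <UDP mcast_addr="${site.mcast_addr:228.8.8.8}"
         mcast_port="${site.mcast_port:45588}" ... />

Each site is then started with e.g. -Dsite.mcast_addr=228.8.8.9
-Dsite.mcast_port=45589.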
Regarding the setup of the self and backup sites: yes, this could have
been done. Again, this is just a matter of setup and laziness on my
part :-)
>
> 5# I was initially surprised to see x-site configuration as part of a
> cache configuration; I understand the reasons for options like
> "strategy" which one might want to specify differently on each cache,
> but what about "take offline" ?
Taking a site offline/online is currently available via JMX operations.
Taking a site offline also happens automatically, but currently only
when xsite repl is *sync*. There's a JIRA that'll fix this for async
xsite repl.
> that sounds more something which
> should be globally managed at the channel level - not sure if in
> JGroups directly but if it's to be handled in Infinispan I would
> expect to have all caches use the same policy, consistent with FD.
Actually this doesn't use JGroups failure detection, as we can't use it
across sites. This is where the <takeOffline ...> element comes in (as
I mentioned, currently only for sync).
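For reference, the element sits on the backup definition; a sketch
(attribute values are examples):

    <backup site="NYC" strategy="SYNC">
       <!-- take NYC offline after 500 failed backup ops,
            but only after at least 60s have elapsed -->
       <takeOffline afterFailures="500" minTimeToWait="60000"/>
    </backup>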
> Also it doesn't looks like you have much of a choice in to which sites
> you want to replicate, as relay is setup at the jgroups level so
> affecting all caches: is relay going to be ignored by caches having no
> x-site enabled?
Actually, you can define the backupSites per cache, so LON could choose
*not* to replicate to a backup site at all, and NYC could pick only SFO
as a backup site.
Yes, RELAY2 can be ignored on a per-message basis: we have a NO_RELAY
flag that AFAIR Mircea uses to exclude certain messages from getting
relayed.
Note that in the demo I defined xsite repl in the defaults section and
clusteredCache simply used it. I can define an empty <sites/> inside
clusteredCache if I don't want xsite repl for that particular cache. Or
it could be done the other way round: don't define a default xsite
config, but define it per cache that wants xsite repl.
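Roughly (sketch):

    <default>
       <sites>
          <backups>
             <backup site="NYC" strategy="ASYNC"/>
          </backups>
       </sites>
    </default>

    <!-- inherits the xsite config from the defaults -->
    <namedCache name="clusteredCache"/>

    <!-- opts out: the empty element overrides the default -->
    <namedCache name="localOnlyCache">
       <sites/>
    </namedCache>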
> And is it going to be relayed only to one site if the
> Infinispan configuration lists a single site?
Yes
> Not sure if this makes any sense; I just found it contrasting with my
> naive expectations of how such a configuration would look.
>
> thanks a lot, I hope this is proof enough that your video was pretty catchy :)
Thanks for the feedback!
[1] https://issues.jboss.org/browse/JGRP-1543
--
Bela Ban, JGroups lead (http://www.jgroups.org)