[JBoss Cache] Document updated/added: "WhatShouldWeExpectOfThePojoCachePerformance"

Monday, 22 February 2010

User development,

The document "WhatShouldWeExpectOfThePojoCachePerformance" was updated Feb 22, 2010
by Manik Surtani.

To view the document, visit:
http://community.jboss.org/docs/DOC-12696#cf

Document:
--------------------------------------------------------------
h2. Performance we should expect from PojoCache 
Ben Wang, 05-2006

----
h3. Introduction
PojoCache (formerly called TreeCacheAop) is a key component of JBoss Cache. It is an
in-memory, replicated, and persistent cache system that operates directly on POJOs (Plain
Old Java Objects) in a distributed environment. Because it is object-oriented, it
preserves object relationships during replication (or persistence). In addition, it
performs fine-grained replication transparently: once a POJO is attached to the cache
system, any further update to one of its fields automatically triggers the corresponding
replication.

PojoCache also supports Java annotations. In the upcoming 1.4 release, for example, there
are two additional field-level annotations, @Transient and @Serializable, which provide
options to skip replication of a field or to treat a sub-object as Serializable (while
still maintaining the external object relationship).

Users interested in more details should refer to the JBoss Cache online documentation.
There are also examples that you can run in the JBoss Cache release distribution. For an
interesting usage example, see this article:
http://www.onjava.com/pub/a/onjava/2005/11/09/jboss-pojo-cache.html

In this article, we compare the performance of PojoCache fine-grained replication against
TreeCache (the default plain cache component of JBoss Cache). As mentioned, automatic
field-level replication not only simplifies development (no additional cache.put() is
needed at the end of modifications), it can also increase throughput, depending on the
POJO size. Our objective here is to give the user a clear picture of the performance
characteristics of PojoCache by comparing it to TreeCache.

h3. Performance tester
We have created a performance tester to benchmark JBoss Cache. For those interested, the
script can be checked out from JBossCache cvs under tests/scripts/benchmark.sh. A user can
configure the number of nodes in the cluster, the number of clients on each node, and the
payload size (through a list size, explained below). In addition, there is a switch to use
either TreeCache (the JBoss Cache default) or PojoCache. For PojoCache, there is another
option to specify the frequency of whole-POJO updates (in PojoCache parlance, how often a
new POJO is attached to the cache).

As for the load pattern, the tester resides in the same VM as the cache instance, and each
client repeatedly updates its own distinct fully qualified name (Fqn) without pausing, so
there is no write contention between the clients. This is similar to the pattern of http
session replication using sticky sessions. To minimize the CPU consumed by the load
client itself, we pre-construct the POJOs before the run.
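The load pattern described above can be sketched with plain Java threads. This is only an illustration, assuming a `ConcurrentHashMap` as a stand-in for the cache; the class `LoadPatternSketch`, the method `run`, and the `/client/N` names are hypothetical and are not taken from the benchmark script.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class LoadPatternSketch {
    // Starts one thread per client. Each client writes only to its own key,
    // so no two clients ever update the same entry.
    static Map<String, Integer> run(int clients, final int updatesPerClient) {
        final Map<String, Integer> store = new ConcurrentHashMap<String, Integer>();
        Thread[] workers = new Thread[clients];
        for (int c = 0; c < clients; c++) {
            final String fqn = "/client/" + c; // hypothetical per-client name
            workers[c] = new Thread(() -> {
                for (int i = 0; i < updatesPerClient; i++) {
                    store.put(fqn, i); // stand-in for one cache update
                }
            });
            workers[c].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return store;
    }

    public static void main(String[] args) {
        Map<String, Integer> store = run(4, 1000);
        System.out.println("distinct entries written: " + store.size());
    }
}
```

Because every client owns exactly one entry, the number of entries equals the number of clients, which mirrors the absence of write contention in the test.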

h3. Test load pattern
The test POJOs are listed in the Appendix section. They consist of a Student class
(inherited from the Person class) with an Address and a list of Courses. To vary the
request message size, we have chosen to parameterize the size of the course list.

For the TreeCache test, each iteration of the client loop simply executes:

cache.put(fqn, key, pojo);

where key is a thread id String, and pojo is a pre-constructed Student instance.

For PojoCache tests, depending on the update frequency, we do one of the following within
each client iteration:
1. 
pojocache.putObject(fqn, pojo);

2. 
pojo.getCourses().get(0).setInstructor("Ben Wang");

The first one maps a *new* POJO to the cache system each time. As a result, it is more
expensive. Note that we emphasize a *new* POJO for every subsequent putObject because, if
it is still the same POJO, PojoCache will simply recognize that and return the instance
right away. While this would be a fast operation, it is not our test objective here.

The second one simply triggers a field replication automatically. In terms of the
underlying replication, the field update is equivalent to:

pojocache.put(fqn, key, "Ben Wang");

which should be fairly efficient (about 125 bytes over the wire).
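To get a rough feel for why a single field update is so much cheaper than re-sending a whole object, the following sketch serializes a small stand-in for the Student graph and compares it with the serialized size of one String. The names `PayloadSizeSketch`, `StudentStub`, `CourseStub`, and `serializedSize` are hypothetical, and plain `java.io` serialization is an assumption made for illustration. It is not the wire format JBoss Cache uses, so the byte counts will not match the figures quoted in this article.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class PayloadSizeSketch {
    // Hypothetical stand-ins for the test POJOs; not the benchmark classes.
    static class CourseStub implements Serializable {
        private static final long serialVersionUID = 1L;
        String title = "Introduction to Caching";
        String instructor = "Ben Wang";
        String room = "101";
    }

    static class StudentStub implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "Joe";
        String school = "Some School";
        List<CourseStub> courses = new ArrayList<CourseStub>();

        StudentStub(int listSize) {
            for (int i = 0; i < listSize; i++) {
                courses.add(new CourseStub());
            }
        }
    }

    // Returns the number of bytes produced by plain Java serialization.
    static int serializedSize(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(value);
            out.close();
            return bytes.size();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (int listSize : new int[] {10, 100, 200}) {
            System.out.println("whole object, list size " + listSize + ": "
                    + serializedSize(new StudentStub(listSize)) + " bytes");
        }
        System.out.println("single field value: " + serializedSize("Ben Wang") + " bytes");
    }
}
```

The whole-object size grows with the list size, while the single field value stays constant no matter how large the owning object is.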

h3. Replication message size
To give an idea of the actual message sizes involved, Table 1 lists the list sizes we used
against the actual message size during replication in the TreeCache test.

| List size | Replicated message size (bytes) |
| 10 | 1224 |
| 100 | 8415 |
| 200 | 16341 |
Table 1. List size vs. replicated message size

As we can see, the size ranges from about 1 KB to 16 KB. In addition, a whole-object
update in PojoCache is about twice that size because of the extra metadata and overhead
involved. We plan to reduce this size in a future release.
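As a quick check on how the message size scales, the rows of Table 1 can be used to estimate the bytes added per extra Course. The class and method names below are hypothetical; only the three measured points come from the table.

```java
class MessageSizeEstimate {
    // Measured points from Table 1: {list size, replicated message size in bytes}.
    static final int[][] TABLE_1 = { {10, 1224}, {100, 8415}, {200, 16341} };

    // Average number of bytes added per extra Course between two measured rows.
    static double bytesPerCourse(int[] smaller, int[] larger) {
        return (double) (larger[1] - smaller[1]) / (larger[0] - smaller[0]);
    }

    public static void main(String[] args) {
        System.out.println("10 -> 100:  " + bytesPerCourse(TABLE_1[0], TABLE_1[1]));
        System.out.println("100 -> 200: " + bytesPerCourse(TABLE_1[1], TABLE_1[2]));
    }
}
```

Both intervals work out to roughly 80 bytes per Course, so the TreeCache message size grows close to linearly with the list size.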

h3. Test Environment
We ran the tests using a 4-node cluster. The machines are connected with a Gigabit switch.
Details are listed in Table 2.

| Machine | Intel dual 3.0 GHz CPU with 4 GB RAM |
| OS | Linux |
| JBossCache | 1.4Beta |
Table 2. Testing environment

h3. Result

Typically, we expect a POJO to have a long lifetime in the POJO cache system. The longer
the POJO lifetime, the better the overall throughput is expected to be (because a larger
share of the operations are field updates). To illustrate the impact of the POJO lifetime
on the overall performance, we have chosen to vary the POJO update frequency here.

In addition, we have studied the effect of the replication message size on cache
performance by varying the Course list size in the Student object. We ran cases with list
sizes of 10, 100, and 200 (see Table 1 for the actual replicated sizes).

We have compared the overall throughput and CPU utilization for 4 different cases:
* TreeCache. This is the plain cache with a put of the POJO every time.
* PojoCache 100-0. PojoCache with a different POJO attachment every time. That is, there
is no fine-grained replication. This is 100% POJO update.
* PojoCache 10-90. PojoCache with 1 POJO attachment and 9 field updates. This is 10% POJO
update and 90% field replication.
* PojoCache 5-95. PojoCache with 1 POJO attachment and 19 field updates. This is 5% POJO
update and 95% field replication.
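The three PojoCache mixes above can be pictured as a repeating cycle of operations. The sketch below is one possible scheduling rule, assumed for illustration (attach on the first iteration of each cycle, then field updates); the benchmark script may schedule the operations differently, and all names are hypothetical.

```java
class UpdateMixSketch {
    enum Operation { ATTACH_NEW_POJO, FIELD_UPDATE }

    // Hypothetical rule: the first iteration of every cycle attaches a new POJO,
    // the remaining iterations perform a field update.
    // cycleLength 1 -> 100-0, cycleLength 10 -> 10-90, cycleLength 20 -> 5-95.
    static Operation operationFor(int iteration, int cycleLength) {
        return iteration % cycleLength == 0 ? Operation.ATTACH_NEW_POJO : Operation.FIELD_UPDATE;
    }

    // Counts how many of the first `iterations` operations are POJO attachments.
    static int countAttachments(int iterations, int cycleLength) {
        int attachments = 0;
        for (int i = 0; i < iterations; i++) {
            if (operationFor(i, cycleLength) == Operation.ATTACH_NEW_POJO) {
                attachments++;
            }
        }
        return attachments;
    }

    public static void main(String[] args) {
        for (int cycleLength : new int[] {1, 10, 20}) {
            int attachments = countAttachments(1000, cycleLength);
            System.out.println("cycle length " + cycleLength + ": " + attachments
                    + " attachments, " + (1000 - attachments) + " field updates");
        }
    }
}
```

Every operation in every mix is a write; only the kind of write changes between the cases.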

Figure 1 shows the overall throughput for the 4 different cases. From the figure, we can
see that TreeCache is about 4 times faster than PojoCache 100-0. While we expect every
whole-POJO update in PojoCache to be slower (because it needs to actively map the fields
into the cache system), further optimization is possible in a future release.

In the PojoCache 10-90 case, however, the overall throughput is about 2 times that of
TreeCache. That is, when the POJO update frequency is 10%, at a list size of 100,
PojoCache is about twice as fast as TreeCache. When the POJO update frequency is only 5%
(case PojoCache 5-95), it becomes about 3 times faster than TreeCache (the best case is
3.5 times, for a list size of 200). Clearly, the longer the POJO lifetime, the better
PojoCache performs, as discussed.

In addition, we see that the bigger the list size, the bigger the advantage PojoCache has.
This is expected, since TreeCache has to serialize a bigger message payload over the wire
on every update.
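A back-of-the-envelope estimate of the average bytes replicated per request shows the same trend. The sketch assumes that a whole-POJO update costs exactly twice the TreeCache message size from Table 1 and that a field update costs the 125 bytes quoted earlier; both are approximations taken from this article, and the class and method names are hypothetical. Payload size is only one of several factors, so these numbers are not throughput predictions.

```java
class AveragePayloadSketch {
    static final double FIELD_UPDATE_BYTES = 125.0; // approximate figure quoted in the article
    static final double POJO_ATTACH_FACTOR = 2.0;   // "twice as much", treated as exact here

    // Average bytes per request for a PojoCache mix, where attachFraction is the
    // share of requests that attach a new POJO (0.10 for 10-90, 0.05 for 5-95).
    static double averagePojoCacheBytes(int treeCacheBytes, double attachFraction) {
        double attachBytes = POJO_ATTACH_FACTOR * treeCacheBytes;
        return attachFraction * attachBytes + (1.0 - attachFraction) * FIELD_UPDATE_BYTES;
    }

    public static void main(String[] args) {
        int[][] table1 = { {10, 1224}, {100, 8415}, {200, 16341} };
        for (int[] row : table1) {
            double mix = averagePojoCacheBytes(row[1], 0.10);
            System.out.println("list size " + row[0] + ": TreeCache " + row[1]
                    + " bytes, PojoCache 10-90 about " + mix + " bytes on average");
        }
    }
}
```

Under these assumptions the TreeCache payload is several times the average PojoCache 10-90 payload, and the gap widens as the list grows, because the fixed-size field updates dominate the mix.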

*Note:* some people have been misled by some of the above statistics. All of the tests
perform 100% writes. The 10-90 and 5-95 labels refer to the ratio of attaching POJOs to
modifying POJO fields, NOT to a ratio of writes to reads.

!http://www.jboss.com/cluster/PojoCacheThroughput.png!.

Figure 1. Overall throughput

Figure 2 shows the corresponding CPU utilization for the 4 different cases. As we can see,
CPU utilization is broadly similar for all test cases except PojoCache 100-0, where every
whole-POJO update probably demands more CPU power.

!http://www.jboss.com/cluster/PojoCacheCPU.png!.
Figure 2. CPU utilization

Finally, all of the above tests were run using ASYNCHRONOUS replication mode. To validate
the results with SYNCHRONOUS replication, Table 3 also shows the resulting throughput
for both modes using a list size of 100.

| Case | Asynchronous throughput (req/sec) | Synchronous throughput (req/sec) |
| TreeCache | 972 | 520 |
| PojoCache 100-0 | 230 | 130 |
| PojoCache 10-90 | 1815 | 1060 |
| PojoCache 5-95 | 2897 | 1600 |
Table 3. Throughput comparison for different cache modes with list size of 100.

As we can see, although the overall throughput for synchronous replication is lower than
for asynchronous replication, the trend is about the same. For example, PojoCache 5-95 is
about 3 times as fast as TreeCache in both modes.
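The ratios behind this observation can be recomputed directly from Table 3. The sketch below only divides each throughput by the TreeCache figure for the same mode; the class and method names are hypothetical.

```java
class Table3Ratios {
    // Throughput of a case relative to TreeCache in the same replication mode.
    static double speedup(int caseThroughput, int treeCacheThroughput) {
        return (double) caseThroughput / treeCacheThroughput;
    }

    public static void main(String[] args) {
        String[] cases = { "TreeCache", "PojoCache 100-0", "PojoCache 10-90", "PojoCache 5-95" };
        int[] async = { 972, 230, 1815, 2897 }; // req/sec, from Table 3
        int[] sync = { 520, 130, 1060, 1600 };  // req/sec, from Table 3
        for (int i = 0; i < cases.length; i++) {
            System.out.println(cases[i] + ": async x" + speedup(async[i], async[0])
                    + ", sync x" + speedup(sync[i], sync[0]));
        }
    }
}
```

For PojoCache 5-95 this gives a factor of about 2.98 in asynchronous mode and about 3.08 in synchronous mode, while PojoCache 100-0 runs at roughly a quarter of the TreeCache throughput in both modes.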

h3. Conclusion
In this article, we have compared the performance of PojoCache against the plain TreeCache
(both components are available in JBoss Cache). As expected, without fine-grained
field-level replication, PojoCache is slower than TreeCache. With field replication,
however, PojoCache can be about 3 times faster than TreeCache in the cases shown here. Of
course, besides the performance factor, PojoCache also provides automatic POJO field
replication and the ability to handle POJO object relationships (during replication).

To give additional feedback, please go to
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3943469#... in the
JBoss Cache Forum.

h3. Appendix

h4. Test POJO

/**
 * Person class with PojoCache declaration.
 * Note that PojoCache does not actually require the class to implement Serializable.
 * It is only needed for TreeCache.
 */
@org.jboss.cache.aop.annotation.InstanceOfPojoCacheable
public class Person implements java.io.Serializable
{
   protected String name;
   protected Address address;
}

/**
 * Student class. No need to declare an annotation since it inherits from Person.
 */
public class Student extends Person
{
   protected String school;
   // We will vary this list size to run different replication message sizes.
   protected java.util.List courses = new java.util.ArrayList();
}

/**
 * Address class with PojoCache declaration.
 */
@org.jboss.cache.aop.annotation.PojoCacheable
public class Address implements java.io.Serializable {
   protected String city;
   protected int zip;
   protected String street;
}

/**
 * Course class with PojoCache declaration.
 */
@org.jboss.cache.aop.annotation.PojoCacheable
public class Course  implements java.io.Serializable {
   protected String title;
   protected String instructor;
   protected String room;
}

Note that the POJOs are declared with JDK5.0 annotations. Before we run the PojoCache
tests, we have to run the aopc ant target to prepare the POJOs once. Then during runtime,
there is no special class loader needed.

--------------------------------------------------------------
