Hi all,
I have a fairly complex issue regarding the JMX timer and I could use some help. Based on
the architecture of the Simple Schedule Provider, I developed a job scheduling system for
academic purposes.
This system consists of:
- an MBean provider which listens to a session EJB for registering job schedules at
runtime.
- a manager which contains the JMX Timer for triggering those schedules
- a listener class whose instances catch the triggering events.
- a JMS queue where we put the objects describing the logic of each schedule in order to
execute it.
- an MDB whose instances consume the objects in the queue and execute their corresponding
logic.
When the timer fires a schedule, the listener catches the event and places an object
message into the queue. An MDB instance is then used by the container to consume the
message by executing a predefined interface method on the enclosed object.
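The timer/listener half of this flow can be sketched with the JDK's own JMX timer service. The class name and notification type below are hypothetical, and the JMS send is reduced to a latch count-down just to keep the sketch self-contained:

```java
import java.util.Date;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.timer.Timer;

public class TimerSketch {

    // Registers a repeating notification on a JMX Timer and waits for n firings.
    static boolean awaitFirings(int n, long timeoutMs) throws Exception {
        Timer timer = new Timer();
        CountDownLatch fired = new CountDownLatch(n);

        // In the real system, this is where the listener would wrap the job's
        // payload in a JMS ObjectMessage and send it to the queue.
        NotificationListener listener = (Notification notif, Object handback) ->
                fired.countDown();
        timer.addNotificationListener(listener, null, null);

        // Hypothetical type/message; first firing shortly after start,
        // then every 100 ms.
        timer.addNotification("job.schedule", "demo job", null,
                new Date(System.currentTimeMillis() + 50), 100);
        timer.start();
        try {
            return fired.await(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            timer.stop();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(awaitFirings(3, 5000)
                ? "received 3 firings" : "timed out");
    }
}
```

The production listener would of course do more work per firing (serializing the schedule object, obtaining a JMS session), which is exactly the path where per-firing overhead adds up under load.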
This system makes use of JBossCache 1.4 (TreeCache) for persistence because it is supposed
to be much faster than a traditional JDBC storage engine. Every time a message object is
placed in the queue, this fact is stored in the cache, recording that a repetition of that
job is under execution.
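The TreeCache calls themselves aren't reproduced here; as a rough, hypothetical sketch of that bookkeeping (with a plain ConcurrentHashMap standing in for the replicated cache node), the idea is:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class RepetitionTracker {
    // Stand-in for the replicated cache node: jobId -> enqueue timestamp.
    private final ConcurrentMap<String, Long> inFlight = new ConcurrentHashMap<>();

    // Called when the listener places the job's message in the queue.
    void markEnqueued(String jobId) {
        inFlight.put(jobId, System.currentTimeMillis());
    }

    // Called by the consumer once the repetition finishes executing.
    void markCompleted(String jobId) {
        inFlight.remove(jobId);
    }

    boolean isUnderExecution(String jobId) {
        return inFlight.containsKey(jobId);
    }

    public static void main(String[] args) {
        RepetitionTracker tracker = new RepetitionTracker();
        tracker.markEnqueued("job-42");
        System.out.println(tracker.isUnderExecution("job-42")); // true
        tracker.markCompleted("job-42");
        System.out.println(tracker.isUnderExecution("job-42")); // false
    }
}
```

In the real system each put/remove goes to the TreeCache, so under synchronous replication every enqueue also pays a cluster round-trip.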
Our scheduling system can also work in a clustered JBoss environment, where the Provider,
Manager (including the JMX Timer) and JMS queue act as singletons on a single node, while
MDB instances on every node of the cluster are ready to consume messages pulled from the
JMS queue through the cluster-wide JNDI. The cache also works in synchronous replication
mode. In this way we want to balance the load that comes from processing the actual logic
of the various scheduled jobs.
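To illustrate the consumption side, here is a minimal in-JVM stand-in for the queue-plus-MDB pattern: a BlockingQueue replaces the JMS queue, plain worker threads replace the container-managed MDB instances, and SchedulableJob is a hypothetical name for the predefined interface mentioned above:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class QueueSketch {

    // Hypothetical contract: each queued object carries its own logic,
    // mirroring the predefined interface method the MDB invokes.
    interface SchedulableJob {
        void execute();
    }

    // Enqueues n jobs and waits until workers have executed them all.
    static boolean runJobs(int n) throws InterruptedException {
        BlockingQueue<SchedulableJob> queue = new ArrayBlockingQueue<>(100);
        CountDownLatch done = new CountDownLatch(n);

        // Two workers stand in for container-managed MDB instances pulling
        // from the JMS queue via the cluster-wide JNDI.
        Runnable worker = () -> {
            try {
                while (true) {
                    queue.take().execute(); // take() blocks, like onMessage delivery
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int w = 0; w < 2; w++) {
            Thread t = new Thread(worker);
            t.setDaemon(true);
            t.start();
        }

        // The timer-listener side: place job objects on the queue.
        for (int i = 0; i < n; i++) {
            queue.put(done::countDown);
        }
        return done.await(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runJobs(3) ? "all jobs executed" : "timed out");
    }
}
```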
I won't go into other issues such as fail-over because they are irrelevant to my current
subject, but whoever wants more info on this can mail me.
Anyway, this system works just fine under normal conditions (delays lower than 100 ms).
However, when we started stress testing it (more than 200-300 job executions per second),
we noticed several strange issues.
I will demonstrate them through an actual example. Our jobs were simple Java classes
inserting one row per execution into an in-memory JDBC table (MySQL). Each row consists of
timestamps (System.currentTimeMillis()) taken at certain points of the execution process,
starting at the timer trigger firing and ending at the time the actual job started
running.
Well, with this data we noticed that delays of several seconds (up to one minute) occurred
solely because of the JMX Timer.
row example:
shouldFireAt: 100000 ms
FiredAt: 120000 ms
CachedAt: 120010 ms
MDB consumed at: 120050 ms
Started executing: 120051 ms
As far as we know, java.util.Timer can sustain far more than 200-300 task executions per
second; can't the JMX timer do the same? Do these JMX Timer delays make sense to you? Keep
in mind that CPU and memory usage never exceeded 70%.
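As a rough sanity check of the java.util.Timer half of that claim (not the JMX timer itself), one can count fixed-rate firings at roughly 300 per second; the class name and period below are arbitrary:

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

public class TimerRateCheck {

    // Counts fixed-rate task executions within the given window.
    static int countExecutions(long periodMs, long windowMs) throws InterruptedException {
        Timer timer = new Timer(true); // daemon timer thread
        AtomicInteger count = new AtomicInteger();
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override public void run() { count.incrementAndGet(); }
        }, 0, periodMs);
        Thread.sleep(windowMs);
        timer.cancel();
        return count.get();
    }

    public static void main(String[] args) throws Exception {
        // A 3 ms period targets ~333 executions per second for one second.
        int n = countExecutions(3, 1000);
        System.out.println(n + " executions in 1000 ms");
    }
}
```

Note that this only measures an empty task on a single timer thread; any real per-firing work (like the JMS send in the listener) serializes behind it and lowers the achievable rate.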
An even stranger issue is that when working in a cluster, the delays under the same stress
tests go 300% higher! And still the only cause is the JMX timer, not the cache's
synchronous replication, which is absolutely crazy. How can that be? The timer is used as
a singleton and has no knowledge that it is running in a cluster. Shouldn't I see exactly
the same FiredAt - shouldFireAt delays as in a single-node environment?
I am ready to use JProfiler to find some answers, but I am pretty sure that it is the JMX
Timer's poor performance that degrades the whole system.
I could use some advice here.
Thanks in advance.
TheHunt.
View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3999437#...
Reply to the post :
http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&a...