[jbpm-dev] [jBPM Development] - Re: Change caching strategy from nonstrict-read-write to tra

the_olo do-not-reply at jboss.com
Tue Sep 15 06:54:19 EDT 2009


"tom.baeyens at jboss.com" wrote : changing caching strategy from nonstrict-read-write  to transactional doesn't seem to be necessary to me.
  | 
  | in jBPM 3, we only cached process DEFINITION data.  which is assumed to be static in the DB.  so the idea is that this can be cached in read only mode.  the reason why we used nonstrict-read-write instead of read-only is to allow for new process definitions to be deployed (read: inserted)

Exactly for this reason we need a write-capable, cluster-wide consistent cache.

We're going to hot-redeploy business processes. Otherwise, what's the point of a whole business process engine, if we could just as well write some logic and deploy a new EAR? (OK, OK: a ready-made framework for keeping the state of long-running processes, the process abstraction it imposes, and other stuff are good to have, but hot deployment is the killer feature for many.)

I've investigated the issue a bit further, and it turns out that the problem lies in a lack of coordination between jBPM-jPDL and jBPM-BPEL development, coupled with a Hibernate bug and the general quirks of the Hibernate 2nd level cache.

The jBPM-BPEL 1.1.GA release embedded jBPM-jPDL version 3.2.2, which kept all of its cache configuration in .hbm.xml files.

The per-class HBM files in jbpm-jpdl.jar, bundled with jBPM-BPEL, had the cache strategy hardcoded to nonstrict-read-write.

This is the caching strategy that works with all 2nd level cache implementations apart from JBoss's TreeCache and JBoss Cache 2 (http://docs.jboss.org/hibernate/stable/core/reference/en/html/performance.html#performance-cache-compat-matrix).

It seems that between versions 3.2.2 and 3.2.3, the jBPM-jPDL people decided to centralize the second level cache configuration: they removed it from the individual HBM files and placed it in the Hibernate config file (config/hibernate.cfg.xml), after the mapping section.

I can't point to a URL for the exact distributed file (you can download the jbpm-jpdl-3.2.3 release and have a look yourself), but an approximate version can be seen here, since it has been embedded in Seam:
http://svn.apache.org/repos/asf/myfaces/tobago/trunk/example/seam/src/main/resources/hibernate.cfg.xml

As you can see, the file contains the mapping declarations, sourced from individual HBM files, then goes on to specify cache settings for the mapped classes and collections.

Now, two things went wrong here:

1) It seems that Alejandro Guizar, when incorporating jbpm-jpdl-3.2.3 into the next release, jbpm-bpel-1.1.1 (https://jira.jboss.org/jira/browse/BPEL-297), didn't notice the caching change. As a result, jbpm-bpel-1.1.1 runs totally cache-less with respect to jPDL entities. We observed a ten-fold performance drop when upgrading from jbpm-bpel-1.1.GA to jbpm-bpel-1.1.1. Process execution speed dropped through the floor.

Only after adding the cache tags back to the HBM files in jbpm-jpdl.jar (one by one, since the HBM files for jPDL 3.2.3 contained some vital modifications unrelated to the 2nd level cache) did we get the engine's performance back to normal.
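A rough way to spot this kind of regression is to list the mapping files in the jar that declare a root class but carry no cache tag at all. The Python sketch below is only an illustration, not something taken from the jBPM distribution: the function name `uncached_mappings` is made up, the check is purely textual, and it assumes the .hbm.xml files are UTF-8 encoded.

```python
import re
import zipfile

# Textual heuristics (assumptions): a root class mapping starts with "<class",
# and a 2nd level cache declaration starts with "<cache".
ROOT_CLASS = re.compile(r"<class\b")
CACHE_TAG = re.compile(r"<cache\b")


def uncached_mappings(jar_path):
    """Return the .hbm.xml entries that map a root class but declare no <cache> element."""
    missing = []
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.endswith(".hbm.xml"):
                continue
            # Assumption: mapping files are UTF-8 encoded.
            text = jar.read(name).decode("utf-8")
            if ROOT_CLASS.search(text) and not CACHE_TAG.search(text):
                missing.append(name)
    return sorted(missing)
```

Calling `uncached_mappings("jbpm-jpdl.jar")` returns the entry names to inspect by hand. A file that is reported may still be cached through a central configuration, so the result is a hint rather than proof.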

One might think that it would be a lot easier to simply copy the centralized cache settings from the jbpm-jpdl-3.2.3 hibernate.cfg.xml to jbpm.hibernate.cfg.xml in jbpm-bpel-1.1.1. Unfortunately it's not that simple, because of the second problem:

2) Due to a Hibernate bug (http://opensource.atlassian.com/projects/hibernate/browse/HHH-2808 - rejected BTW by Hibernate devs), one cannot specify collection cache settings in the Hibernate session factory configuration file.

The elements are there in the DTD (see http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd), but using them on a collection mapped in a subclass simply results in an exception: "org.hibernate.MappingException: Cannot cache an unknown collection". 

So the example configuration distributed with jbpm-jpdl-3.2.3 won't work for jbpm-bpel-1.1.1 anyway, and unless HHH-2808 gets fixed, one has to resort to modifying the jPDL HBM files by hand.

This also makes clustering the business process engine much harder. To guarantee consistency of hot-deployed business processes throughout the cluster, one needs a replicated second level cache, which requires changing the caching strategy from nonstrict-read-write to transactional. Doing this across dozens of .hbm.xml files packed in a .jar is far from perfect.

The method we employed was to unpack the jar, run a find | perl one-liner that did the substitution, then jar it up again.
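The unpack, substitute, re-jar procedure can be sketched roughly as follows. This is not the original find | perl one-liner (which isn't reproduced here), just an illustration in Python. The function names `switch_cache_usage` and `rewrite_jar` are made up, and the sketch assumes the mappings are UTF-8 encoded and declare their cache as `<cache usage="nonstrict-read-write"/>`.

```python
import re
import zipfile

# Matches the usage attribute of a <cache> element set to nonstrict-read-write.
CACHE_USAGE = re.compile(r'(<cache\b[^>]*\busage\s*=\s*")nonstrict-read-write(")')


def switch_cache_usage(xml_text, new_usage="transactional"):
    """Return the mapping text with every nonstrict-read-write <cache> usage replaced."""
    return CACHE_USAGE.sub(lambda m: m.group(1) + new_usage + m.group(2), xml_text)


def rewrite_jar(src_jar, dst_jar, new_usage="transactional"):
    """Copy src_jar to dst_jar, patching .hbm.xml entries; return the names of changed entries."""
    changed = []
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        # Entries are copied in their original order, so the manifest stays where it was.
        for info in src.infolist():
            data = src.read(info)
            if info.filename.endswith(".hbm.xml"):
                # Assumption: mapping files are UTF-8 encoded.
                text = data.decode("utf-8")
                patched = switch_cache_usage(text, new_usage)
                if patched != text:
                    changed.append(info.filename)
                    data = patched.encode("utf-8")
            dst.writestr(info, data)
    return changed
```

A call such as `rewrite_jar("jbpm-jpdl.jar", "jbpm-jpdl-transactional.jar")` (the output name is hypothetical) writes a patched copy and leaves the original jar untouched, which makes it easy to diff the two before deploying.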

"tom.baeyens at jboss.com" wrote : in jBPM 4, the process definitions are cached in memory by jBPM itself.  so there we don't even have hibernate second level cache configurations at all.
  | 
  | we didn't specify 2nd level cache configurations on the runtime data.  not in jBPM 3 and not in jBPM 4.  that is a topic we could explore.  but it doesn't have a priority for us at this time.  in that case i guess the cache must be configured as transactional as the runtime data is not read only (of course)

Based on our testing, we can say that properly caching the definition data helps a lot: it improves performance by an order of magnitude.
Caching the runtime data might help too, but first and foremost, I don't think that definition data can be treated as read-only. The point of a business process engine is to gain flexibility in a world of inevitable change, and that change concerns the business processes themselves.

We cannot treat processes as static data.


View the original post : http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4255256#4255256
