jBPM Persistence race condition in EsbActionHandler under high load
-------------------------------------------------------------------
Key: JBESB-2204
URL: https://jira.jboss.org/jira/browse/JBESB-2204
Project: JBoss ESB
Issue Type: Bug
Security Level: Public (Everyone can see)
Components: Process flow
Affects Versions: 4.4
Environment: AS 4.2.3.GA, jBPM 3.2.3.GA + ehcache, RHEL 5, x86_64, 1.6.0 Sun JVM,
Oracle 10g
Reporter: Damon Brown
Priority: Minor
Attachments: CallbackCommand.java.patch
Under extremely high load, org.jboss.soa.esb.services.jbpm.cmd.CallbackCommand (line 90)
cannot always locate the Token to signal for the target process. The resulting exception is
not propagated to the CallbackQueue, so the message is not retried. The target process is
left in a suspended state even though the ESB service has completed its action. The
failure rate depends on machine load and ranges from 1% to 5%.
The missing token cannot be attributed directly to the jBPM Hibernate session. I extended
org.jboss.soa.esb.services.jbpm.actionhandlers.EsbActionHandler to forcibly flush the
current state of the token to the database. Within the same action, I then queried for the
state of the object, which was returned successfully.
There appears to be a race condition among ESB service invocation (EsbActionHandler), token
persistence (Hibernate flush), and the command callback (CallbackCommand). The particular ESB
service I am invoking completes quickly, so the process is suspended only briefly. Other ESB
services, which take longer to execute, do not exhibit this behavior. For the short-suspense
service, the failure was intermittent but reproducible.
The patch file attached to this issue attempts to resolve the issue by retrying the token
look-up up to three times, sleeping between attempts. Within my environment, under high load
conditions, the token failed to resolve on the first try 1% to 5% of the time (98% average
success on the first try) and always resolved on the second try. With the patch, the target
process instance is signaled successfully.
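
For readers without access to the attachment, the retry approach described above can be
sketched roughly as follows. This is not the contents of CallbackCommand.java.patch; it is a
minimal, library-independent illustration. The class TokenLookupRetry, the Lookup interface,
and the parameter names are hypothetical, and the sketch assumes the look-up returns null
when the token is not yet visible. It also throws once all attempts are exhausted, so that a
caller could propagate the failure instead of silently dropping the callback.

```java
/** Hypothetical helper for illustration; not part of JBoss ESB or jBPM. */
final class TokenLookupRetry {

    /** Hypothetical look-up abstraction: returns null while the token is not yet visible. */
    interface Lookup<T> {
        T find();
    }

    private TokenLookupRetry() {
    }

    /**
     * Calls the look-up up to maxAttempts times, sleeping sleepMillis between attempts,
     * and returns the first non-null result. Throws IllegalStateException if every
     * attempt returns null, so the caller can surface the failure.
     */
    static <T> T findWithRetry(Lookup<T> lookup, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = lookup.find();
            if (result != null) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    // Give the writing transaction time to commit before trying again.
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while retrying token look-up", e);
                }
            }
        }
        throw new IllegalStateException("Token not found after " + maxAttempts + " attempts");
    }
}
```

In a callback, the look-up would wrap whatever call loads the token by id, with three
attempts and a short sleep matching the figures reported above. A retry of this kind masks
the race rather than removing it; the ordering between the Hibernate flush and the callback
would still need to be addressed separately.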