[jboss-jira] [JBoss JIRA] Commented: (JBPM-983) concurrent process execution fails
Edward Staub (JIRA)
jira-events at lists.jboss.org
Wed Sep 5 09:42:14 EDT 2007
[ http://jira.jboss.com/jira/browse/JBPM-983?page=comments#action_12375360 ]
Edward Staub commented on JBPM-983:
-----------------------------------
[ed]-- if a job is saved while other jobs are in process for the same Process Instance, the saved job may run immediately, ignoring the "exclusive" flag
[tom]: that should be added in the documentation as a limitation. see my comment on your proposed fix later in this post
[ed]--I believe this would mean that in general, it's invalid to use async/exclusive nodes anywhere in a child (forked) token if an async/exclusive node has previously been started somewhere.
That's a pretty broad limitation - is it what you meant?
Note that the current behavior results in a timing-dependent, hard-to-reproduce bug.
---------
[ed] when a ProcessState is completed on a job that started in the subprocess, a collision can occur with another thread concurrently running in the parent process, resulting in a StaleObjectStateException
[tom]: i don't really get this problem. and then i don't see why the StaleObjectStateException is a problem. that is to be expected when 2 independent threads operate on the same process instance runtime information item, no?
[ed] The limitation in this case is that it is invalid to use a subprocess on a fork that may complete while active execution is happening on another branch of the fork. Is that what you mean to say? How, in practice, are you thinking that people would be able to use subprocesses on forks with this limitation?
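For readers less familiar with optimistic locking, the collision described above can be illustrated with a minimal sketch. VersionedRow and StaleVersionException are hypothetical stand-ins invented for this example (they are not jBPM or Hibernate classes); the sketch only assumes that each row carries a version number that is checked at commit time, which is how a stale-state failure of this kind typically arises.

```java
// Hypothetical stand-in for a stale-state failure such as Hibernate's StaleObjectStateException.
class StaleVersionException extends RuntimeException {
    StaleVersionException(String message) {
        super(message);
    }
}

// Hypothetical versioned record, standing in for a persisted token or process instance.
class VersionedRow {
    private int version = 0;
    private String state = "forked";

    // A "transaction" remembers the version it read.
    synchronized int readVersion() {
        return version;
    }

    // Commit succeeds only if nobody else committed since the version was read.
    synchronized void commit(int expectedVersion, String newState) {
        if (expectedVersion != version) {
            throw new StaleVersionException(
                "expected version " + expectedVersion + " but found " + version);
        }
        state = newState;
        version++;
    }

    synchronized String state() {
        return state;
    }
}
```

If two workers both read version 0, the first commit bumps the version to 1 and the second commit is rejected. Whether that rejection is acceptable depends on whether the rejected work is retried, which is the point under discussion.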
----------
[ed] when a ProcessState launches a subprocess, and the subprocess immediately reaches a node with "async=exclusive", the node job may be picked up and started before the ProcessInstance object has been committed to the database. This has been noted before on the forum, but I can't find a JIRA item regarding it.
[tom]: isn't this due to your isolation level 0 ?
[ed] Yes. Are you saying that this should be a documented limitation also?
---------
[ed]
To fix these problems, we need a way to synchronize across threads and servers on ProcessInstances and Tokens.
You may want to factor out a "Concurrency Service" ...
[tom]: that is exactly what we don't want to do. we should only offer what we can do by synchronizing on the DB. that should give us enough features to work with. the downside is that some features may depend on isolation levels. that should be documented better imo.
[ed] I may have been unclear. I suspect that you thought I meant some scheme based on network communications (beyond that used by the DB and transaction manager). I didn't. I do, however, think that you should consider use of Java synchronization mechanisms when a single JBPM instance is in control of a database. I was trying to suggest that there might be a concurrency-service interface that is populated by a different class based on whether multiple JBPM instances (multiple servers or EARs) are being used, and on what isolation level is in use. When a single JBPM is in use, simple Java synchronization can be used to reliably support concurrent operations on even the wimpiest of databases. When multiple servers are used, database locks and transaction support can be used. With this kind of factoring, it would then be much easier to throw an exception or otherwise flag scenarios that are not supported in the given configuration, such as those discussed above.
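A rough sketch of the factoring described above, to make the idea concrete. Every name here (ConcurrencyService, LocalConcurrencyService, RejectingConcurrencyService, runExclusive) is hypothetical and invented for illustration; none of it is existing jBPM API. It assumes a single JVM for the local variant and simply flags the unsupported case in the other.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical interface: serializes work per process instance.
interface ConcurrencyService {
    <T> T runExclusive(long processInstanceId, Supplier<T> work);
}

// Variant for a single jBPM instance in one JVM: plain Java locks, one per process instance.
// Note: this sketch never evicts locks, so the map grows with the number of instances seen.
class LocalConcurrencyService implements ConcurrencyService {
    private final ConcurrentMap<Long, ReentrantLock> locks = new ConcurrentHashMap<>();

    @Override
    public <T> T runExclusive(long processInstanceId, Supplier<T> work) {
        ReentrantLock lock =
            locks.computeIfAbsent(processInstanceId, id -> new ReentrantLock());
        lock.lock();
        try {
            return work.get();
        } finally {
            lock.unlock();
        }
    }
}

// Variant for a configuration that cannot guarantee exclusivity:
// it flags the scenario instead of failing in a timing-dependent way.
class RejectingConcurrencyService implements ConcurrencyService {
    @Override
    public <T> T runExclusive(long processInstanceId, Supplier<T> work) {
        throw new UnsupportedOperationException(
            "exclusive execution is not supported in this configuration");
    }
}
```

A multi-server variant based on database locks would implement the same interface; it is omitted here because its details depend on the database and isolation level in use.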
I'm not at all suggesting that optimistic locking should be abandoned. But I do think that there seem to be a few spots that need a bit more to avoid difficult-to-resolve concurrency bugs.
On a tangent... it might be possible to provide some useful (if imperfect) static execution analysis if nodes that are implicitly asynchronous (task-like nodes that "wait" on an external event for completion) were marked as such. This might be implemented as an "isAsync()" method (returning yes/no/sometimes) on an extension of ActionHandler, or through some more general mechanism for ActionHandlers to contain metadata. I mention this mostly as something to think about for Tempranillo's "Action" interface, and because some of the limitations we've talked about require analysis to detect and are difficult to express precisely in human language.
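As a sketch of what such a marker could look like: the names Asynchrony, AsyncAware and BranchAnalysis are hypothetical, and the interface is kept standalone here so the example compiles on its own (in jBPM it would presumably extend ActionHandler). The combining rule in the helper is one possible choice, not an established one.

```java
import java.util.List;

// Hypothetical tri-state answer to "does this node wait on an external event?"
enum Asynchrony { YES, NO, SOMETIMES }

// Hypothetical marker; standalone here, but intended as an extension of ActionHandler.
interface AsyncAware {
    Asynchrony isAsync();
}

// Hypothetical static-analysis helper: can a branch made of these handlers end up waiting?
class BranchAnalysis {
    static Asynchrony mayWait(List<? extends AsyncAware> handlersOnBranch) {
        Asynchrony result = Asynchrony.NO;
        for (AsyncAware handler : handlersOnBranch) {
            Asynchrony a = handler.isAsync();
            if (a == Asynchrony.YES) {
                return Asynchrony.YES;      // one definite wait decides the branch
            }
            if (a == Asynchrony.SOMETIMES) {
                result = Asynchrony.SOMETIMES; // remember the possibility, keep scanning
            }
        }
        return result;
    }
}
```

An analysis pass could then warn about, for example, a subprocess on a fork whose sibling branch reports YES or SOMETIMES.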
> concurrent process execution fails
> ----------------------------------
>
> Key: JBPM-983
> URL: http://jira.jboss.com/jira/browse/JBPM-983
> Project: JBoss jBPM
> Issue Type: Bug
> Components: Core Engine
> Affects Versions: jBPM jPDL 3.2
> Environment: Hypersonic in-memory database, JobExecutor configured with 5 threads
> Reporter: Alexander Schlett
> Assigned To: Tom Baeyens
> Priority: Critical
> Fix For: jBPM jPDL 3.2.2
>
> Attachments: SimpleTest.java, SimpleTest.java, SimpleTest.java, SimpleTest.java, SimpleTest.java
>
>
> concurrent execution of async nodes with multiple JobExecutor threads fails. the effect is:
> 1) job sync within JobExecutor fails due to org.hibernate.StaleObjectStateException
> 2) process gets stuck in join node and never ends
> junit test for this is attached, it's a simple process with just a fork and a join and some scripts in between.