[jboss-jira] [JBoss JIRA] Commented: (JBPM-983) concurrent process execution fails

Wed Sep 5 11:38:12 EDT 2007

    [ http://jira.jboss.com/jira/browse/JBPM-983?page=comments#action_12375409 ] 

Edward Staub commented on JBPM-983:
-----------------------------------

Re "when a ProcessState is completed on a job that started in the subprocess, a collision can occur with another thread concurrently running in the parent process, resulting in a StaleObjectStateException".

Here's more info; let me know if it's not enough.

In ProcessInstance.end(), there's a bit of code that signals the parent process's token (superProcessToken):

      // check if this process was started as a subprocess of a super process
      if (superProcessToken!=null) {
        addCascadeProcessInstance(superProcessToken.getProcessInstance());

        ExecutionContext superExecutionContext = new ExecutionContext(superProcessToken);
        superExecutionContext.setSubProcessInstance(this);
        superProcessToken.signal(superExecutionContext);
      }

Remember, in the failure scenario the subprocess has passed an async node, so it's running on a "new" thread.  If it completes very quickly, the calling token may not yet be saved, resulting in a StaleObjectStateException (SOSException) on the token.
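
As an illustration only (this is not jBPM or Hibernate code), the collision can be replayed in plain Java. The TokenRow class and its version counter are hypothetical stand-ins for a Token row guarded by an optimistic-lock version column, and the sketch assumes that whichever session saves second is the one rejected:

```java
public class StaleTokenDemo {
    /** Hypothetical stand-in for a Token row with a Hibernate-style version column. */
    static final class TokenRow {
        private int version = 0;

        synchronized int version() {
            return version;
        }

        /** Mimics "UPDATE ... WHERE version = ?": succeeds only if nobody else saved first. */
        synchronized boolean save(int versionWhenLoaded) {
            if (version != versionWhenLoaded) {
                return false; // Hibernate would throw StaleObjectStateException here
            }
            version++;
            return true;
        }
    }

    /** Replays the interleaving sequentially; true means the second save was rejected. */
    static boolean secondWriterIsRejected() {
        TokenRow token = new TokenRow();
        int parentLoaded = token.version(); // parent session loads the token, then posts the job
        int jobLoaded = token.version();    // job thread loads the same token before the parent saves
        boolean firstSave = token.save(jobLoaded);     // subprocess ends and signals the token
        boolean secondSave = token.save(parentLoaded); // parent flushes with its old version
        return firstSave && !secondSave;
    }

    public static void main(String[] args) {
        System.out.println("second writer rejected: " + secondWriterIsRejected());
    }
}
```

Running main prints "second writer rejected: true". The interleaving is written out sequentially so the outcome is deterministic; in the real engine it depends on timing between the two threads.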

I don't know of a way, using just test code, to guarantee reproduction of this problem.  A delay must be guaranteed between posting the job (in GraphElement.executeActions()) and saving the Hibernate session.  You might want to temporarily insert a delay into GraphElement.executeActions(), after the messageService.send(job) call.  For me, just turning on enough logging to the console is sufficient.

This is another case that a higher transaction isolation level MAY take care of.  But I'm pretty doubtful; it seems like the code in ProcessInstance.end() must block somehow until the parent token is done executing.  If I understand it correctly, job-posting is not performed as part of the overall JBPM transaction, because it's on a separate Hibernate session.  Correct?

My workaround submits a job to do the signal if the conditions warrant it, and the job token-exclusion mechanisms (made more robust to eliminate SOSExceptions) take care of the rest.  I don't like this much.  Maybe a LOCK_UPGRADE on the Token could be used instead, before the messageService.send() in GraphElement.executeActions()?  I don't know what Hypersonic supports.  We know the thread is about to save the session in this context, so the lock would only be held for a short time.
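
A rough sketch of the locking idea, in plain Java rather than jBPM code: a ReentrantLock stands in for a database row lock on the Token (presumably what Hibernate's LockMode.UPGRADE would request), and the class name, event strings, and 50 ms delay are all made up for illustration. The parent takes the lock before posting the job and releases it only after "saving", so the job cannot signal the token early:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

public class TokenLockSketch {
    /** Returns the order in which the parent save and the job's signal happened. */
    static List<String> run() {
        final List<String> events = Collections.synchronizedList(new ArrayList<String>());
        final ReentrantLock tokenLock = new ReentrantLock(); // stand-in for a row lock on the Token
        ExecutorService jobExecutor = Executors.newSingleThreadExecutor();
        try {
            Future<?> job;
            tokenLock.lock(); // taken before the job is posted
            try {
                job = jobExecutor.submit(() -> {
                    tokenLock.lock(); // the job blocks here until the parent has saved
                    try {
                        events.add("job signalled parent token");
                    } finally {
                        tokenLock.unlock();
                    }
                });
                Thread.sleep(50); // simulated delay between posting the job and saving
                events.add("parent saved");
            } finally {
                tokenLock.unlock(); // released once the parent's save is done
            }
            job.get(); // wait for the job to finish
            return events;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            jobExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

An in-process lock is only an analogy here. A real fix would rely on the database honoring the lock request, which is exactly the open question for Hypersonic.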

This may seem like a bizarre scenario - who would call a subprocess that immediately returns?  Likely cases include those where the subprocess immediately bails out because of a business-logic assertion failure of some kind, or where the subprocess was factored out of the main process just to make it easy to independently modify.

> concurrent process execution fails
> ----------------------------------
>
>                 Key: JBPM-983
>                 URL: http://jira.jboss.com/jira/browse/JBPM-983
>             Project: JBoss jBPM
>          Issue Type: Bug
>          Components: Core Engine
>    Affects Versions: jBPM jPDL 3.2
>         Environment: Hypersonic in-memory database, JobExecutor configured with 5 threads
>            Reporter: Alexander Schlett
>         Assigned To: Tom Baeyens
>            Priority: Critical
>             Fix For: jBPM jPDL 3.2.2
>
>         Attachments: SimpleTest.java, SimpleTest.java, SimpleTest.java, SimpleTest.java, SimpleTest.java
>
>
> concurrent execution of async nodes with multiple JobExecutor threads fails. the effect is:
> 1) job sync within JobExecutor fails due to org.hibernate.StaleObjectStateException
> 2) process gets stuck in join node and never ends
> junit test for this is attached, it's a simple process with just a fork and a join and some scripts inbetween.
