[rules-users] Strange intermittent problem with Drools Flow

Dan Nathanson dan at ddnconsulting.com
Mon Apr 11 19:37:33 EDT 2011


Hi Mauricio,

This is now happening intermittently for another user and a couple of
times on our build machine.  I have never seen it happen on my
machine.  This will be a show-stopper for us if we cannot figure out
the cause.  Our business processes will stall: after one user completes
their task, the next user is never notified that they have a task,
and no errors are reported.
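
Since the stall is silent, one way to at least surface it is a small watchdog that notes when a work item completion is requested and clears the note once the flow is seen to move on. The following is only a minimal sketch under stated assumptions: StallWatchdog and its methods are hypothetical names, it is not part of Drools, and it does not address the underlying cause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper (not a Drools class): tracks process instances whose
 * work item completion was requested but for which no progress was seen.
 */
class StallWatchdog {
    // process instance id -> time (ms) at which completion was requested
    private final Map<Long, Long> pendingSince = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    StallWatchdog(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Call just before asking the engine to complete a work item. */
    void completionRequested(long processInstanceId, long nowMillis) {
        pendingSince.put(processInstanceId, nowMillis);
    }

    /** Call when a listener sees a node being left or the process ending. */
    void progressObserved(long processInstanceId) {
        pendingSince.remove(processInstanceId);
    }

    /** Returns ids of process instances with no progress within the timeout. */
    List<Long> findStalled(long nowMillis) {
        List<Long> stalled = new ArrayList<>();
        for (Map.Entry<Long, Long> entry : pendingSince.entrySet()) {
            if (nowMillis - entry.getValue() >= timeoutMillis) {
                stalled.add(entry.getKey());
            }
        }
        return stalled;
    }
}
```

In this sketch, completionRequested would be called before the work item is completed, progressObserved from whatever process event listener already logs node transitions, and findStalled from a periodic job that raises an alert for any id it returns.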

Regards,

Dan Nathanson




On Mon, Apr 4, 2011 at 9:27 PM, Dan Nathanson <dan at ddnconsulting.com> wrote:
> Hi Mauricio,
>
> Were you able to determine anything from the info I sent you?
>
> Regards,
>
> Dan Nathanson
>
> On Fri, Apr 1, 2011 at 11:03 AM, Dan Nathanson <dan at ddnconsulting.com> wrote:
>> Hi Mauricio,
>>
>> Thanks for looking into this. These are the types of errors that scare
>> me.  They only happen in one environment and I cannot reproduce them.
>> How will I know if it happens in production?
>>
>> We generate our rule flows dynamically from our own concept of a
>> process flow.  I don't have a custom work item implementation, but I
>> do have a custom WorkItemHandler registered against work items of type
>> "Step" called StepWorkItemHandler. StepWorkItemHandler uses an
>> injected CallbackHandler to do the actual processing.  In test code, the
>> injected callback handler just records what steps (work items) have
>> been activated.  In production code, the injected handler sends JMS
>> messages to notify assigned users that work needs to be completed.
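
The handler arrangement described above can be sketched roughly as follows. This is an illustration only, under stated assumptions: the real StepWorkItemHandler would implement Drools' WorkItemHandler interface and receive a work item from the engine, whereas this self-contained version takes the work item id and name directly. The stepActivated method and the class names ending in "Sketch" or starting with "Recording" are hypothetical, since the actual CallbackHandler contract is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback contract; the real CallbackHandler is not shown in the thread. */
interface CallbackHandler {
    void stepActivated(long workItemId, String stepName);
}

/** Test-side callback: only records which steps (work items) were activated. */
class RecordingCallbackHandler implements CallbackHandler {
    final List<String> activatedSteps = new ArrayList<>();

    @Override
    public void stepActivated(long workItemId, String stepName) {
        activatedSteps.add(stepName);
    }
}

/**
 * Simplified stand-in for StepWorkItemHandler. It delegates to the injected
 * callback and returns without completing the work item, because completion
 * happens later, once the assigned user has done the work.
 */
class StepWorkItemHandlerSketch {
    private final CallbackHandler callback;

    StepWorkItemHandlerSketch(CallbackHandler callback) {
        this.callback = callback;
    }

    /** Plays the role of the engine calling the handler for a "Step" work item. */
    void executeStep(long workItemId, String stepName) {
        callback.stepActivated(workItemId, stepName);
    }
}
```

A production callback would replace RecordingCallbackHandler with one that sends the JMS notification, leaving the handler itself unchanged.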
>>
>> Below is an example of a simple generated flow that fails
>> intermittently for one developer here.  I captured it using the
>> XmlRuleFlowProcessDumper.
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <process xmlns="http://drools.org/drools-5.0/process"
>>         xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
>>         xs:schemaLocation="http://drools.org/drools-5.0/process drools-processes-5.0.xsd"
>>         type="RuleFlow" name="flow-1"
>>         id="bd5556f7_581d_422f_a382_3bb3ac041c4f"
>>         package-name="com.xxx.process.manager" >
>>
>>  <header>
>>    <imports>
>>      <import name="org.drools.ruleflow.instance.RuleFlowProcessInstance" />
>>    </imports>
>>  </header>
>>
>>  <nodes>
>>    <start id="1" name="flow-1 Start" />
>>    <join id="2" name="step-1 Reject Join" type="2" />
>>    <workItem id="3" name="step-1 Step" >
>>      <work name="Step" >
>>      </work>
>>    </workItem>
>>    <workItem id="4" name="step-1 Approval" >
>>      <work name="Step" >
>>      </work>
>>    </workItem>
>>    <split id="5" name="step-1 Approval Split" type="2" >
>>      <constraints>
>>        <constraint toNodeId="2" toType="DROOLS_DEFAULT" name="rejected" priority="110" type="code" dialect="java" >return true;</constraint>
>>        <constraint toNodeId="6" toType="DROOLS_DEFAULT" name="approved" priority="100" type="rule" dialect="mvel" >StepApproval( approved == true, sliceValue == "NO_SLICE", approvalStepNodeId == "1a8a5a64-1e32-46a9-8b04-806c1706a97a" )</constraint>
>>      </constraints>
>>    </split>
>>    <state id="6" name="null Wait" >
>>      <onEntry>
>>        <action type="expression" dialect="java" >Terminator.terminateFlow(context);</action>
>>      </onEntry>
>>      <constraints>
>>        <constraint toNodeId="7" >Terminator()</constraint>
>>      </constraints>
>>    </state>
>>    <end id="7" name="null End" />
>>  </nodes>
>>
>>  <connections>
>>    <connection from="1" to="2" />
>>    <connection from="5" to="2" />
>>    <connection from="2" to="3" />
>>    <connection from="3" to="4" />
>>    <connection from="4" to="5" />
>>    <connection from="5" to="6" />
>>    <connection from="6" to="7" />
>>  </connections>
>>
>> </process>
>>
>>
>>
>> I'm attaching the log at trace level (there are no WARN level messages
>> at all) that shows this flow failing to complete.  Lines of interest:
>> 631: Generating the flow from our internal model
>> 634: starting the flow
>> 785: Process event listener shows that flow has started and proceeded
>> to work item 1 ("step-1 Step")
>> 802: StepWorkItemHandler called for work item 1 "step-1 Step"
>> 803: Process event listener shows "unwinding" of activations
>> 904: Work item 1 "step-1 Step" completed
>> 921: work item loaded
>> 939: process instance loaded
>> 956: Process event listener shows leaving "step-1 Step", activating
>> work item 2 "step-1 Approval"
>> 969: StepWorkItemHandler called for work item 2 "step-1 Approval"
>> 970: Process event listener shows "unwinding" of activations
>> 1024: work item 1 deleted
>> 1032: Work item 2 "step-1 Approval" completed
>> 1049: work item loaded
>> 1067: process instance loaded
>> ---Should now see leaving "step-1 Approval" and completing flow, but
>> don't see any further progress in the flow.---
>> 1141: work item 2 deleted
>>
>> The execution thread is part of the log output and it seems that
>> everything is done in one thread.
>>
>> I've also attached a log file (without verbose hibernate output) for a
>> successful test run.
>>
>> Regards,
>>
>> Dan Nathanson
>>
>>
>>
>>
>> On Fri, Apr 1, 2011 at 2:42 AM, Mauricio Salatino <salaboy at gmail.com> wrote:
>>> Hi Dan,
>>> That's strange; it could be related to an outdated version of the fluent
>>> API.
>>> Send us the logs with warn verbosity and we will definitely take a look at
>>> them.
>>> If the workflow doesn't continue after a work item completes, it could be
>>> related to some kind of threading problem.
>>> What kind of work item are you implementing? Can you share a test with us
>>> that shows the desired behavior?
>>> Greetings.
>>>
>>> On Thu, Mar 31, 2011 at 4:32 PM, Dan Nathanson <dan at ddnconsulting.com>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm seeing some odd behavior in some of my test cases, and it only
>>>> seems to happen to one guy. He's done fresh checkouts of the code,
>>>> blown away his local M2 repository and verified that installed software
>>>> like the OS and Java is the same as everyone else's.
>>>>
>>>> I have some test cases that build up some simple flows programmatically
>>>> using the fluent API.  Very simple (start --> work item --> work item -->
>>>> state --> end).  I am using Drools Flow 5.1.1 with JPA (in-memory H2
>>>> DB for unit tests).  Intermittently, after completing a work item, the
>>>> flow doesn't continue.  Logging in a process event listener shows that
>>>> the work item node is never left, although I can see in the logs
>>>> that the work item is deleted from the DB.
>>>>
>>>> There are no error, warning or info level messages coming out of
>>>> Drools or Hibernate prior to the failure.
>>>>
>>>> It only happens to one guy, but he can reproduce the problem
>>>> regularly, although it moves around in different test cases and
>>>> different points in the flows.
>>>>
>>>> Anyone ever seen this behavior before?  Any possible explanations?
>>>>
>>>> I'd attach the log file, but it is huge since I've got hibernate
>>>> logging set very verbose.
>>>>
>>>> Regards,
>>>>
>>>> Dan Nathanson
>>>
>>>
>>>
>>> --
>>>  - CTO @ http://www.plugtree.com
>>>  - MyJourney @ http://salaboy.wordpress.com
>>>  - Co-Founder @ http://www.jbug.com.ar
>>>
>>>  - Salatino "Salaboy" Mauricio -
>>>
>>
>



