Every few weeks or so we hit a serious issue in production running jBPM 5.1: a few threads get stuck forever, each consuming a lot of CPU.
The thread dump looks as follows:
"Dispatcher-Channel-25" daemon prio=10 tid=0x000000005bbfc800 nid=0x7035 runnable [0x00000000455ba000]
   java.lang.Thread.State: RUNNABLE
    at java.util.HashMap.get(HashMap.java:303)
    at org.jbpm.process.instance.event.DefaultSignalManager.addEventListener(DefaultSignalManager.java:53)
    at org.jbpm.workflow.instance.impl.WorkflowProcessInstanceImpl.addEventListener(WorkflowProcessInstanceImpl.java:382)
    at org.jbpm.workflow.instance.node.SubProcessNodeInstance.addProcessListener(SubProcessNodeInstance.java:163)
    at org.jbpm.workflow.instance.node.SubProcessNodeInstance.internalTrigger(SubProcessNodeInstance.java:132)
    at org.jbpm.workflow.instance.impl.NodeInstanceImpl.trigger(NodeInstanceImpl.java:122)
    at org.jbpm.workflow.instance.impl.NodeInstanceImpl.triggerConnection(NodeInstanceImpl.java:185)
    at org.jbpm.workflow.instance.impl.NodeInstanceImpl.triggerCompleted(NodeInstanceImpl.java:150)
    at org.jbpm.workflow.instance.node.SplitInstance.internalTrigger(SplitInstance.java:61)
    at org.jbpm.workflow.instance.impl.NodeInstanceImpl.trigger(NodeInstanceImpl.java:122)
    at org.jbpm.workflow.instance.impl.NodeInstanceImpl.triggerConnection(NodeInstanceImpl.java:185)
    at org.jbpm.workflow.instance.impl.NodeInstanceImpl.triggerCompleted(NodeInstanceImpl.java:150)
    at org.jbpm.workflow.instance.node.StartNodeInstance.triggerCompleted(StartNodeInstance.java:49)
    at org.jbpm.workflow.instance.node.StartNodeInstance.internalTrigger(StartNodeInstance.java:41)
    at org.jbpm.workflow.instance.impl.NodeInstanceImpl.trigger(NodeInstanceImpl.java:122)
    at org.jbpm.ruleflow.instance.RuleFlowProcessInstance.internalStart(RuleFlowProcessInstance.java:35)
    at org.jbpm.process.instance.impl.ProcessInstanceImpl.start(ProcessInstanceImpl.java:188)
    - locked <0x00002aaad8bf9ec8> (a org.jbpm.ruleflow.instance.RuleFlowProcessInstance)
    at org.jbpm.workflow.instance.impl.WorkflowProcessInstanceImpl.start(WorkflowProcessInstanceImpl.java:302)
    - locked <0x00002aaad8bf9ec8> (a org.jbpm.ruleflow.instance.RuleFlowProcessInstance)
    at org.jbpm.process.instance.ProcessRuntimeImpl.startProcessInstance(ProcessRuntimeImpl.java:154)
    at org.jbpm.process.instance.ProcessRuntimeImpl.startProcess(ProcessRuntimeImpl.java:124)
    at org.drools.common.AbstractWorkingMemory.startProcess(AbstractWorkingMemory.java:1095)
    at org.drools.impl.StatefulKnowledgeSessionImpl.startProcess(StatefulKnowledgeSessionImpl.java:306)
    at com.jpm.wss.gfs.wq.bpm.jbpm5.Jbpm5SessionImpl.startProcess(Jbpm5SessionImpl.java:532)
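Typically three threads are affected. Here is the most recent output of the UNIX top command, showing the three threads at very high CPU (the first entry, 28725, is the thread from the dump above: nid=0x7035 is 28725 in decimal):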
28725 a_tyger0 25 0 3638m 1.9g 10m R 98.5 24.0 27:20.43 /usr/java/bin/java -Djava.util.logging.config.file=/app/tyger/fcHome/deployment/bpm_201_01/conf/
28730 a_tyger0 25 0 3638m 1.9g 10m R 98.5 24.0 27:07.01 /usr/java/bin/java -Djava.util.logging.config.file=/app/tyger/fcHome/deployment/bpm_201_01/conf/
28726 a_tyger0 25 0 3638m 1.9g 10m R 98.1 24.0 30:29.38 /usr/java/bin/java -Djava.util.logging.config.file=/app/tyger/fcHome/deployment/bpm_201_01/conf/
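For context, a RUNNABLE thread that never leaves HashMap.get is the usual symptom of a plain java.util.HashMap being modified by several threads without synchronization: on JDK 6/7 a concurrent resize can turn a bucket chain into a cycle, and get() then loops forever. Whether the map inside DefaultSignalManager is really shared between threads in our setup is an assumption; the dump alone does not prove it. A minimal sketch of a listener registry that tolerates concurrent registration follows. SafeListenerRegistry, its methods, and the "example-event" key are hypothetical names for illustration, not jBPM API:

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical illustration only; this is not the jBPM DefaultSignalManager.
final class SafeListenerRegistry {

    // ConcurrentHashMap tolerates concurrent get/put; a plain HashMap does not.
    private final ConcurrentMap<String, List<Runnable>> listeners =
            new ConcurrentHashMap<String, List<Runnable>>();

    void addEventListener(String type, Runnable listener) {
        List<Runnable> forType = listeners.get(type);
        if (forType == null) {
            List<Runnable> fresh = new CopyOnWriteArrayList<Runnable>();
            // putIfAbsent returns the existing list if another thread won the race.
            forType = listeners.putIfAbsent(type, fresh);
            if (forType == null) {
                forType = fresh;
            }
        }
        forType.add(listener);
    }

    int listenerCount(String type) {
        List<Runnable> forType = listeners.get(type);
        return forType == null ? 0 : forType.size();
    }

    // Registers perThread listeners from each of `threads` threads on one key
    // and returns the resulting listener count.
    static int registerConcurrently(int threads, final int perThread) {
        final SafeListenerRegistry registry = new SafeListenerRegistry();
        final Runnable noop = new Runnable() { public void run() { } };
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < perThread; j++) {
                        registry.addEventListener("example-event", noop);
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return registry.listenerCount("example-event");
    }

    public static void main(String[] args) {
        // Four threads registering 1000 listeners each on the same key.
        System.out.println(registerConcurrently(4, 1000));
    }
}
```

The sketch avoids Java 8 APIs on purpose, since the HashMap.java line number in the dump suggests an older JDK. If the root cause turns out to be a knowledge session shared across dispatcher threads, serializing access to the session would be the other obvious direction to look at.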
The only way to clear the condition is to restart the JVM. This is a very serious issue in our environment, and it calls into question the whole idea of using jBPM5 for a critical production application.
Any insight will be deeply appreciated.