JBoss Community

JobExecutor becomes broken and cannot recover (JBPM3.2.2)

created by Petr Sochurek in jBPM - View the full discussion

Problem: JobExecutor thread becomes broken due to unknown transactional problem. As a reason no tasks are executed and only JBoss restart helps.

 

Configuration and environment
- components stack: JBOSS 4.2.3, JBOSS TS 4.2.3, JBPM 3.2.2, Hibernate 3.2.4.sp1.cp04
- database server: MySQL 5.1.47
- distributed transaction only within the one database server
- operated on several Linux environments

 

 

The problem occurs irregurarly (once per several days and only on some environments). When it happens following errors are logged by JBPMExecutor thread:

2010-12-15 15:53:20,566 WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N [com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator_2

TwoPhaseCoordinator.beforeCompletion - failed for nulljava.lang.NullPointerExceptionat org.hibernate.connection.DatasourceConnectionProvider.closeConnection(DatasourceConnectionProvider.java:74)at org.hibernate.jdbc.ConnectionManager.closeConnection(ConnectionManager.java:451)at org.hibernate.jdbc.ConnectionManager.aggressiveRelease(ConnectionManager.java:406)at org.hibernate.jdbc.ConnectionManager.afterStatement(ConnectionManager.java:281)at org.hibernate.jdbc.ConnectionManager.flushEnding(ConnectionManager.java:480)at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:305)at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:27)at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1000)at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:338)at org.hibernate.transaction.CacheSynchronization.beforeCompletion(CacheSynchronization.java:59)at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:114)at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:247)at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1389)at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:140)at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:80)at org.jbpm.persistence.jta.JtaDbPersistenceService.close(JtaDbPersistenceService.java:41)at org.jbpm.svc.Services.close(Services.java:225)at org.jbpm.JbpmContext.close(JbpmContext.java:139)at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:144)at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)2010-12-15 15:53:20,566 WARN  com.arjuna.ats.arjuna.logging.arjLoggerI18N com.arjuna.ats.arjuna.coordinator.CheckedAction_2 - CheckedAction::check - atomic action

a471f1f:c46e:4d08ce0f:c5a aborting with 1 threads active!2010-12-15 15:53:20,582 ERROR org.jbpm.svc.Services problem closing service 'persistence'org.jbpm.JbpmException: couldn't commit JTA transactionat org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:82)at org.jbpm.persistence.jta.JtaDbPersistenceService.close(JtaDbPersistenceService.java:41)at org.jbpm.svc.Services.close(Services.java:225)at org.jbpm.JbpmContext.close(JbpmContext.java:139)at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:144)at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)Caused by: java.lang.NullPointerExceptionat com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:235)at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1389)at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:140)at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:80)... 5 more2010-12-15 15:53:20,589 ERROR org.jbpm.job.executor.JobExecutorThread exception in job executor thread. waiting 5000 millisecondsorg.jbpm.JbpmException: problem closing services {persistence=org.jbpm.JbpmException: couldn't commit JTA transaction}at org.jbpm.svc.Services.close(Services.java:245)at org.jbpm.JbpmContext.close(JbpmContext.java:139)at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:144)at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)Caused by: org.jbpm.JbpmException: couldn't commit JTA transactionat org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:82)at org.jbpm.persistence.jta.JtaDbPersistenceService.close(JtaDbPersistenceService.java:41)at org.jbpm.svc.Services.close(Services.java:225)... 3 moreCaused by: java.lang.NullPointerExceptionat com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:235)at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:86)at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1389)at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:135)at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:87)at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:140)at org.jbpm.persistence.jta.JtaDbPersistenceService.endJtaTransaction(JtaDbPersistenceService.java:80)

... 5 more

 

 

From now all the successive iterations of JobExecutor always end up with following error (and the waiting time doubles after error):

2010-12-15 15:53:23,524 ERROR org.jbpm.job.executor.JobExecutorThread exception in job executor thread. waiting 10000 milliseconds

org.jbpm.JbpmException: couldn't start JTA transaction

at org.jbpm.persistence.jta.JtaDbPersistenceService.beginJtaTransaction(JtaDbPersistenceService.java:55)

at org.jbpm.persistence.jta.JtaDbPersistenceService.(JtaDbPersistenceService.java:28)

at org.jbpm.persistence.jta.JtaDbPersistenceServiceFactory.openService(JtaDbPersistenceServiceFactory.java:17)

at org.jbpm.svc.Services.getService(Services.java:144)

at org.jbpm.svc.Services.getPersistenceService(Services.java:183)

at org.jbpm.JbpmContext.getPersistenceService(JbpmContext.java:628)

at org.jbpm.JbpmContext.getJobSession(JbpmContext.java:561)

at org.jbpm.job.executor.JobExecutorThread.acquireJobs(JobExecutorThread.java:112)

at org.jbpm.job.executor.JobExecutorThread.run(JobExecutorThread.java:58)

Caused by: javax.transaction.NotSupportedException

at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.begin(BaseTransaction.java:79)

at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.begin(BaseTransactionManagerDelegate.java:77)

at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.begin(ServerVMClientUserTransaction.java:124)

at org.jbpm.persistence.jta.JtaDbPersistenceService.beginJtaTransaction(JtaDbPersistenceService.java:53)

... 8 more

 

 

I suppose that the root cause is some database connection problem or network problem (maybe firewall). The problem occurs irregurarly only on production environments and I wasn't able to find out what it coudl be.

 

Anyway, I would expect that  regardless of such low-level errors, the JBPM (or Hibernate, Arjuna or anything else inside) should be able to fully recover.

 

Locally I can reproduce problem by locking Mysql database tables using scenario:

1) deployed testing jbpm process whose transitions are changed automatically every minute
2) locked mysql database tables using "FLUSH TABLES WITH READ LOCK \G;" command
3) wait 1 hour
4) unlock database tables using "UNLOCK TABLES;" command
5) now errors (see above) appear in the log


Can you please give me any advice how to ensure that JBPM recovers correctly?

Maybe I omited some important configuration in Hibernate/Arjuna/JBPM, Jboss (the JBPM datasource is configured correctly now).

I cannot belive that the JBOSS restart is sufficent solution.


Thanks a lot.
Petr

Reply to this message by going to Community

Start a new discussion in jBPM at Community