[jboss-jira] [JBoss JIRA] (WFLY-13357) (Regression) Execution of concurrent batch jobs containing partitioned steps causes deadlock

Cheng Fang (Jira) issues at jboss.org
Wed Apr 29 14:10:21 EDT 2020


    [ https://issues.redhat.com/browse/WFLY-13357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066569#comment-14066569 ] 

Cheng Fang commented on WFLY-13357:
-----------------------------------

To reproduce it, see https://github.com/jberet/jberet-wildfly-samples/tree/master/throttle

> (Regression) Execution of concurrent batch jobs containing partitioned steps causes deadlock
> -------------------------------------------------------------------------------------------
>
>                 Key: WFLY-13357
>                 URL: https://issues.redhat.com/browse/WFLY-13357
>             Project: WildFly
>          Issue Type: Bug
>          Components: Batch
>    Affects Versions: 19.0.0.Final
>            Reporter: Felix König
>            Assignee: Cheng Fang
>            Priority: Major
>             Fix For: 20.0.0.Beta1
>
>
> Hello,
> the issue described in JBERET-180 seems to have reappeared. I am running WildFly 16 with jberet-1.3.3. Given the default batch thread count of 10, I was able to produce a deadlock by starting 10 instances of a partitioned job simultaneously. None of the jobs runs fast enough to finish before all 10 jobs have been started. All 10 batch threads are stuck here:
> {code}
> "Batch Thread - 1@33537" prio=5 tid=0x109 nid=NA waiting
>   java.lang.Thread.State: WAITING
> 	  at jdk.internal.misc.Unsafe.park(Unknown Source:-1)
> 	  at java.util.concurrent.locks.LockSupport.park(Unknown Source:-1)
> 	  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source:-1)
> 	  at java.util.concurrent.ArrayBlockingQueue.take(Unknown Source:-1)
> 	  at org.jberet.runtime.runner.StepExecutionRunner.beginPartition(StepExecutionRunner.java:350)
> 	  at org.jberet.runtime.runner.StepExecutionRunner.runBatchletOrChunk(StepExecutionRunner.java:222)
> 	  at org.jberet.runtime.runner.StepExecutionRunner.run(StepExecutionRunner.java:144)
> 	  at org.jberet.runtime.runner.CompositeExecutionRunner.runStep(CompositeExecutionRunner.java:164)
> 	  at org.jberet.runtime.runner.CompositeExecutionRunner.runFromHeadOrRestartPoint(CompositeExecutionRunner.java:88)
> 	  at org.jberet.runtime.runner.JobExecutionRunner.run(JobExecutionRunner.java:60)
> 	  at org.wildfly.extension.batch.jberet.deployment.BatchEnvironmentService$WildFlyBatchEnvironment$1.run(BatchEnvironmentService.java:180)
> 	  at org.wildfly.extension.requestcontroller.RequestController$QueuedTask$1.run(RequestController.java:494)
> 	  at org.jberet.spi.JobExecutor$2.run(JobExecutor.java:149)
> 	  at org.jberet.spi.JobExecutor$1.run(JobExecutor.java:99)
> 	  at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source:-1)
> 	  at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source:-1)
> 	  at java.lang.Thread.run(Unknown Source:-1)
> 	  at org.jboss.threads.JBossThread.run(JBossThread.java:485)
> {code}
> which is this line of code:
> {code:java}
> completedPartitionThreads.take();
> {code}
> Occasionally, some threads get stuck at line 364 instead, which is:
> {code:java}
> final Serializable data = collectorDataQueue.take();
> {code}
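
The pattern in the report above (every pool thread runs a job that blocks in take() while the partitions it is waiting for sit in the queue of the same pool) can be modelled outside of JBeret. The following is only a minimal sketch under stated assumptions: the class PoolStarvationDemo, the method runJobs, and its parameters are hypothetical names, and a plain fixed-size java.util.concurrent pool stands in for the batch thread pool. It is not JBeret or WildFly code. The latch mimics the reported condition that no job finishes before all jobs have started.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-alone model of the reported starvation; not JBeret code.
class PoolStarvationDemo {

    /** Returns how many jobs saw all of their partitions complete within the timeout. */
    public static int runJobs(int poolSize, int jobs, int partitionsPerJob, long timeoutMillis)
            throws Exception {
        if (jobs > poolSize) {
            throw new IllegalArgumentException("this sketch assumes all jobs can start at once");
        }
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        // Assumption from the report: every job has started before any job finishes.
        CountDownLatch allJobsStarted = new CountDownLatch(jobs);
        List<Future<Boolean>> results = new ArrayList<>();
        try {
            for (int j = 0; j < jobs; j++) {
                results.add(pool.submit(() -> {
                    allJobsStarted.countDown();
                    allJobsStarted.await();
                    BlockingQueue<Boolean> completedPartitions =
                            new ArrayBlockingQueue<>(partitionsPerJob);
                    // Partitions are handed to the SAME pool the job is running on.
                    for (int p = 0; p < partitionsPerJob; p++) {
                        pool.execute(() -> completedPartitions.offer(Boolean.TRUE));
                    }
                    // The job thread blocks until every partition has reported back.
                    for (int p = 0; p < partitionsPerJob; p++) {
                        completedPartitions.take();
                    }
                    return true;
                }));
            }
            int finished = 0;
            for (Future<Boolean> result : results) {
                try {
                    result.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    finished++;
                } catch (TimeoutException e) {
                    // The job is still parked in take(): no thread is free for its partitions.
                }
            }
            return finished;
        } finally {
            pool.shutdownNow(); // interrupts any job thread still blocked in take()
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool=2, jobs=2: finished=" + runJobs(2, 2, 2, 300));
        System.out.println("pool=3, jobs=2: finished=" + runJobs(3, 2, 2, 5000));
    }
}
```

In this model, a pool exactly as large as the number of concurrently started jobs leaves no thread for any partition, so no job finishes; one spare thread is enough for all partitions to drain. Whether the same holds in a given WildFly setup depends on how its batch thread pool is configured.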


