[JBoss JIRA] (WFLY-3549) Deadlock during shutdown
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/WFLY-3549?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on WFLY-3549:
-----------------------------------------------
Tristan Tarrant <ttarrant(a)redhat.com> changed the Status of [bug 1184532|https://bugzilla.redhat.com/show_bug.cgi?id=1184532] from NEW to ASSIGNED
> Deadlock during shutdown
> ------------------------
>
> Key: WFLY-3549
> URL: https://issues.jboss.org/browse/WFLY-3549
> Project: WildFly
> Issue Type: Bug
> Affects Versions: 8.1.0.Final
> Reporter: Dan Berindei
> Assignee: David Lloyd
> Fix For: 8.2.0.Final
>
>
> This deadlock appeared in an Arquillian test:
> {noformat}
> Found one Java-level deadlock:
> =============================
> "undefined":
> waiting to lock monitor 0x00007f67a421bfa8 (object 0x00000000e0700480, a org.jboss.as.threads.ScheduledThreadPoolService),
> which is held by "MSC service thread 1-2"
> "MSC service thread 1-2":
> waiting for ownable synchronizer 0x00000000e0700618, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
> which is held by "undefined"
> Java stack information for the threads listed above:
> ===================================================
> "undefined":
> at org.jboss.as.threads.ScheduledThreadPoolService$ExecutorImpl.terminated(ScheduledThreadPoolService.java:121)
> - waiting to lock <0x00000000e0700480> (a org.jboss.as.threads.ScheduledThreadPoolService)
> at java.util.concurrent.ThreadPoolExecutor.tryTerminate(ThreadPoolExecutor.java:704)
> at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1006)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1163)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> at org.jboss.threads.JBossThread.run(JBossThread.java:122)
> "MSC service thread 1-2":
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000e0700618> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
> at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
> at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
> at java.util.concurrent.ThreadPoolExecutor.interruptIdleWorkers(ThreadPoolExecutor.java:781)
> at java.util.concurrent.ThreadPoolExecutor.tryTerminate(ThreadPoolExecutor.java:695)
> at java.util.concurrent.ThreadPoolExecutor.shutdown(ThreadPoolExecutor.java:1397)
> at java.util.concurrent.ScheduledThreadPoolExecutor.shutdown(ScheduledThreadPoolExecutor.java:759)
> at org.jboss.as.threads.ManagedScheduledExecutorService.internalShutdown(ManagedScheduledExecutorService.java:53)
> at org.jboss.as.threads.ScheduledThreadPoolService.stop(ScheduledThreadPoolService.java:67)
> - locked <0x00000000e0700480> (a org.jboss.as.threads.ScheduledThreadPoolService)
> at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:2056)
> at org.jboss.msc.service.ServiceControllerImpl$StopTask.run(ServiceControllerImpl.java:2017)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> Found 1 deadlock.
> {noformat}
> Looks like two MSC service threads exited and tried to terminate the thread pool at the same time. And because the MSC threads are not daemon threads, the entire JVM hangs and blocks the Arquillian test that was waiting for the container to shut down.
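Reading the dump above: the MSC thread holds the ScheduledThreadPoolService monitor (taken in stop()) and waits for the pool's internal ReentrantLock, while the pool worker holds that ReentrantLock inside tryTerminate() and waits for the service monitor in terminated(). A minimal sketch of one common way to avoid this kind of lock-order inversion is to call shutdown() only after releasing the monitor. The class and field names below are hypothetical, and this is not necessarily the change made for WFLY-3549.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a service that owns a thread pool.
public class ShutdownOutsideLock {
    private final Object stateLock = new Object();
    private ThreadPoolExecutor executor; // guarded by stateLock
    private boolean poolTerminated;      // guarded by stateLock

    void start() {
        synchronized (stateLock) {
            executor = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.SECONDS,
                    new LinkedBlockingQueue<Runnable>()) {
                @Override
                protected void terminated() {
                    // Called by the pool while its internal lock is held.
                    synchronized (stateLock) {
                        poolTerminated = true;
                    }
                }
            };
        }
    }

    void execute(Runnable task) {
        ThreadPoolExecutor e;
        synchronized (stateLock) {
            e = executor;
        }
        e.execute(task);
    }

    ThreadPoolExecutor stop() {
        ThreadPoolExecutor e;
        synchronized (stateLock) {
            // Only swap the reference under the monitor.
            e = executor;
            executor = null;
        }
        if (e != null) {
            // The monitor is no longer held here, so terminated() can acquire it.
            e.shutdown();
        }
        return e;
    }

    boolean isTerminated() {
        synchronized (stateLock) {
            return poolTerminated;
        }
    }

    // Starts the pool, runs one task, stops it, and reports whether it terminated.
    static boolean demo() {
        ShutdownOutsideLock service = new ShutdownOutsideLock();
        service.start();
        service.execute(() -> { });
        ThreadPoolExecutor pool = service.stop();
        try {
            return pool.awaitTermination(5, TimeUnit.SECONDS) && service.isTerminated();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("terminated cleanly: " + demo());
    }
}
```

The point of the sketch is only the ordering: the service monitor and the pool's internal lock are never held at the same time by the stopping thread.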
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
10 years, 11 months
[JBoss JIRA] (JGRP-1898) FD_HOST many false suspect with Full GC
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/JGRP-1898?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on JGRP-1898:
-----------------------------------------------
Dave Stahl <dstahl(a)redhat.com> changed the Status of [bug 1161529|https://bugzilla.redhat.com/show_bug.cgi?id=1161529] from VERIFIED to CLOSED
> FD_HOST many false suspect with Full GC
> ---------------------------------------
>
> Key: JGRP-1898
> URL: https://issues.jboss.org/browse/JGRP-1898
> Project: JGroups
> Issue Type: Enhancement
> Affects Versions: 3.5.1
> Reporter: Takayoshi Kimura
> Assignee: Takayoshi Kimura
> Fix For: 3.4.7, 3.5.2, 3.6.1
>
> Attachments: FD_HOSTTest.java, test-fdhost.zip
>
>
> Currently the FD_HOST PingTask has 2 loops: a ping loop and a timeout-checking loop.
> {code}
> for (host : hosts) { ping_and_update_timestamp(host) }
> current = System.currentTimeMillis();
> for (host : hosts) { compare current and (ping_timestamp + timeout) }
> {code}
> When testing with a large number of hosts, if a lengthy Full GC occurs during the ping loop, FD_HOST then checks the timeouts and counts the Full GC time in, which sometimes causes many false suspects.
> For example, with a 1 min Full GC and a 50 sec timeout, all hosts are suspected with the current implementation.
> To reduce the impact of the Full GC time, we can combine the 2 loops into 1 loop that pings each host and checks its timeout, so a Full GC delay affects only a single host and never the others.
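The difference between the two loop structures can be sketched with a simulated clock. This is an illustration only, not the JGroups implementation: the class, method, and constant names are hypothetical, every ping is assumed to succeed, and the Full GC is modelled as a 60 sec jump of the clock while the second host is being pinged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CombinedPingLoop {
    static final long TIMEOUT_MS = 50_000; // 50 sec timeout, as in the example above
    static final long PAUSE_MS = 60_000;   // simulated 1 min Full GC

    // Returns the hosts that end up suspected for the chosen loop structure.
    static List<String> simulate(boolean combined) {
        List<String> hosts = Arrays.asList("a", "b", "c");
        Map<String, Long> pingTimestamp = new HashMap<>();
        List<String> suspects = new ArrayList<>();
        long clock = 0L; // simulated time in milliseconds

        for (String host : hosts) {
            if ("b".equals(host)) {
                clock += PAUSE_MS; // the pause hits while "b" is being pinged
            }
            pingTimestamp.put(host, clock); // ping succeeded, record its time
            if (combined) {
                // Single loop: check this host right after its own ping.
                long now = clock;
                if (now > pingTimestamp.get(host) + TIMEOUT_MS) {
                    suspects.add(host);
                }
            }
        }

        if (!combined) {
            // Two loops: one clock reading is compared against every timestamp.
            long current = clock;
            for (String host : hosts) {
                if (current > pingTimestamp.get(host) + TIMEOUT_MS) {
                    suspects.add(host);
                }
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        System.out.println("two loops: " + simulate(false)); // prints "two loops: [a]"
        System.out.println("one loop:  " + simulate(true));  // prints "one loop:  []"
    }
}
```

In the two-loop run, host "a" is suspected even though its ping succeeded, because the pause falls between its ping and the shared clock reading. In the single-loop run the pause cannot separate a host's timestamp from the check of any other host.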
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
10 years, 11 months