[jboss-jira] [JBoss JIRA] (AG-145) Active waiting deadlock in StampedCopyOnWriteArrayList
Rene Böing (Jira)
issues at jboss.org
Fri Jul 31 02:46:00 EDT 2020
[ https://issues.redhat.com/browse/AG-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rene Böing updated AG-145:
--------------------------
Description:
While using the agroal connection pool, we discovered rare deadlocks that cause 100% CPU usage on some threads. These deadlocks occur in the StampedCopyOnWriteArrayList class when more than one thread tries to remove the same object.
A simple JUnit reproducer (it fails nearly every time on my machine):
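{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
// Assumed imports: JUnit 4 and the Agroal pool utility package.
import org.junit.Test;
import io.agroal.pool.util.StampedCopyOnWriteArrayList;

public class StampedCopyOnWriteArrayListTest {

    @Test
    public void testThis() {
        ExecutorService service = Executors.newFixedThreadPool(10);
        StampedCopyOnWriteArrayList<Object> list = new StampedCopyOnWriteArrayList<>(Object.class);
        Object o = new Object();
        list.add(new Object());
        list.add(new Object());
        list.add(new Object());
        list.add(new Object());
        list.add(o);
        list.add(new Object());

        // Ten tasks that all try to remove the same element.
        List<Runnable> runnerList = new ArrayList<>(10);
        List<Future<?>> futureList = new ArrayList<>(10);
        for (int i = 0; i < 10; i++) {
            runnerList.add(() -> {
                list.remove(o);
                System.out.println("Removed success!");
            });
        }
        for (Runnable r : runnerList) {
            futureList.add(service.submit(r));
        }

        // A task that has not finished after 10 seconds is assumed to be stuck.
        for (Future<?> r : futureList) {
            try {
                r.get(10000, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                e.printStackTrace();
            } catch (ExecutionException e) {
                e.printStackTrace();
            } catch (TimeoutException e) {
                System.out.println("Seems like we have a deadlock!");
            }
        }
    }
}
{code}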
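For comparison, here is a minimal sketch of a removal that cannot spin when the element is already gone. It is not Agroal's implementation: the class SafeRemoveSketch and all of its methods are hypothetical, and it only assumes a copy-on-write array guarded by a java.util.concurrent.locks.StampedLock. The point it illustrates is that a remover which loses the race returns false instead of retrying.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.StampedLock;

// Hypothetical stand-in for illustration only, NOT Agroal's StampedCopyOnWriteArrayList.
final class SafeRemoveSketch<T> {
    private final StampedLock lock = new StampedLock();
    private volatile Object[] data = new Object[0];

    void add(T element) {
        long stamp = lock.writeLock();
        try {
            Object[] next = Arrays.copyOf(data, data.length + 1);
            next[next.length - 1] = element;
            data = next;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    // Returns true only for the one caller that actually removed the element.
    boolean remove(Object element) {
        long stamp = lock.writeLock();
        try {
            Object[] current = data;
            for (int i = 0; i < current.length; i++) {
                if (current[i] == element) { // identity comparison
                    Object[] next = new Object[current.length - 1];
                    System.arraycopy(current, 0, next, 0, i);
                    System.arraycopy(current, i + 1, next, i, current.length - i - 1);
                    data = next;
                    return true;
                }
            }
            return false; // already gone: report it instead of retrying
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    int size() {
        return data.length;
    }

    // Races `threads` removals of one shared element and returns how many succeeded.
    static int raceRemovals(int threads) {
        SafeRemoveSketch<Object> list = new SafeRemoveSketch<>();
        Object target = new Object();
        list.add(new Object());
        list.add(target);
        list.add(new Object());
        AtomicInteger successes = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                if (list.remove(target)) {
                    successes.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("removals did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return successes.get();
    }
}
```

In this sketch, when several threads race on the same element, exactly one call to remove returns true and the others return false, so every submitted task completes.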
Originally this deadlock seems to occur when agroal tries to flush a connection because of the config parameter
<property name="hibernate.agroal.maxLifetime_m">60</property>
If another thread using this connection calls session.close at the same time, the same code in the ConnectionPool class can be reached twice for that connection. The parameter takes effect through the following path:
!image-2020-07-31-08-39-28-630.png!
The parallel session.close call does not find the connection in the checked_out state and tries to flush it instead, so two threads end up in the deadlock situation:
!image-2020-07-31-08-40-41-968.png!
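One way to keep two threads from flushing the same connection is to let them compete on a single atomic state transition, so that only the winner goes on to remove the connection from the pool. The following is a sketch under that assumption, not Agroal's code: the class SingleFlushSketch, its State enum and its method names are hypothetical and only mirror the wording used above.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch, not Agroal's ConnectionPool or connection handler code.
final class SingleFlushSketch {
    // State names are assumptions chosen to mirror the wording of the report.
    enum State { CHECKED_IN, CHECKED_OUT, FLUSH, DESTROYED }

    private final AtomicReference<State> state = new AtomicReference<>(State.CHECKED_OUT);
    private final AtomicInteger removals = new AtomicInteger();

    // Only the thread that wins the compare-and-set performs the removal.
    boolean flushOnce() {
        boolean claimed = state.compareAndSet(State.CHECKED_OUT, State.FLUSH)
                || state.compareAndSet(State.CHECKED_IN, State.FLUSH);
        if (!claimed) {
            return false; // another thread already owns the flush
        }
        removals.incrementAndGet(); // stand-in for removing the connection from the pool list
        state.set(State.DESTROYED);
        return true;
    }

    int removalCount() {
        return removals.get();
    }

    State state() {
        return state.get();
    }
}
```

With this guard, a second flushOnce call for the same connection returns false without touching the list, whether it comes from the maxLifetime flush or from a parallel session.close.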
Kind regards,
Rene
> Active waiting deadlock in StampedCopyOnWriteArrayList
> ------------------------------------------------------
>
> Key: AG-145
> URL: https://issues.redhat.com/browse/AG-145
> Project: Agroal
> Issue Type: Enhancement
> Affects Versions: 1.8
> Reporter: Rene Böing
> Assignee: Luis Barreiro
> Priority: Critical
> Attachments: image-2020-07-31-08-39-28-630.png, image-2020-07-31-08-40-41-968.png
--
This message was sent by Atlassian Jira
(v7.13.8#713008)