I finally got to re-enable MariaDB and PostgreSQL based tests for
Hibernate Search - which has been running on H2 only for some months -
and had to debug a case of a single test hanging for a long time.
Essentially it would block - for hours - on the SessionFactory#close()
method, attempting to drop the database schema with the following
statement:
alter table AncientBook_alternativeTitles drop constraint
FKn8hhkmhof1mdgc4oi77ccq989
Dumping threads I would get a very similar stack trace on both
databases; initially I thought someone had copy/pasted a socket
handling bug from one JDBC driver to the other ;-)
The PostgreSQL testsuite hanging:
"main" prio=10 tid=0x00007f0f40009000 nid=0x5f7c runnable [0x00007f0f48956000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at
org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:145)
at
org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:114)
at
org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:274)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1660)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
- locked <0x00000007c11e3860> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
at
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:374)
at
org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:302)
at org.hibernate.tool.hbm2ddl.DatabaseExporter.export(DatabaseExporter.java:47)
at org.hibernate.tool.hbm2ddl.SchemaExport.perform(SchemaExport.java:476)
at org.hibernate.tool.hbm2ddl.SchemaExport.execute(SchemaExport.java:430)
at org.hibernate.tool.hbm2ddl.SchemaExport.drop(SchemaExport.java:375)
at org.hibernate.tool.hbm2ddl.SchemaExport.drop(SchemaExport.java:371)
at org.hibernate.internal.SessionFactoryImpl.close(SessionFactoryImpl.java:1069)
at
org.hibernate.search.test.util.FullTextSessionBuilder.close(FullTextSessionBuilder.java:149)
at
org.hibernate.search.test.util.FullTextSessionBuilder$1.evaluate(FullTextSessionBuilder.java:248)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
The MariaDB testsuite hanging:
"main" prio=10 tid=0x00007f8ca0009000 nid=0x4043 runnable [0x00007f8ca5f5c000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:114)
at
com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:161)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:189)
- locked <0x00000007c0baa518> (a com.mysql.jdbc.util.ReadAheadInputStream)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2499)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2952)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2941)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3489)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2562)
- locked <0x00000007c0baa850> (a java.lang.Object)
at com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1664)
- locked <0x00000007c0baa850> (a java.lang.Object)
at com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1583)
at org.hibernate.tool.hbm2ddl.DatabaseExporter.export(DatabaseExporter.java:47)
at org.hibernate.tool.hbm2ddl.SchemaExport.perform(SchemaExport.java:476)
at org.hibernate.tool.hbm2ddl.SchemaExport.execute(SchemaExport.java:430)
at org.hibernate.tool.hbm2ddl.SchemaExport.drop(SchemaExport.java:375)
at org.hibernate.tool.hbm2ddl.SchemaExport.drop(SchemaExport.java:371)
at org.hibernate.internal.SessionFactoryImpl.close(SessionFactoryImpl.java:1069)
at
org.hibernate.search.test.util.FullTextSessionBuilder.close(FullTextSessionBuilder.java:149)
at
org.hibernate.search.test.util.FullTextSessionBuilder$1.evaluate(FullTextSessionBuilder.java:248)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
On Gail's suggestion I went looking for database level locks; I hadn't
thought of that as I assumed it would have timed out more aggressively
rather than have me wait for hours.
It turns out she was right and the reason for blocking was a simple
"count *" query being run as a post-test assertion, which we had
forgotten to wrap in an "open transaction & commit" statements pair.
The test assertion would be successful, but apparently it would hold
on the table lock beyond closing the Session and, then fail to drop
the database at the teardown of the test.
I'm wondering if this is expected, or if there's something in the new
transaction handling code which could be improved? It took me several
hours to figure this out; maybe I'm just not using ORM as frequently
as once :)
If it's this critical to have the transaction, maybe it should be mandatory?
And as a memo for next time, this is the query to figure out details
about locks on our testing db:
SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid
= psa.pid where datname = 'testingdb';
Thanks,
Sanne