[hibernate-dev] Missing transaction hangs the testsuite
Steve Ebersole
steve at hibernate.org
Fri Jul 17 18:04:01 EDT 2015
So this is definitely not limited to 5.0. This happens in 4.3 as well. I
pointed 4.3 locally to my MariaDB instance and it is hanging. In every
case I have seen so far, this is problems in the test code though.
On Fri, Jul 17, 2015 at 1:33 PM Steve Ebersole <steve at hibernate.org> wrote:
> Here is something else interesting...
>
> So at the point that the test hang against MySQL for me, I have:
>
> MariaDB [(none)]> SHOW FULL processlist;
>
> +-----+--------------------+-----------------+--------------------+---------+------+---------------------------------+--------------------------------+----------+
> | Id | User | Host | db |
> Command | Time | State | Info
> | Progress |
>
> +-----+--------------------+-----------------+--------------------+---------+------+---------------------------------+--------------------------------+----------+
> | 138 | root | localhost | NULL | Query
> | 0 | init | SHOW FULL processlist
> | 0.000 |
> | 139 | hibernate_orm_test | localhost:52853 | hibernate_orm_test | Sleep
> | 12 | | NULL
> | 0.000 |
> | 140 | hibernate_orm_test | localhost:52854 | hibernate_orm_test | Query
> | 12 | Waiting for table metadata lock | drop table if exists TheEntity
> | 0.000 |
>
> +-----+--------------------+-----------------+--------------------+---------+------+---------------------------------+--------------------------------+----------+
>
> So we see 2 JDBC connections with process ids 139 and 140. You can see
> 139 blocked waiting to be able to drop a table; specifically waiting on a
> lock of some sort.
>
> However, `SHOW OPEN TABLES;` shows no tables in use and no tables locked.
>
> Unfortunately that is the extent of my "MySQL debugging skills".
>
>
> On Fri, Jul 17, 2015 at 11:14 AM Steve Ebersole <steve at hibernate.org>
> wrote:
>
>> Process != test. Process here == connection. You have multiple.
>>
>> On Fri, Jul 17, 2015, 11:12 AM Sanne Grinovero <sanne at hibernate.org>
>> wrote:
>>
>>> On 17 July 2015 at 16:25, Steve Ebersole <steve at hibernate.org> wrote:
>>> > It would seem that this is a regression with ORM, I agree. However,
>>> we do
>>> > not know how recent of a regression. And that is the problem. We
>>> have no
>>> > idea how far back to look. Because the ORM CI jobs against MySQL,
>>> > PostrgreSQL, Oracle, etc have not been running for many months. IIRC
>>> even
>>> > when they were running on the old CI servers we had problems with
>>> those jobs
>>> > that were not looked at for some time.
>>>
>>> Sure, same problem here: not run these tests for months, and you're
>>> right I could do a quick check on a slightly older branch which is
>>> using ORM 4.3 and I have the same problem.
>>>
>>> > As far as what you see, I am not convinced it is what you think it is
>>> in
>>> > terms of auto-commit for queries run outside of a transaction. This
>>> part
>>> > has not changed in 5.0. Assuming you are using the "built in
>>> Connection
>>> > pool", there is a setting that controls the auto-commit mode of the
>>> > connections we create (hibernate.connection.autocommit). That setting
>>> has
>>> > always been there, and it has always defaulted to false. This has not
>>> > changed. In fact the same exact code is used in 4.3 and 5.0 for
>>> creating
>>> > these Connections:
>>> >
>>> org.hibernate.engine.jdbc.connections.internal.BasicConnectionCreator. The
>>> > autoCommit value in there comes from the setting I just mentioned. The
>>> > logical outcome is the same even before 4.3, just that it was done
>>> > differently. So bottom line, if you are not seeing auto-commit on
>>> these
>>> > Connections now, you never were. Now it is possible that their are
>>> some
>>> > ancillary changes that cause the lack of auto-commit to now be an issue
>>> > where it was not before. So first, I'd need to know whether the
>>> Connection
>>> > here is auto-commit or is not. From there we can start drilling down
>>> on
>>> > what might be a cause. But keep in mind too, that with hangs and
>>> deadlocks
>>> > the process that hangs is often not the culprit; often it is the
>>> previous
>>> > process that is holding locks that is the culprit.
>>>
>>> I understand, but in this specific case I can reproduce the issue by
>>> running a single test.
>>> Also that psql SQL I shared clearly shows which statement is causing it
>>> to hang,
>>> and I could fix the test by just wrapping that statement in a
>>> transaction.
>>>
>>> >
>>> > My case is actually different, at least on the surface. Mine happens
>>> on
>>> > dropping tables after the test is complete (for MySQL anyway). Which
>>> got me
>>> > to looking at schema export and friends and how they handle
>>> connections.
>>> > They handle connections quite differently. Again the same as they
>>> used to,
>>> > but I mean very different from how sessions manage connections. In
>>> schema
>>> > export the connections are always set to auto-commit mode at the start
>>> of
>>> > processing. Then they never do anything with auto-commit afterwards.
>>> I am
>>> > not sure this is right.
>>>
>>> Actually it sounds like you're having the same problem. I debugged the
>>> PosgreSQL tests first, but I had the same on MySQL (as in the
>>> stacktrace I shared),
>>> and the test also hangs on schema drop.
>>> The problem is that the schema can't be dropped because a previous
>>> statement - from a session which is now correctly closed - is still
>>> holding locks.
>>>
>>> Thanks,
>>> Sanne
>>>
>>> >
>>> >
>>> >
>>> >
>>> > On Fri, Jul 17, 2015 at 3:51 AM Sanne Grinovero <sanne at hibernate.org>
>>> wrote:
>>> >>
>>> >> Thanks Steve, I wasn't sure if I had found a problem.
>>> >> Still, it looks like the Hibernate Search tests are not benefiting
>>> >> from auto-commit, or I wouldn't have this problem right? Our testing
>>> >> configuration properties have been the same for years and I'm just
>>> >> re-enabling tests which haven't been run for a while, so I suspect
>>> >> something changed in ORM.
>>> >>
>>> >> I think I'd expect Hibernate ORM to automatically enable auto-commit
>>> >> on operations not running within a transaction, to make sure all locks
>>> >> are released when the Session is closed.
>>> >>
>>> >> Thanks,
>>> >> Sanne
>>> >>
>>> >> On 17 July 2015 at 00:14, Steve Ebersole <steve at hibernate.org> wrote:
>>> >> > For what it is worth, I am running into the same problems on MySQL
>>> for
>>> >> > the
>>> >> > ORM tests in terms of blocks on dropping tables. PGSQL at least
>>> >> > finishes,
>>> >> > although I run into 55 errors which I still need to triage.
>>> >> >
>>> >> > On Thu, Jul 16, 2015 at 5:59 PM Steve Ebersole <steve at hibernate.org
>>> >
>>> >> > wrote:
>>> >> >>
>>> >> >> There is really nothing to improve. What would you suggest as an
>>> >> >> improvement? I mean from a ORM perspective, we have an ato-commit
>>> >> >> Connection (presumably) and run a query. How on earth would we
>>> know to
>>> >> >> expect that the database is holding locks for that?
>>> >> >>
>>> >> >> As far as "requiring transactions" I have tried to fly that plane
>>> >> >> multiple
>>> >> >> times in the past. I think it is absolutely ridiculous to allow
>>> these
>>> >> >> auto-commit use cases. In fact I personally wanted to take it to
>>> the
>>> >> >> more
>>> >> >> extreme case of always using JTA and simply mimicking "single
>>> resource
>>> >> >> JTA
>>> >> >> transactions" internally in the "JDBC transaction" case. Gavin
>>> and I
>>> >> >> both
>>> >> >> did in fact. But we got a lot of push back.
>>> >> >>
>>> >> >> So I am not sure exactly what the improvement here might be. What
>>> is
>>> >> >> it
>>> >> >> you were thinking specifically?
>>> >> >>
>>> >> >> On Thu, Jul 16, 2015 at 5:36 PM Sanne Grinovero <
>>> sanne at hibernate.org>
>>> >>
>>> >> >> wrote:
>>> >> >>>
>>> >> >>> I finally got to re-enable MariaDB and PostgreSQL based tests for
>>> >> >>> Hibernate Search - which has been running on H2 only for some
>>> months -
>>> >> >>> and had to debug a case of a single test hanging for a long time.
>>> >> >>>
>>> >> >>> Essentially it would block - for hours - on the
>>> SessionFactory#close()
>>> >> >>> method, attempting to drop the database schema with the following
>>> >> >>> statement:
>>> >> >>> > alter table AncientBook_alternativeTitles drop constraint
>>> >> >>> FKn8hhkmhof1mdgc4oi77ccq989
>>> >> >>>
>>> >> >>> Dumping threads I would get a very similar stack trace on both
>>> >> >>> databases; initially I thought someone had copy/pasted a socket
>>> >> >>> handling bug from one JDBC driver to the other ;-)
>>> >> >>>
>>> >> >>> The PostgreSQL testsuite hanging:
>>> >> >>>
>>> >> >>> "main" prio=10 tid=0x00007f0f40009000 nid=0x5f7c runnable
>>> >> >>> [0x00007f0f48956000]
>>> >> >>> java.lang.Thread.State: RUNNABLE
>>> >> >>> at java.net.SocketInputStream.socketRead0(Native Method)
>>> >> >>> at java.net.SocketInputStream.read(SocketInputStream.java:152)
>>> >> >>> at java.net.SocketInputStream.read(SocketInputStream.java:122)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:145)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:114)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
>>> >> >>> at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:274)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1660)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
>>> >> >>> - locked <0x00000007c11e3860> (a
>>> >> >>> org.postgresql.core.v3.QueryExecutorImpl)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:374)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:302)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.hibernate.tool.hbm2ddl.DatabaseExporter.export(DatabaseExporter.java:47)
>>> >> >>> at
>>> >> >>>
>>> org.hibernate.tool.hbm2ddl.SchemaExport.perform(SchemaExport.java:476)
>>> >> >>> at
>>> >> >>>
>>> org.hibernate.tool.hbm2ddl.SchemaExport.execute(SchemaExport.java:430)
>>> >> >>> at
>>> org.hibernate.tool.hbm2ddl.SchemaExport.drop(SchemaExport.java:375)
>>> >> >>> at
>>> org.hibernate.tool.hbm2ddl.SchemaExport.drop(SchemaExport.java:371)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.hibernate.internal.SessionFactoryImpl.close(SessionFactoryImpl.java:1069)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.hibernate.search.test.util.FullTextSessionBuilder.close(FullTextSessionBuilder.java:149)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.hibernate.search.test.util.FullTextSessionBuilder$1.evaluate(FullTextSessionBuilder.java:248)
>>> >> >>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>>> >> >>> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
>>> >> >>>
>>> >> >>> The MariaDB testsuite hanging:
>>> >> >>>
>>> >> >>> "main" prio=10 tid=0x00007f8ca0009000 nid=0x4043 runnable
>>> >> >>> [0x00007f8ca5f5c000]
>>> >> >>> java.lang.Thread.State: RUNNABLE
>>> >> >>> at java.net.SocketInputStream.socketRead0(Native Method)
>>> >> >>> at java.net.SocketInputStream.read(SocketInputStream.java:152)
>>> >> >>> at java.net.SocketInputStream.read(SocketInputStream.java:122)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:114)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:161)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:189)
>>> >> >>> - locked <0x00000007c0baa518> (a
>>> >> >>> com.mysql.jdbc.util.ReadAheadInputStream)
>>> >> >>> at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2499)
>>> >> >>> at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2952)
>>> >> >>> at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2941)
>>> >> >>> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3489)
>>> >> >>> at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959)
>>> >> >>> at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113)
>>> >> >>> at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2562)
>>> >> >>> - locked <0x00000007c0baa850> (a java.lang.Object)
>>> >> >>> at
>>> com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1664)
>>> >> >>> - locked <0x00000007c0baa850> (a java.lang.Object)
>>> >> >>> at
>>> com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1583)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.hibernate.tool.hbm2ddl.DatabaseExporter.export(DatabaseExporter.java:47)
>>> >> >>> at
>>> >> >>>
>>> org.hibernate.tool.hbm2ddl.SchemaExport.perform(SchemaExport.java:476)
>>> >> >>> at
>>> >> >>>
>>> org.hibernate.tool.hbm2ddl.SchemaExport.execute(SchemaExport.java:430)
>>> >> >>> at
>>> org.hibernate.tool.hbm2ddl.SchemaExport.drop(SchemaExport.java:375)
>>> >> >>> at
>>> org.hibernate.tool.hbm2ddl.SchemaExport.drop(SchemaExport.java:371)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.hibernate.internal.SessionFactoryImpl.close(SessionFactoryImpl.java:1069)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.hibernate.search.test.util.FullTextSessionBuilder.close(FullTextSessionBuilder.java:149)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.hibernate.search.test.util.FullTextSessionBuilder$1.evaluate(FullTextSessionBuilder.java:248)
>>> >> >>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>>> >> >>> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
>>> >> >>> at
>>> >> >>>
>>> >> >>>
>>> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
>>> >> >>>
>>> >> >>> On Gail's suggestion I went looking for database level locks; I
>>> hadn't
>>> >> >>> thought of that as I assumed it would have timed out more
>>> aggressively
>>> >> >>> rather than have me wait for hours.
>>> >> >>> It turns out she was right and the reason for blocking was a
>>> simple
>>> >> >>> "count *" query being run as a post-test assertion, which we had
>>> >> >>> forgotten to wrap in an "open transaction & commit" statements
>>> pair.
>>> >> >>> The test assertion would be successful, but apparently it would
>>> hold
>>> >> >>> on the table lock beyond closing the Session and, then fail to
>>> drop
>>> >> >>> the database at the teardown of the test.
>>> >> >>>
>>> >> >>> I'm wondering if this is expected, or if there's something in the
>>> new
>>> >> >>> transaction handling code which could be improved? It took me
>>> several
>>> >> >>> hours to figure this out; maybe I'm just not using ORM as
>>> frequently
>>> >> >>> as once :)
>>> >> >>> If it's this critical to have the transaction, maybe it should be
>>> >> >>> mandatory?
>>> >> >>>
>>> >> >>> And as a memo for next time, this is the query to figure out
>>> details
>>> >> >>> about locks on our testing db:
>>> >> >>> > SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON
>>> pl.pid
>>> >> >>> = psa.pid where datname = 'testingdb';
>>> >> >>>
>>> >> >>> Thanks,
>>> >> >>> Sanne
>>> >> >>> _______________________________________________
>>> >> >>> hibernate-dev mailing list
>>> >> >>> hibernate-dev at lists.jboss.org
>>> >> >>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>>>
>>
More information about the hibernate-dev
mailing list