]
Gustavo Fernandes updated ISPN-12558:
-------------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 12.0.0.CR1
Resolution: Done
StackOverflowError during REST connection shutdown
--------------------------------------------------
Key: ISPN-12558
URL:
https://issues.redhat.com/browse/ISPN-12558
Project: Infinispan
Issue Type: Bug
Reporter: Dan Berindei
Assignee: Dan Berindei
Priority: Major
Labels: testsuite_stability
Fix For: 12.0.0.CR1
When the REST server accepts a HTTP2 SSL connection, it initializes a {{OpenSSLEngine}}
provided by {{wildfly-openssl-java}}:
{noformat}
at org.wildfly.openssl.OpenSSLEngine.initSsl(OpenSSLEngine.java:188)
at org.wildfly.openssl.OpenSSLEngine.unwrap(OpenSSLEngine.java:532)
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:637)
at io.netty.handler.ssl.JdkSslEngine.unwrap(JdkSslEngine.java:92)
at io.netty.handler.ssl.JdkAlpnSslEngine.unwrap(JdkAlpnSslEngine.java:143)
at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:282)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1380)
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1275)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1322)
at
io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)
at
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
at
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at
io.netty.handler.codec.ByteToMessageDecoder.handlerRemoved(ByteToMessageDecoder.java:253)
at
io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:508)
at
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
at
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
at
org.infinispan.server.core.transport.StatsChannelHandler.channelRead(StatsChannelHandler.java:26)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at
io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475)
{noformat}
When the client closes the connection, is sends a {{close_notify}} message, and
{{org.wildfly.openssl.OpenSSLEngine.unwrap()}} automatically shuts down the engine:
{noformat}
at org.wildfly.openssl.OpenSSLEngine.shutdown(OpenSSLEngine.java:205)
at org.wildfly.openssl.OpenSSLEngine.closeInbound(OpenSSLEngine.java:692)
at org.wildfly.openssl.OpenSSLEngine.unwrap(OpenSSLEngine.java:657)
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:637)
at io.netty.handler.ssl.JdkSslEngine.unwrap(JdkSslEngine.java:92)
at io.netty.handler.ssl.JdkAlpnSslEngine.unwrap(JdkAlpnSslEngine.java:143)
at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:282)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1380)
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1275)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1322)
at
io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)
at
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
at
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
at
org.infinispan.server.core.transport.StatsChannelHandler.channelRead(StatsChannelHandler.java:26)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at
io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475)
{noformat}
But then {{Http2MultiplexCodec}} also wants to flush its output buffer (which happens to
be an {{EmptyByteBuf}}).
It cannot flush because the {{OpenSSLEngine}} instance is no longer valid:
{noformat}
java.lang.IllegalStateException: ssl is null
at org.wildfly.openssl.SSLImpl.getSessionId0(Native Method)
at org.wildfly.openssl.SSLImpl.getSessionId(SSLImpl.java:494)
at org.wildfly.openssl.OpenSSLEngine.getSession(OpenSSLEngine.java:977)
at io.netty.handler.ssl.JdkSslEngine.getSession(JdkSslEngine.java:48)
at
io.netty.handler.ssl.SslHandler$SslEngineType$3.allocateWrapBuffer(SslHandler.java:312)
at io.netty.handler.ssl.SslHandler.allocateOutNetBuf(SslHandler.java:2207)
at io.netty.handler.ssl.SslHandler.wrap(SslHandler.java:840)
at io.netty.handler.ssl.SslHandler.wrapAndFlush(SslHandler.java:811)
at io.netty.handler.ssl.SslHandler.flush(SslHandler.java:792)
at
io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:750)
at
io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:742)
at
io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:728)
at
io.netty.handler.codec.http2.Http2ConnectionHandler.flush(Http2ConnectionHandler.java:189)
at
io.netty.handler.codec.http2.Http2MultiplexCodec.flush0(Http2MultiplexCodec.java:282)
at
io.netty.handler.codec.http2.Http2MultiplexCodec.processPendingReadCompleteQueue(Http2MultiplexCodec.java:261)
at
io.netty.handler.codec.http2.Http2MultiplexCodec.channelReadComplete(Http2MultiplexCodec.java:240)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelReadComplete(AbstractChannelHandlerContext.java:410)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelReadComplete(AbstractChannelHandlerContext.java:397)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelReadComplete(AbstractChannelHandlerContext.java:390)
at io.netty.handler.ssl.SslHandler.channelReadComplete0(SslHandler.java:1341)
at io.netty.handler.ssl.SslHandler.channelReadComplete(SslHandler.java:1330)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelReadComplete(AbstractChannelHandlerContext.java:410)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelReadComplete(AbstractChannelHandlerContext.java:397)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelReadComplete(AbstractChannelHandlerContext.java:390)
at
io.netty.channel.DefaultChannelPipeline$HeadContext.channelReadComplete(DefaultChannelPipeline.java:1415)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelReadComplete(AbstractChannelHandlerContext.java:410)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelReadComplete(AbstractChannelHandlerContext.java:397)
at
io.netty.channel.DefaultChannelPipeline.fireChannelReadComplete(DefaultChannelPipeline.java:925)
at
io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:812)
{noformat}
Our {{RestRequestHandler}} catches the exception, logs it, then wants to close the
connection, which requires a flush.
The flush throws an {{IllegalStateException}}, which {{RestRequestHandler}} catches
again, and the whole thing is repeated until eventually it leads to a
{{StackOverflowError}}.
After the first {{StackOverflowError}}, some new classes are loaded. The test suite has a
BlockHound instrumentation agent (using ByteBuddy internally), and when they also throw
{{StackOverflowError}}, the JVM logs an assertion failure message:
{noformat}
[ERROR] *** java.lang.instrument ASSERTION FAILED ***: "!errorOutstanding" with
message transform method call failed at
./src/java.instrument/share/native/libinstrument/JPLISAgent.c line: 873
{noformat}
The problem can always be reproduced by running
{{org.infinispan.rest.resources.CacheV2ResourceTest.testSearchStatistics[security=true,
protocol=HTTP_20, ssl=true]}}. In CI, it sometimes hangs the build (not even the long test
5 minute timeout works).