[jboss-jira] [JBoss JIRA] (AS7-5967) RemoteNamingTestCase intermittently fails on IBM JDK due to org.jboss.remoting3.NotOpenException: Endpoint is not open

Brian Stansberry (JIRA) jira-events at lists.jboss.org
Wed Nov 21 21:52:21 EST 2012


    [ https://issues.jboss.org/browse/AS7-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736227#comment-12736227 ] 

Brian Stansberry edited comment on AS7-5967 at 11/21/12 9:51 PM:
-----------------------------------------------------------------

When this fails, is it ever the first test executed that fails, or is it always the 2nd or third?

I'm concerned this patch is covering over a race condition.

If you look at your trace logging, you can see that the DEBUG stack trace shows the javax.naming.InitialContext has a ref to the RemoteContext. The test execution stack has a reference to the InitialContext, even without your patch. This is sufficient to keep the RemoteContext from being gc'd, unless the JVM is fundamentally broken (unlikely in the extreme.) 

The patch creating a local variable reference to the RemoteContext is just another form of reference on the stack and shouldn't be preventing the RemoteContext being gc'd, since it shouldn't be gc'd anyway.

I suspect what the patch is really doing that solves the intermittent failure is calling context.close(). That means each test method is cleaning up the context. I'm concerned what was happening before is a race condition. Test method #1 uses a RemoteContext, and then it gets closed by the gc. There's a race with the next test method #2 creating a new RemoteContext using the same underlying endpoint (which is cached and would be reused by the 2nd test if the 1st hasn't cleaned up yet.) Looking at the code in RemoteContext EndpointCache I don't see how this could happen though.
                
      was (Author: brian.stansberry):
    When this fails, is it ever the first test executed that fails, or is it always the 2nd or third?

I'm concerned this patch is covering over a race condition.

If you look at your trace logging, you can see that the DEBUG stack trace shows the javax.naming.InitialContext has a ref to the RemoteContext. The test execution stack will also have a reference, even without your patch. Either one of these references is sufficient to keep the RemoteContext from being gc'd, unless the JVM is fundamentally broken (unlikely in the extreme.) 

The patch creating a local variable reference to the RemoteContext is just another form of reference on the stack and shouldn't be preventing the RemoteContext being gc'd, since it shouldn't be gc'd anyway.

I suspect what the patch is really doing that solves the intermittent failure is calling context.close(). That means each test method is cleaning up the context. I'm concerned what was happening before is a race condition. Test method #1 uses a RemoteContext, and then it gets closed by the gc. There's a race with the next test method #2 creating a new RemoteContext using the same underlying endpoint (which is cached and would be reused by the 2nd test if the 1st hasn't cleaned up yet.) Looking at the code in RemoteContext EndpointCache I don't see how this could happen though.
                  
> RemoteNamingTestCase intermittently fails on IBM JDK due to org.jboss.remoting3.NotOpenException: Endpoint is not open
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AS7-5967
>                 URL: https://issues.jboss.org/browse/AS7-5967
>             Project: Application Server 7
>          Issue Type: Feature Request
>          Components: Naming, Test Suite
>    Affects Versions: 7.1.3.Final (EAP), 7.2.0.Alpha1
>         Environment: IBM JDK 6
> IBM JDK 7
>            Reporter: Ivo Studensky
>            Assignee: Ivo Studensky
>
> RemoteNamingTestCase intermittently fails when running on IBM JDK. According to logs the remoting channel had been closed before the endpoint tried to connect to it. Unfortunately, when I was trying to debug this issue the tests always nicely passed.
> test.log snippet:
> {noformat}
> 13:16:31,115 DEBUG [org.xnio.nio] (Remoting "config-based-naming-client-endpoint" read-1) Started channel thread 'Remoting "config-based-naming-client-endpoint" read-1', selector sun.nio.ch.EPollSelectorImpl at 345642e1
> 13:16:31,115 DEBUG [org.xnio.nio] (Remoting "config-based-naming-client-endpoint" write-1) Started channel thread 'Remoting "config-based-naming-client-endpoint" write-1', selector sun.nio.ch.EPollSelectorImpl at 1dc68cf2
> 13:16:31,121 DEBUG [org.jboss.naming.remote.client.InitialContextFactory] (main) jboss.naming.client.connect.options. has the following options {org.xnio.Options.SASL_POLICY_NOPLAINTEXT=>false}
> 13:16:31,191 ERROR [org.jboss.naming.remote.protocol.v1.RemoteNamingStoreV1] (Remoting "config-based-naming-client-endpoint" task-1) Channel end notification received, closing channel Channel ID d1f17196 (outbound) of Remoting connection fd3dcedc to /127.0.0.1:4447
> 13:16:31,204 DEBUG [org.jboss.naming.remote.client.HaRemoteNamingStore] (main) Failed to connect to server remote://127.0.0.1:4447: org.jboss.remoting3.NotOpenException: Endpoint is not open
> 	at org.jboss.remoting3.EndpointImpl.resourceUntick(EndpointImpl.java:182)
> 	at org.jboss.remoting3.EndpointImpl.doConnect(EndpointImpl.java:261)
> 	at org.jboss.remoting3.EndpointImpl.doConnect(EndpointImpl.java:251)
> 	at org.jboss.remoting3.EndpointImpl.connect(EndpointImpl.java:349)
> 	at org.jboss.remoting3.EndpointImpl.connect(EndpointImpl.java:333)
> 	at org.jboss.naming.remote.client.EndpointCache$EndpointWrapper.connect(EndpointCache.java:105)
> 	at org.jboss.naming.remote.client.HaRemoteNamingStore.failOverSequence(HaRemoteNamingStore.java:179)
> 	at org.jboss.naming.remote.client.HaRemoteNamingStore.namingOperation(HaRemoteNamingStore.java:117)
> 	at org.jboss.naming.remote.client.HaRemoteNamingStore.lookup(HaRemoteNamingStore.java:223)
> 	at org.jboss.naming.remote.client.RemoteContext.lookup(RemoteContext.java:79)
> 	at org.jboss.naming.remote.client.RemoteContext.lookup(RemoteContext.java:83)
> 	at javax.naming.InitialContext.lookup(InitialContext.java:422)
> 	at org.jboss.as.test.integration.naming.remote.simple.RemoteNamingTestCase.testRemoteLookup(RemoteNamingTestCase.java:74)
> {noformat}
> server.log snippet:
> {noformat}
> 13:16:31,025 INFO  [org.jboss.as.server] (management-handler-thread - 3) JBAS018559: Deployed "test.jar"
> 13:16:31,163 DEBUG [org.jboss.naming.remote.server.RemoteNamingService] (Remoting "thinkpax" task-3) Channel Opened - Channel ID 51f17196 (inbound) of Remoting connection b9da2788 to /127.0.0.1:46866
> 13:16:31,176 DEBUG [org.jboss.naming.remote.server.RemoteNamingService] (Remoting "thinkpax" task-4) Chosen version 0x01
> 13:16:31,189 DEBUG [org.jboss.naming.remote.server.RemoteNamingService] (Remoting "thinkpax" read-1) Channel Channel ID 51f17196 (inbound) of Remoting connection b9da2788 to /127.0.0.1:46866 closed.
> 13:16:31,193 INFO  [org.jboss.as.naming] (Remoting "thinkpax" task-1) JBAS011806: Channel end notification received, closing channel Channel ID 51f17196 (inbound) of Remoting connection b9da2788 to null
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


More information about the jboss-jira mailing list