[JBoss JIRA] (WFLY-9985) NPE in server log when Artemis trace logging is enabled
by Miroslav Novak (JIRA)
[ https://issues.jboss.org/browse/WFLY-9985?page=com.atlassian.jira.plugin.... ]
Miroslav Novak commented on WFLY-9985:
--------------------------------------
This issue is causing OutOfMemoryError for "Too many open files" in our tests. There is leak on open files as large message files does not close properly when this happens.
> NPE in server log when Artemis trace logging is enabled
> -------------------------------------------------------
>
> Key: WFLY-9985
> URL: https://issues.jboss.org/browse/WFLY-9985
> Project: WildFly
> Issue Type: Bug
> Components: JMS
> Reporter: Miroslav Novak
> Assignee: Jeff Mesnil
>
> Artemis master (95b7438e7a7661692d5b78be944d05e254df9067) contains issue when trace logging is enabled.
> If large message is sent and Artemis trace logs are enabled then following NPE is logged in server log:
> {code}
> 09:42:14,005 WARN [org.apache.activemq.artemis.core.message.impl.CoreMessage] (default I/O-9) Error creating String for message: : java.lang.NullPointerException
> at org.apache.activemq.artemis.core.message.impl.CoreMessage.encode(CoreMessage.java:584)
> at org.apache.activemq.artemis.core.message.impl.CoreMessage.checkEncode(CoreMessage.java:248)
> at org.apache.activemq.artemis.core.message.impl.CoreMessage.getEncodeSize(CoreMessage.java:647)
> at org.apache.activemq.artemis.core.message.impl.CoreMessage.getPersistentSize(CoreMessage.java:1157)
> at org.apache.activemq.artemis.core.message.impl.CoreMessage.toString(CoreMessage.java:1132)
> at java.lang.String.valueOf(String.java:2994) [rt.jar:1.8.0_131]
> at java.lang.StringBuilder.append(StringBuilder.java:131) [rt.jar:1.8.0_131]
> at org.apache.activemq.artemis.core.protocol.core.impl.wireformat.SessionSendLargeMessage.toString(SessionSendLargeMessage.java:73)
> at java.lang.String.valueOf(String.java:2994) [rt.jar:1.8.0_131]
> at java.lang.StringBuilder.append(StringBuilder.java:131) [rt.jar:1.8.0_131]
> at org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl.bufferReceived(RemotingConnectionImpl.java:368)
> at org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl$DelegatingBufferHandler.bufferReceived(RemotingServiceImpl.java:646)
> at org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler.channelRead(ActiveMQChannelHandler.java:68)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
> at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
> at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297)
> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
> at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
> at org.xnio.netty.transport.AbstractXnioSocketChannel$ReadListener.handleEvent(AbstractXnioSocketChannel.java:443)
> at org.xnio.netty.transport.AbstractXnioSocketChannel$ReadListener.handleEvent(AbstractXnioSocketChannel.java:379)
> at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92) [xnio-api-3.6.1.Final.jar:3.6.1.Final]
> at org.xnio.conduits.ReadReadyHandler$ChannelListenerHandler.readReady(ReadReadyHandler.java:66) [xnio-api-3.6.1.Final.jar:3.6.1.Final]
> at org.xnio.nio.NioSocketConduit.handleReady(NioSocketConduit.java:89) [xnio-nio-3.6.1.Final.jar:3.6.1.Final]
> at org.xnio.nio.WorkerThread.run(WorkerThread.java:591) [xnio-nio-3.6.1.Final.jar:3.6.1.Final]
> {code}
> Currently it appears that it has not impact on functionality but NPEs are flooding server log.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
8 years, 3 months
[JBoss JIRA] (WFLY-9890) "ISPN000476: Timed out waiting for responses for request X from node Y" immediately after node Y rejoins the cluster (failover)
by Stuart Douglas (JIRA)
[ https://issues.jboss.org/browse/WFLY-9890?page=com.atlassian.jira.plugin.... ]
Stuart Douglas commented on WFLY-9890:
--------------------------------------
There was also a failure on a master-ignore job that may be related: https://ci.wildfly.org/viewLog.html?buildId=94131&tab=buildResultsDiv&bui...
> "ISPN000476: Timed out waiting for responses for request X from node Y" immediately after node Y rejoins the cluster (failover)
> -------------------------------------------------------------------------------------------------------------------------------
>
> Key: WFLY-9890
> URL: https://issues.jboss.org/browse/WFLY-9890
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 12.0.0.Beta1
> Reporter: Michal Vinkler
> Assignee: Paul Ferraro
> Priority: Critical
> Fix For: 13.0.0.Beta1
>
>
> Seen in SF failover tests - scenarios:
> failover-http-granular-shutdown-repl-sync (replication granularity is set to ATTRIBUTE level)
> failover-http-session-jvmkill-repl-sync (replication granularity is set to SESSION level)
> Description: In the failover test, each server is shut down and after some time it rejoins the cluster. After joining the cluster again, the load is distributed to this node. It seems that first request for some session handled by that node produces ERROR in the log of some other server until the time another cluster topology change occurs.
> Stacktrace of the error:
> {code}
> [JBossINF] [0m[31m10:18:00,256 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (timeout-thread--p12-t1) ISPN000136: Error executing command PrepareCommand, writing keys [SessionCreationMetaDataKey(f25_wwXXpvtNC9QTHZUtSwhmEUmg-wI77RL_17b9), SessionAccessMetaDataKey(f25_wwXXpvtNC9QTHZUtSwhmEUmg-wI77RL_17b9), SessionAttributeKey(f25_wwXXpvtNC9QTHZUtSwhmEUmg-wI77RL_17b9[1])]: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 30250 from perf18
> [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:163)
> [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:86)
> [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:21)
> [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [JBossINF] at java.lang.Thread.run(Thread.java:748)
> {code}
> Client gets the stale data:
> {code}
> 2018/02/21 10:18:00:272 EST [WARN ][Runner - 16] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Error sampling data: <org.jboss.smartfrog.loaddriver.RequestProcessingException: Stale session data received. Expected 60, received 59, Runner: 16>
> org.jboss.smartfrog.loaddriver.RequestProcessingException: Stale session data received. Expected 60, received 59, Runner: 16
> at org.jboss.smartfrog.loaddriver.http.AbstractSerialNumberValidatorFactoryImpl$SerialNumberValidator.processRequest(AbstractSerialNumberValidatorFactoryImpl.java:133)
> at org.jboss.smartfrog.loaddriver.CompoundRequestProcessorFactoryImpl$CompoundRequestProcessor.processRequest(CompoundRequestProcessorFactoryImpl.java:52)
> at org.jboss.smartfrog.loaddriver.Runner.run(Runner.java:103)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> Example - timeline:
> {code}
> 4 servers are running, 2000 clients, no issue in the beggining
> 2018/02/21 10:16:43:234 EST [INFO ][TestController] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Node 0 (perf18) is down.
> - still no issues
> 2018/02/21 10:17:43:234 EST [INFO ][TestController] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Bringing back node 0 (perf18)
> 2018/02/21 10:17:55:611 EST [INFO ][TestController] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Node 0 (perf18) is up.
> 2018/02/21 10:18:00:272 EST [WARN ][Runner - 16] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Error sampling data: <org.jboss.smartfrog.loaddriver.RequestProcessingException: Stale session data received. Expected 60, received 59, Runner: 16>
> {code}
> perf20 started logging the errors meanwhile and stopped only after perf19 was shut down:
> {code}
> [JBossINF] [0m[0m10:17:50,227 INFO [org.infinispan.CLUSTER] (thread-19,ejb,perf20) ISPN000094: Received new cluster view for channel ejb: [perf19|5] (4) [perf19, perf20, perf21, perf18]
> [JBossINF] [0m[0m10:17:50,228 INFO [org.infinispan.CLUSTER] (thread-19,ejb,perf20) ISPN100000: Node perf18 joined the cluster
> [JBossINF] [0m[31m10:18:00,256 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (timeout-thread--p12-t1) ISPN000136: Error executing command PrepareCommand, writing keys [SessionCreationMetaDataKey(f25_wwXXpvtNC9QTHZUtSwhmEUmg-wI77RL_17b9), SessionAccessMetaDataKey(f25_wwXXpvtNC9QTHZUtSwhmEUmg-wI77RL_17b9), SessionAttributeKey(f25_wwXXpvtNC9QTHZUtSwhmEUmg-wI77RL_17b9[1])]: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 30250 from perf18
> [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:163)
> [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:86)
> [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:21)
> [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [JBossINF] at java.lang.Thread.run(Thread.java:748)
> [JBossINF]
> {code}
> There were no issues in this scenario in EAP 7.1.0.GA.
> Links:
> Clients: http://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/job/perflab_eap-7x-fa...
> perf18: http://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/job/perflab_eap-7x-fa...
> perf20: http://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/job/perflab_eap-7x-fa...
> Another test:
> perf19: http://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/job/perflab_eap-7x-fa...
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
8 years, 3 months
[JBoss JIRA] (WFLY-9990) module not found exception org.kohsuke.rngom
by Stuart Douglas (JIRA)
[ https://issues.jboss.org/browse/WFLY-9990?page=com.atlassian.jira.plugin.... ]
Stuart Douglas resolved WFLY-9990.
----------------------------------
Fix Version/s: (was: 12.0.0.Final)
Resolution: Rejected
There are no references to org.kohsuke.rngom in our module.xml files. Running 'grep -R kohsuke .' on a newly built WF upstream shows no matches.
> module not found exception org.kohsuke.rngom
> --------------------------------------------
>
> Key: WFLY-9990
> URL: https://issues.jboss.org/browse/WFLY-9990
> Project: WildFly
> Issue Type: Bug
> Components: JAXR
> Affects Versions: 10.0.0.Final, 12.0.0.Final
> Reporter: Amine Achouri
> Assignee: Kurt Stam
>
> before the version 9.0 final, the module org.kohsuke.rngom exist as Jboss or wildfly module. but with the newer versions of wildfly 9.0, 10, 11 and the versions of Jboss EAP 7.0 and 7.1, this module was replaced by the jar rngom-2.2.11.jbossorg-1, directly putted near jaxb-iml on the path "com/sun/xml/bind/", but what we remark is that the module.xml com/sun/xml/bind/main still referencing org.kohsuke.rngom as module and this what generate the module not found exception:
> the solution, to avoid such module not found exception, that worked with me was to delete the reference of org.kohsuke.rngom as a module and this work fine since it will be implicitly used by jaxbimpl.
> need to validate this.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
8 years, 3 months