[jboss-jira] [JBoss JIRA] (WFWIP-28) [Artemis 2.x upgrade] Unexpected crash of server in SOAK test

Thu Aug 9 09:26:01 EDT 2018

    [ https://issues.jboss.org/browse/WFWIP-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617058#comment-13617058 ] 

Francesco Nigro commented on WFWIP-28:
--------------------------------------

[~mnovak] I think it could depend on several factors: changes in the default configuration/behaviour of the newer Netty version, NIO now performing per-thread pooling (which could put more pressure on compaction), etc.
The point is that if something in the broker has changed to make it more resource hungry, finding the reason is not trivial: just consider how the GC works...
The new versions of the broker trigger far fewer minor GC pauses than 1.x, which means that native memory has far fewer chances to be cleaned up (a minor GC will clean up the native memory and file descriptors/sockets held by objects that have become unreachable), leading to higher peaks of resource utilisation.
Having fewer pauses is a pure optimisation that turns out to put more pressure on native resources, but I can't consider that an "issue" per se.
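
A minimal sketch of how this effect could be observed from inside a JVM, assuming the native memory in question is allocated through ByteBuffer.allocateDirect (whether the broker's Netty buffers are accounted for in that pool depends on the Netty configuration, so treat that as an assumption). The class name NativeMemoryProbe and its helper methods are hypothetical and not part of Artemis or WildFly; only the standard java.lang.management API is used.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Artemis or WildFly.
class NativeMemoryProbe {

    // Bytes currently accounted to the JVM's "direct" buffer pool,
    // i.e. native memory behind ByteBuffer.allocateDirect.
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1; // this JVM does not expose a pool with that name
    }

    // Sum of the collection counts reported by all collectors (young and old).
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) { // -1 means the collector does not report a count
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.printf("start:          direct=%d bytes, gcCount=%d%n",
                directMemoryUsed(), totalGcCount());

        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        System.out.printf("after allocate: direct=%d bytes (capacity %d)%n",
                directMemoryUsed(), buffer.capacity());

        // The buffer is now unreachable, but its native memory is only
        // released once a GC has noticed that and the cleanup has run.
        buffer = null;
        System.out.printf("after drop:     direct=%d bytes, gcCount=%d%n",
                directMemoryUsed(), totalGcCount());

        // System.gc() is only a hint; the native memory may or may not
        // have been released by the time the next line prints.
        System.gc();
        System.out.printf("after gc hint:  direct=%d bytes, gcCount=%d%n",
                directMemoryUsed(), totalGcCount());
    }
}
```

Sampling these two values periodically during the soak run on both broker versions would show whether peaks of direct memory usage line up with longer gaps between collections.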

> [Artemis 2.x upgrade] Unexpected crash of server in SOAK test
> -------------------------------------------------------------
>
>                 Key: WFWIP-28
>                 URL: https://issues.jboss.org/browse/WFWIP-28
>             Project: WildFly WIP
>          Issue Type: Bug
>          Components: Artemis
>            Reporter: Miroslav Novak
>            Assignee: Francesco Nigro
>            Priority: Blocker
>              Labels: feature-branch-blocker
>         Attachments: sosreport-rvaisdebug.asd-20180807101658.tar.xz
>
>
> After ~13 hours there is an unexpected crash of one server in the SOAK test. There is no error/warning in the logs.
> Test Scenario:
> * Start 2 servers 
> * Client sends messages to the input queue. Messages then go through:
> ** From one server to another through an MDB reading and sending them from a remote container through the resource adapter
> ** Messages are forwarded from one server to another over a JMS bridge and back over a Core bridge
> ** Messages have JMSReplyTo defined with a temporary queue, which is filled with responses for the client
> ** Messages are read from the destination with a stateless EJB and sent back to clients
> * Client reads the messages after they pass through all the soak modules.
> Pass Criteria: In the last step receiver consumes all messages sent by producer.
> Actual Result:
> After ~13 hours the 1st server suddenly crashes. There is no error/warning in the server logs.
> The issue was hit with Artemis 2.5.0 using https://github.com/jmesnil/wildfly/tree/WFLY-9407_upgrade_artemis_2.4.0_with_prefix (commit 51dd8102f103ccb0470a3cfc8713d3f9bdb1b65d)
