[JBoss JIRA] (WFCORE-2959) Export a low MALLOC_ARENA_MAX value in standalone.conf and domain.conf
by Tomaz Cerar (JIRA)
[ https://issues.jboss.org/browse/WFCORE-2959?page=com.atlassian.jira.plugi... ]
Tomaz Cerar updated WFCORE-2959:
--------------------------------
Comment: was deleted
(was: [~andy.miller] reading http://man7.org/linux/man-pages/man3/mallopt.3.html it seems that your proposed change
{noformat} export MALLOC_ARENA_MAX=${MALLOC_ARENA_MAX:-1} {noformat}
is wrong, as the default value is 0, not -1.
Does -1 have any special meaning?)
> Export a low MALLOC_ARENA_MAX value in standalone.conf and domain.conf
> ----------------------------------------------------------------------
>
> Key: WFCORE-2959
> URL: https://issues.jboss.org/browse/WFCORE-2959
> Project: WildFly Core
> Issue Type: Enhancement
> Components: Scripts
> Reporter: Brian Stansberry
> Assignee: Tomaz Cerar
>
> This is a task that came out of research done by [~andy.miller] on WF/EAP memory use in cloud environments.
> Our launch scripts for *nix environments should set MALLOC_ARENA_MAX, e.g.
> {code}
> export MALLOC_ARENA_MAX=${MALLOC_ARENA_MAX:-1}
> {code}
> See http://info.prelert.com/blog/java-8-and-virtual-memory-on-linux for background.
> The default glibc setting of allowing up to 128 malloc arenas makes very little sense for a Java application, since the JVM asks the OS for a few large memory allocations and then manages those areas itself. Leaving that default in place results in a very large virtual memory size for the VM. It's virtual, not resident, memory, so controlling it is largely a matter of presenting a better image to people who don't understand the distinction. But as the linked blog post's discussion of mlockall() and Elasticsearch mentions, there may be some more concrete implications as well.
> The initial recommendation was to set the value to 1, but this bears some research/discussion. For example, Hadoop uses 4 (https://issues.apache.org/jira/browse/HADOOP-7154), and a bit of googling for "MALLOC_ARENA_MAX java" turns up entries mentioning 2 or 3.
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
8 years, 10 months
[JBoss JIRA] (WFCORE-2959) Export a low MALLOC_ARENA_MAX value in standalone.conf and domain.conf
by Tomaz Cerar (JIRA)
[ https://issues.jboss.org/browse/WFCORE-2959?page=com.atlassian.jira.plugi... ]
Tomaz Cerar commented on WFCORE-2959:
-------------------------------------
[~andy.miller] reading http://man7.org/linux/man-pages/man3/mallopt.3.html it seems that your proposed change
{noformat} export MALLOC_ARENA_MAX=${MALLOC_ARENA_MAX:-1} {noformat}
is wrong, as the default value is 0, not -1.
Does -1 have any special meaning?
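For reference, `${MALLOC_ARENA_MAX:-1}` is POSIX default-value parameter expansion rather than an assignment of -1: the `:-` operator substitutes 1 only when the variable is unset or empty. A minimal sketch:

```shell
# ${VAR:-word} expands to $VAR if it is set and non-empty,
# otherwise to "word" -- here the fallback is 1, not -1.
unset MALLOC_ARENA_MAX
echo "${MALLOC_ARENA_MAX:-1}"    # prints 1 (fallback used)

MALLOC_ARENA_MAX=4
echo "${MALLOC_ARENA_MAX:-1}"    # prints 4 (existing value preserved)
```

So the proposed line keeps any value the user already exported and only defaults to 1 otherwise.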
> Export a low MALLOC_ARENA_MAX value in standalone.conf and domain.conf
> ----------------------------------------------------------------------
>
> Key: WFCORE-2959
> URL: https://issues.jboss.org/browse/WFCORE-2959
> Project: WildFly Core
> Issue Type: Enhancement
> Components: Scripts
> Reporter: Brian Stansberry
> Assignee: Tomaz Cerar
>
> This is a task that came out of research done by [~andy.miller] on WF/EAP memory use in cloud environments.
> Our launch scripts for *nix environments should set MALLOC_ARENA_MAX, e.g.
> {code}
> export MALLOC_ARENA_MAX=${MALLOC_ARENA_MAX:-1}
> {code}
> See http://info.prelert.com/blog/java-8-and-virtual-memory-on-linux for background.
> The default glibc setting of allowing up to 128 malloc arenas makes very little sense for a Java application, since the JVM asks the OS for a few large memory allocations and then manages those areas itself. Leaving that default in place results in a very large virtual memory size for the VM. It's virtual, not resident, memory, so controlling it is largely a matter of presenting a better image to people who don't understand the distinction. But as the linked blog post's discussion of mlockall() and Elasticsearch mentions, there may be some more concrete implications as well.
> The initial recommendation was to set the value to 1, but this bears some research/discussion. For example, Hadoop uses 4 (https://issues.apache.org/jira/browse/HADOOP-7154), and a bit of googling for "MALLOC_ARENA_MAX java" turns up entries mentioning 2 or 3.
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
[JBoss JIRA] (WFLY-8929) Race condition if timers overlap due to long running execution and short schedules if database persistence is used
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/WFLY-8929?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on WFLY-8929:
-----------------------------------------------
wfink(a)redhat.com changed the Status of [bug 1461416|https://bugzilla.redhat.com/show_bug.cgi?id=1461416] from ASSIGNED to POST
> Race condition if timers overlap due to long running execution and short schedules if database persistence is used
> ------------------------------------------------------------------------------------------------------------------
>
> Key: WFLY-8929
> URL: https://issues.jboss.org/browse/WFLY-8929
> Project: WildFly
> Issue Type: Bug
> Components: EJB
> Environment: Configure DB persistence for timers, as file persistence does not have a persistence check in shouldRun to lock the timer execution.
> Reporter: Wolf-Dieter Fink
> Assignee: Wolf-Dieter Fink
> Attachments: server-extract.log, server1.log
>
>
> If timers (here a calendar timer) run longer than scheduled, or the schedule/processing gets stuck due to a thread or CPU bottleneck, it is possible for the persistence updates to overlap.
> The issue seems to be that task(1) tries to finish the timer while task(2) is about to start and sees the concurrency.
> The DB is updated with the 'old' next timeout, but the internal Timer instance is updated with the next possible schedule, due to a race condition between the two threads updating the object.
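As an illustration only (hypothetical classes, not WildFly's actual timer code), the lost-update pattern described above can be sketched with a compare-and-set guard: a stale task(2) that still holds the old next timeout fails its update instead of clobbering task(1)'s newer value:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the race: two tasks both try to advance a timer's
// nextTimeout. An unconditional write lets a stale writer win; a
// compare-and-set only succeeds if the value is still the one that task read.
final class TimerState {
    private final AtomicLong nextTimeout = new AtomicLong(1000);

    // Guarded update: returns false if another task advanced the timer first.
    boolean advance(long expectedPrevious, long next) {
        return nextTimeout.compareAndSet(expectedPrevious, next);
    }

    long get() { return nextTimeout.get(); }

    public static void main(String[] args) {
        TimerState t = new TimerState();
        long seen1 = t.get();                        // task(1) reads 1000
        long seen2 = t.get();                        // task(2) reads 1000 concurrently
        System.out.println(t.advance(seen1, 2000));  // true: first writer wins
        System.out.println(t.advance(seen2, 3000));  // false: stale update rejected
        System.out.println(t.get());                 // 2000
    }
}
```

The same guard would have to cover both the DB row and the in-memory Timer instance to avoid the divergence described in the report.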
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
[JBoss JIRA] (JGRP-1958) RequestCorrelator "channel is not connected" error during shutdown
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/JGRP-1958?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on JGRP-1958:
-----------------------------------------------
Jiří Bílek <jbilek(a)redhat.com> changed the Status of [bug 1399195|https://bugzilla.redhat.com/show_bug.cgi?id=1399195] from VERIFIED to ASSIGNED
> RequestCorrelator "channel is not connected" error during shutdown
> ------------------------------------------------------------------
>
> Key: JGRP-1958
> URL: https://issues.jboss.org/browse/JGRP-1958
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.2.12
> Reporter: Dennis Reed
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 3.6.5
>
>
> Error logged during shutdown of a channel due to RequestCorrelator failing to send a reply:
> ERROR [org.jgroups.protocols.UNICAST2] (OOB-17,shared=tcp) couldn't deliver OOB message [dst: server1/web, src: server2/web (4 headers), size=62 bytes, flags=OOB|DONT_BUNDLE|RSVP]: java.lang.IllegalStateException: channel is not connected
> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:617) [jgroups-3.2.12.Final-redhat-1.jar:3.2.12.Final-redhat-1]
> at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:544) [jgroups-3.2.12.Final-redhat-1.jar:3.2.12.Final-redhat-1]
> at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:391) [jgroups-3.2.12.Final-redhat-1.jar:3.2.12.Final-redhat-1]
> at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:249) [jgroups-3.2.12.Final-redhat-1.jar:3.2.12.Final-redhat-1]
> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:600) [jgroups-3.2.12.Final-redhat-1.jar:3.2.12.Final-redhat-1]
> [incoming JGroups message]
> It appears to just be a timing issue between shutdown of the channel and RequestCorrelator processing the message, which triggers a response message.
> It would be good to either avoid triggering the exception in the first place, or suppress the error log during shutdown.
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
[JBoss JIRA] (WFLY-8939) Clustering creating a large number of ServiceName instances
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFLY-8939?page=com.atlassian.jira.plugin.... ]
Brian Stansberry updated WFLY-8939:
-----------------------------------
Attachment: servicenameparse.txt
> Clustering creating a large number of ServiceName instances
> -----------------------------------------------------------
>
> Key: WFLY-8939
> URL: https://issues.jboss.org/browse/WFLY-8939
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Reporter: Brian Stansberry
> Assignee: Paul Ferraro
> Attachments: servicenameparse.txt
>
>
> A downside of the shift to capability-driven service wiring has been a significant increase in memory use related to ServiceName instances. A ServiceName is basically a linked list of string wrapper objects, with the wrapper objects sharable between lists. In the old way of hand-creating ServiceNames, the wrapper objects would typically be shared widely, since a new name would be created by appending a new wrapper to the last element in an existing chain.
> With capabilities, this breaks down, as ServiceNames are created by parsing a dot-separated string. Names created that way no longer share elements, and multiple instances of strings like "org" and "wildfly" end up being kept in memory.
> https://issues.jboss.org/browse/WFCORE-2895 is about improving this in the kernel. But analyzing the result of my effort there didn't show as big an impact as I expected. Investigating why, I see a lot of uses of ServiceName.parse in the clustering code, which will have the same effect. I'll attach the results of "git grep ServiceName.parse" against the full WildFly code base.
> I don't know how easily this can be resolved. The kernel may need to provide a capability-name-to-ServiceName utility that can do optimizations.
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
[JBoss JIRA] (WFLY-8939) Clustering creating a large number of ServiceName instances
by Brian Stansberry (JIRA)
Brian Stansberry created WFLY-8939:
--------------------------------------
Summary: Clustering creating a large number of ServiceName instances
Key: WFLY-8939
URL: https://issues.jboss.org/browse/WFLY-8939
Project: WildFly
Issue Type: Bug
Components: Clustering
Reporter: Brian Stansberry
Assignee: Paul Ferraro
A downside of the shift to capability-driven service wiring has been a significant increase in memory use related to ServiceName instances. A ServiceName is basically a linked list of string wrapper objects, with the wrapper objects sharable between lists. In the old way of hand-creating ServiceNames, the wrapper objects would typically be shared widely, since a new name would be created by appending a new wrapper to the last element in an existing chain.
With capabilities, this breaks down, as ServiceNames are created by parsing a dot-separated string. Names created that way no longer share elements, and multiple instances of strings like "org" and "wildfly" end up being kept in memory.
https://issues.jboss.org/browse/WFCORE-2895 is about improving this in the kernel. But analyzing the result of my effort there didn't show as big an impact as I expected. Investigating why, I see a lot of uses of ServiceName.parse in the clustering code, which will have the same effect. I'll attach the results of "git grep ServiceName.parse" against the full WildFly code base.
I don't know how easily this can be resolved. The kernel may need to provide a capability-name-to-ServiceName utility that can do optimizations.
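One possible shape for such a utility (a hypothetical sketch, not WildFly's or JBoss MSC's actual API): memoize parsed names so that names built from dot-separated strings reuse their prefix chains, the way hand-appended names did:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a ServiceName-like linked list of segments where a
// memoizing parse() shares prefix nodes, so "org.wildfly.clustering.a" and
// "org.wildfly.clustering.b" reuse the same "org" -> "wildfly" -> "clustering"
// chain instead of each holding fresh String instances.
final class Name {
    final Name parent;       // shared prefix chain (null for the root segment)
    final String segment;    // this element's string

    private Name(Name parent, String segment) {
        this.parent = parent;
        this.segment = segment;
    }

    // Cache keyed by the full dotted form; each entry reuses its parent entry.
    // (get/putIfAbsent rather than computeIfAbsent, which forbids the
    // recursive parent lookup on the same map.)
    private static final Map<String, Name> CACHE = new ConcurrentHashMap<>();

    static Name parse(String dotted) {
        Name existing = CACHE.get(dotted);
        if (existing != null) return existing;
        int i = dotted.lastIndexOf('.');
        Name created = (i < 0)
            ? new Name(null, dotted)
            : new Name(parse(dotted.substring(0, i)), dotted.substring(i + 1));
        Name raced = CACHE.putIfAbsent(dotted, created);
        return raced != null ? raced : created;
    }

    public static void main(String[] args) {
        Name a = Name.parse("org.wildfly.clustering.a");
        Name b = Name.parse("org.wildfly.clustering.b");
        // Both names share the identical prefix node, not just equal strings.
        System.out.println(a.parent == b.parent);   // true
    }
}
```

A real implementation would also need an eviction policy, since an unbounded cache trades the duplicate-string problem for a retention one.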
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
[JBoss JIRA] (WFCORE-2961) Server booting with unconfigured https management shows https port as -1
by Ken Wills (JIRA)
[ https://issues.jboss.org/browse/WFCORE-2961?page=com.atlassian.jira.plugi... ]
Ken Wills commented on WFCORE-2961:
-----------------------------------
Yeah, most likely. I should have also pointed out that it may have only been tested against standalone.
The property being used is -Djboss.management.http.port=0. I need to have a look at where it actually interprets the 0 as special, and see if that's also true for -Djboss.management.https.port=0 etc.
> Server booting with unconfigured https management shows https port as -1
> ------------------------------------------------------------------------
>
> Key: WFCORE-2961
> URL: https://issues.jboss.org/browse/WFCORE-2961
> Project: WildFly Core
> Issue Type: Task
> Reporter: Ken Wills
> Assignee: Ken Wills
>
> See commit: 47984e987dff4cf218fde952a8bc28a75ad71f31 in core, changes from WFCORE-2771.
> This may work correctly in full, so it's a minor issue in core, but this needs to be checked.
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
[JBoss JIRA] (JGRP-2172) Non-blocking flow control
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/JGRP-2172?page=com.atlassian.jira.plugin.... ]
Radim Vansa commented on JGRP-2172:
-----------------------------------
But I would prefer this feature to become UFC2/MFC2 or another fancy protocol name so we can easily check for regressions in the implementation.
> Non-blocking flow control
> -------------------------
>
> Key: JGRP-2172
> URL: https://issues.jboss.org/browse/JGRP-2172
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Fix For: 4.0.4
>
>
> Sending a message through FlowControl (UFC, MFC) should not block if {{Message.Flag.NB_FC}} (non-blocking flow control) is set.
> Instead, the message should be added to a queue (bounded if {{max_size}} > 0, else unbounded). The max queue size is given in bytes, so we can estimate what the memory penalty for reaching that size would be (if bounded).
> The queued messages are sent when credits arrive. TBD: when credits arrive, should blocked threads or queued messages be released first?
> Non-blocking flow control can be used by both external and internal threads.
> If the queue is unbounded, then it is the responsibility of the application (e.g. Infinispan) to make sure the queue doesn't grow to an untenable size.
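A minimal sketch of the queueing behavior described above (hypothetical names, not JGroups' actual UFC/MFC API), which drains queued messages first when credits arrive, one possible answer to the TBD:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of non-blocking flow control: send() never blocks.
// When credits are exhausted, the message is parked in a queue bounded by
// max_size bytes (0 = unbounded); replenish() models credits arriving and
// drains queued messages before new sends may proceed.
final class NonBlockingFlowControl {
    private long credits;
    private final long maxQueueBytes;   // 0 means unbounded
    private long queuedBytes;
    private final Deque<byte[]> queue = new ArrayDeque<>();

    NonBlockingFlowControl(long credits, long maxQueueBytes) {
        this.credits = credits;
        this.maxQueueBytes = maxQueueBytes;
    }

    // Returns true if sent or queued, false if the bounded queue would overflow.
    synchronized boolean send(byte[] msg) {
        if (queue.isEmpty() && msg.length <= credits) {
            credits -= msg.length;       // enough credit: send immediately
            return true;
        }
        if (maxQueueBytes > 0 && queuedBytes + msg.length > maxQueueBytes) {
            return false;                // bounded queue is full
        }
        queue.addLast(msg);              // park until credits arrive
        queuedBytes += msg.length;
        return true;
    }

    // Credits arrive: drain queued messages in FIFO order while credits last.
    synchronized void replenish(long amount) {
        credits += amount;
        while (!queue.isEmpty() && queue.peekFirst().length <= credits) {
            byte[] msg = queue.removeFirst();
            credits -= msg.length;
            queuedBytes -= msg.length;
        }
    }

    synchronized int queuedMessages() { return queue.size(); }

    public static void main(String[] args) {
        NonBlockingFlowControl fc = new NonBlockingFlowControl(10, 20);
        System.out.println(fc.send(new byte[8]));   // true: sent immediately
        System.out.println(fc.send(new byte[8]));   // true: queued (credits low)
        System.out.println(fc.send(new byte[8]));   // true: queued
        System.out.println(fc.send(new byte[8]));   // false: would exceed 20 bytes
        fc.replenish(20);                            // credits arrive, queue drains
        System.out.println(fc.queuedMessages());    // 0
    }
}
```

Design note: draining the queue first preserves FIFO order for non-blocking senders; releasing blocked threads first would instead favor the latency of synchronous senders.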
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)