[ https://issues.jboss.org/browse/JGRP-1432?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-1432:
--------------------------------
So are you saying that there is no memory issue and the heap isn't exhausted? In that
case, there's a problem creating new threads, either because the ulimit for the
process was exhausted or because of some other restriction (modified SLES?).
Re your config: STABLE.max_bytes should not be 0, or else you'll run out of memory if
you send a lot of messages. VIEW_SYNC can be removed, too.
OutOfMemoryError in GMS
-----------------------
Key: JGRP-1432
URL: https://issues.jboss.org/browse/JGRP-1432
Project: JGroups
Issue Type: Bug
Affects Versions: 2.12.2
Environment: Modified SLES
Reporter: Peter Nerg
Assignee: Bela Ban
Attachments: tcp-fileping.xml
When running in a cluster with only two nodes, we every now and then see JGroups fail
to start a thread due to an OOM.
The stack trace always points to the same place, which should rule out any other
part of the application.
Also, a heap dump taken immediately after the OOM shows no obvious cause for it.
It makes me wonder if there is a scenario where JGroups goes wild and starts to create
lots of threads.
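One way to test the "lots of threads" suspicion is to log the JVM's thread counts periodically and group live threads by name, so that a runaway pool would stand out before the OOM hits. The following is only a minimal sketch under stated assumptions: ThreadCensus, groupKey and countByGroup are hypothetical names, not part of JGroups, and the grouping assumes thread names shaped like the "OOB-1,null" and "Timer-2,<ADDR>" names seen in the log lines of this report. It uses only standard JDK calls and Java 6 compatible syntax.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical diagnostic helper; not part of JGroups.
public class ThreadCensus {

    // Reduces a thread name to a group key: drops everything from the first
    // comma on, then a trailing "-<digits>" counter, so "OOB-1,null" -> "OOB".
    public static String groupKey(String threadName) {
        int comma = threadName.indexOf(',');
        String base = comma >= 0 ? threadName.substring(0, comma) : threadName;
        return base.replaceAll("-\\d+$", "");
    }

    // Counts the currently live threads per group key.
    public static Map<String, Integer> countByGroup() {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            String key = groupKey(t.getName());
            Integer old = counts.get(key);
            counts.put(key, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Live, peak and total-started counts as tracked by the JVM itself.
        System.out.println("live=" + mx.getThreadCount()
                + " peak=" + mx.getPeakThreadCount()
                + " started=" + mx.getTotalStartedThreadCount());
        for (Map.Entry<String, Integer> e : countByGroup().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

If the live and peak counts stay low while the OOM still occurs, that would point toward an OS-level limit on the process rather than JGroups creating threads without bound; if one group keeps growing, that group is the place to look.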
The stack trace looks like this (often a number of OOM exceptions in a row):
2012-02-21 08:56:52,679 [ OOB-1,null] ERROR [org.jgroups.protocols.TCP] failed handling incoming message
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:640)
at org.jgroups.protocols.pbcast.GMS$ViewHandler.start(GMS.java:1297)
at org.jgroups.protocols.pbcast.GMS$ViewHandler.add(GMS.java:1260)
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:801)
at org.jgroups.protocols.VIEW_SYNC.up(VIEW_SYNC.java:170)
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:246)
at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:703)
at org.jgroups.protocols.BARRIER.up(BARRIER.java:101)
at org.jgroups.protocols.FD.up(FD.java:275)
at org.jgroups.protocols.MERGE2.up(MERGE2.java:210)
at org.jgroups.protocols.Discovery.up(Discovery.java:294)
at org.jgroups.stack.Protocol.up(Protocol.java:413)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1109)
at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1665)
at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1647)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
The above stack trace is often preceded by the following printout:
2012-02-21 04:39:28,949 [ Timer-2,<ADDR>] WARN [org.jgroups.protocols.FILE_PING] failed reading 9875802e-272a-0bcc-d1db-466d80f188b2.node: removing it