[
https://issues.jboss.org/browse/JGRP-2135?page=com.atlassian.jira.plugin....
]
Zoltan Farkas edited comment on JGRP-2135 at 11/23/16 10:15 AM:
----------------------------------------------------------------
Another minor issue in TcpConnection, line 269 (my remarks in the code are prefixed with "Z:"):
{code}
public void run() {
    Throwable t=null;
    while(canRun()) {
        Buffer data=null;
        try {
            data=send_queue.take();
            if(data.hashCode() == termination.hashCode()) // Z: data is being de-referenced
                break;
        }
        catch(InterruptedException e) {
            t=e;
            break;
        }
        // Z: data cannot be null, since it is already de-referenced previously.
        //    also send_queue.take never returns null...
        if(data != null) {
            try {
                _send(data.getBuf(), 0, data.getLength(), false, send_queue.isEmpty());
            }
            catch(Throwable ignored) { // Z: unrecoverable exceptions should probably not be ignored.
                t=ignored;
            }
        }
    }
    server.notifyConnectionClosed(TcpConnection.this, String.format("%s: %s", getClass().getSimpleName(),
                                                                    t != null? t.toString() : "normal stop"));
}
{code}
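To make the three remarks concrete, here is a minimal, self-contained sketch of a sender loop that compares the termination marker by identity, drops the redundant null check, and lets `Error`s propagate instead of catching `Throwable`. This is not the JGroups code: `SenderLoopSketch`, `Sink`, `TERMINATION` and `drain` are hypothetical names, and a plain `byte[]` stands in for `Buffer`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class SenderLoopSketch {
    /** Hypothetical sentinel; compared by identity, not by hashCode(). */
    public static final byte[] TERMINATION = new byte[0];

    /** Hypothetical stand-in for the real socket write. */
    public interface Sink {
        void send(byte[] data) throws Exception;
    }

    /** Drains the queue until the sentinel arrives; returns why the loop stopped. */
    public static String drain(BlockingQueue<byte[]> queue, Sink sink) {
        Throwable cause = null;
        while (true) {
            byte[] data;
            try {
                data = queue.take(); // blocks; never returns null
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                cause = e;
                break;
            }
            if (data == TERMINATION) // identity check on the sentinel
                break;
            try {
                sink.send(data);
            } catch (Exception e) {
                cause = e; // ordinary failure: remember it and keep draining
            }
            // Errors such as OutOfMemoryError are deliberately not caught here.
        }
        return cause != null ? cause.toString() : "normal stop";
    }

    public static void main(String[] args) {
        BlockingQueue<byte[]> q = new LinkedBlockingQueue<>();
        q.add(new byte[] {1, 2, 3});
        q.add(TERMINATION);
        List<Integer> sizes = new ArrayList<>();
        System.out.println(drain(q, d -> sizes.add(d.length)) + ", sent " + sizes);
    }
}
```

Whether a failed send should end the loop rather than be remembered is a separate design decision; the sketch keeps the original behaviour for ordinary exceptions and only changes what happens to `Error`s.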
OOM with JGroups 3.6.11.
------------------------
Key: JGRP-2135
URL: https://issues.jboss.org/browse/JGRP-2135
Project: JGroups
Issue Type: Bug
Affects Versions: 3.6.11
Reporter: Zoltan Farkas
Assignee: Bela Ban
Fix For: 3.6.12, 4.0
We run our JVMs with -XX:OnOutOfMemoryError="kill -9 %p" and have been
experiencing OOMs fairly often. The OOMs happen at:
{code}
Object / Stack Frame | Name | Shallow Heap | Retained Heap | Context Class Loader | Is Daemon
---------------------------------------------------------------------------------------------
java.lang.Thread @ 0x81bdf838 | Connection.Receiver [144.77.77.53:50363 - 144.77.77.53:50363],sis-cluster.service,prodpmwsv5-6461 | 120 | 456 | sun.misc.Launcher$AppClassLoader @ 0x800175a8 | false
|- at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
|- at org.jgroups.blocks.cs.TcpConnection$Receiver.run()V (TcpConnection.java:310)
|- at java.lang.Thread.run()V (Thread.java:745)
---------------------------------------------------------------------------------------------
{code}
The code where it happens is in TcpConnection.java:
{code}
while(canRun()) {
    try {
        int len=in.readInt();
        if(buffer == null || buffer.length < len)
            buffer=new byte[len];
        in.readFully(buffer, 0, len);
        updateLastAccessed();
        server.receive(peer_addr, buffer, 0, len);
    }
    catch(OutOfMemoryError mem_ex) {
        t=mem_ex;
        break; // continue;
    }
    catch(IOException io_ex) {
        t=io_ex;
        break;
    }
    catch(Throwable e) {
    }
}
{code}
The failure occurs when allocating: buffer=new byte[len];
It looks to me as if some invalid, very large value is received as the length,
and the process OOMs when allocating a huge byte array.
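As a hedged illustration of how a bogus length could arise: `readInt()` interprets the next four bytes as a big-endian integer, so any data that is not a JGroups frame yields an arbitrary value. The stray HTTP request below is purely an assumption made for the example, since the issue does not say what was actually received; `BogusLengthDemo` and `firstInt` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BogusLengthDemo {
    /** Reads the first four bytes the way the receiver loop does: as a big-endian int. */
    public static int firstInt(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stray input: the start of an HTTP request, not a JGroups frame.
        byte[] stray = "GET / HTTP/1.1".getBytes(StandardCharsets.US_ASCII);
        System.out.println(firstInt(stray)); // 0x47455420 = 1195725856, about 1.1 GiB
    }
}
```

A length of that size would already exceed a 1 GB heap in a single `new byte[len]`.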
Running the JVMs without kill-on-OOM would make this issue "disappear", in the
sense that it is swallowed by:
{code}
catch(OutOfMemoryError mem_ex) {
    t=mem_ex;
    break; // continue;
}
{code}
Handling OutOfMemoryError is a strange implementation choice. Instead, a size
limit should be employed to protect against receiving invalid sizes.
My heap limit is 1 GB and my heap dumps are 50 MB, so the attempted allocation
size is huge.
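A minimal sketch of the suggested size limit, assuming a simple length-prefixed frame as in the loop above. `BoundedFrameReader`, `readFrame`, `accepts` and the 8 MB `MAX_FRAME_LENGTH` are hypothetical and are not JGroups API or settings; the point is only that the length is validated before any allocation.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class BoundedFrameReader {
    /** Hypothetical limit for the example; not a JGroups setting. */
    public static final int MAX_FRAME_LENGTH = 8 * 1024 * 1024;

    /** Reads one length-prefixed frame, rejecting the length before allocating. */
    public static byte[] readFrame(DataInputStream in, int maxLength) throws IOException {
        int len = in.readInt();
        if (len < 0 || len > maxLength)
            throw new IOException("invalid frame length " + len + " (limit " + maxLength + ")");
        byte[] buffer = new byte[len];
        in.readFully(buffer, 0, len);
        return buffer;
    }

    /** Demo helper: true if the bytes form one complete frame within the limit. */
    public static boolean accepts(byte[] wire, int maxLength) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            readFrame(in, maxLength);
            return true;
        } catch (IOException e) {
            return false; // bad length or truncated frame
        }
    }

    public static void main(String[] args) {
        byte[] good = {0, 0, 0, 2, 'h', 'i'};   // length 2, then two payload bytes
        byte[] bad = {0x47, 0x45, 0x54, 0x20};  // decodes to a length of about 1.1 GiB
        System.out.println(accepts(good, MAX_FRAME_LENGTH));
        System.out.println(accepts(bad, MAX_FRAME_LENGTH));
    }
}
```

Rejecting the frame with an `IOException` fits the existing loop, which already records an `IOException` and breaks, so no `OutOfMemoryError` handling would be needed for this case.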