[jboss-jira] [JBoss JIRA] (JGRP-2135) OOM with JGroups 3.6.11.

Zoltan Farkas (JIRA) issues at jboss.org
Wed Nov 23 10:16:00 EST 2016


    [ https://issues.jboss.org/browse/JGRP-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13327208#comment-13327208 ] 

Zoltan Farkas edited comment on JGRP-2135 at 11/23/16 10:15 AM:
----------------------------------------------------------------

Another minor issue in TcpConnection, line 269 (my comments in the code are prefixed with Z:):

{code}
        public void run() {
            Throwable t=null;
            while(canRun()) {
                Buffer data=null;
                try {
                    data=send_queue.take();
                    if(data.hashCode() == termination.hashCode()) // Z: data is being de-referenced
                        break;
                }
                catch(InterruptedException e) {
                    t=e;
                    break;
                }

                if(data != null) { // Z: data cannot be null here: it was already dereferenced above, and send_queue.take() never returns null
                    try {
                        _send(data.getBuf(), 0, data.getLength(), false, send_queue.isEmpty());
                    }
                    catch(Throwable ignored) { //Z: unrecoverable exceptions should probably not be ignored.
                        t=ignored;
                    }
                }
            }
            server.notifyConnectionClosed(TcpConnection.this, String.format("%s: %s", getClass().getSimpleName(),
                                                                            t != null? t.toString() : "normal stop"));
        }
{code}
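
A minimal sketch of how the loop could look with the redundant null check removed and unrecoverable errors left to propagate. This is not the JGroups code: SenderLoopSketch, TERMINATION, sendQueue, send() and runLoop() are hypothetical stand-ins, and comparing the sentinel by identity instead of by hashCode() is my own substitution.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the sender thread's loop; not the JGroups class.
class SenderLoopSketch {
    // Sentinel compared by identity (==) rather than by hashCode().
    static final byte[] TERMINATION = new byte[0];

    final BlockingQueue<byte[]> sendQueue = new LinkedBlockingQueue<>();
    final List<byte[]> sent = new ArrayList<>();

    /** Drains the queue until the sentinel arrives; returns why the loop stopped. */
    String runLoop() {
        Throwable t = null;
        while (true) {
            byte[] data;
            try {
                data = sendQueue.take(); // blocks; never returns null
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                t = e;
                break;
            }
            if (data == TERMINATION)
                break;
            try {
                send(data); // no null check needed after take()
            } catch (Exception e) {
                // Only Exceptions are recorded; Errors such as
                // OutOfMemoryError are not caught and propagate.
                t = e;
            }
        }
        return t != null ? t.toString() : "normal stop";
    }

    // Placeholder for the real socket write.
    void send(byte[] data) throws IOException {
        sent.add(data);
    }
}
```

Catching Exception rather than Throwable is the key difference: a failed write is still recorded for the close notification, while a JVM-level Error is no longer absorbed by the loop.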



> OOM with JGroups 3.6.11.
> ------------------------
>
>                 Key: JGRP-2135
>                 URL: https://issues.jboss.org/browse/JGRP-2135
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.6.11
>            Reporter: Zoltan Farkas
>            Assignee: Bela Ban
>             Fix For: 3.6.12, 4.0
>
>
> We are running our JVMs with -XX:OnOutOfMemoryError="kill -9 %p".
> We have been experiencing OOMs fairly often, and they happen at:
> {code}
> Object / Stack Frame                                                              |Name                                                                                             | Shallow Heap | Retained Heap |Context Class Loader                         |Is Daemon
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> java.lang.Thread @ 0x81bdf838                                                     |Connection.Receiver [144.77.77.53:50363 - 144.77.77.53:50363],sis-cluster.service,prodpmwsv5-6461|          120 |           456 |sun.misc.Launcher$AppClassLoader @ 0x800175a8|false
> |- at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)             |                                                                                                 |              |               |                                             |
> |- at org.jgroups.blocks.cs.TcpConnection$Receiver.run()V (TcpConnection.java:310)|                                                                                                 |              |               |                                             |
> |- at java.lang.Thread.run()V (Thread.java:745)                                   |                                                                                                 |              |               |                                             |
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> {code}
> The code where it happens is in TcpConnection.java:
> {code}
> while(canRun()) {
>                 try {
>                     int len=in.readInt();
>                     if(buffer == null || buffer.length < len)
>                         buffer=new byte[len];
>                     in.readFully(buffer, 0, len);
>                     updateLastAccessed();
>                     server.receive(peer_addr, buffer, 0, len);
>                 }
>                 catch(OutOfMemoryError mem_ex) {
>                     t=mem_ex;
>                     break; // continue;
>                 }
>                 catch(IOException io_ex) {
>                     t=io_ex;
>                     break;
>                 }
>                 catch(Throwable e) {
>                 }
>             }
> {code}
> The failure occurs when allocating: buffer=new byte[len];
> It looks to me like an invalid, very large length value is received, and the process OOMs when allocating a huge byte array.
> Running JVMs without kill-on-OOM would make this issue "disappear", in the sense that it is swallowed by:
> {code}
>                 catch(OutOfMemoryError mem_ex) {
>                     t=mem_ex;
>                     break; // continue;
>                 }
> {code}
> Handling OutOfMemoryError is a strange implementation choice.
> Instead, a size limit should be enforced to protect against receiving invalid sizes.
> My heap limit is 1 GB and my heap dumps are 50 MB, so the attempted allocation size must be huge.
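
A minimal sketch of the size limit suggested above, assuming a length-prefixed frame like the one in the receiver loop. BoundedFrameReader, readFrame and MAX_LENGTH are hypothetical names, not JGroups API, and the 16 MB cap is an arbitrary placeholder. Unlike the original loop, this version allocates a fresh buffer per frame to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper; not part of JGroups.
class BoundedFrameReader {
    // Placeholder cap; a real deployment would derive it from its
    // largest legitimate message size.
    static final int MAX_LENGTH = 16 * 1024 * 1024;

    /** Reads one length-prefixed frame, rejecting lengths outside [0, maxLength]. */
    static byte[] readFrame(DataInputStream in, int maxLength) throws IOException {
        int len = in.readInt();
        // Validate before allocating, so a corrupt or hostile length
        // fails with an IOException instead of an OutOfMemoryError.
        if (len < 0 || len > maxLength)
            throw new IOException("invalid frame length " + len + " (max " + maxLength + ")");
        byte[] buffer = new byte[len];
        in.readFully(buffer, 0, len);
        return buffer;
    }
}
```

With a check like this, a bad length prefix would take the existing IOException path and close the connection, and the OutOfMemoryError handler would no longer be reached for this case.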



--
This message was sent by Atlassian JIRA
(v7.2.3#72005)

