[jboss-jira] [JBoss JIRA] Created: (JGRP-1246) FILE_PING: NullPointerException on empty/incorrect file, and the communication is dead

Victor N (JIRA) jira-events at lists.jboss.org
Sun Oct 10 14:32:39 EDT 2010


FILE_PING: NullPointerException on empty/incorrect file, and the communication is dead
--------------------------------------------------------------------------------------

                 Key: JGRP-1246
                 URL: https://jira.jboss.org/browse/JGRP-1246
             Project: JGroups
          Issue Type: Bug
    Affects Versions: 2.10
            Reporter: Victor N
            Assignee: Bela Ban


If the FILE_PING directory contains an empty or corrupt file (for example, because a node crashed while writing it), you get the following exception:

java.lang.NullPointerException                                                                                                                                          
        at org.jgroups.protocols.FILE_PING.handleView(FILE_PING.java:146)                                                                                                          
        at org.jgroups.protocols.FILE_PING.down(FILE_PING.java:116)                                                                                                                
        at org.jgroups.protocols.MERGE2.down(MERGE2.java:155)                                                                                                                      
        at org.jgroups.protocols.FD_SOCK.down(FD_SOCK.java:332)                                                                                                                    
        at org.jgroups.protocols.FD.down(FD.java:276)                                                                                                                              
        at org.jgroups.protocols.VERIFY_SUSPECT.down(VERIFY_SUSPECT.java:69)                                                                                                       
        at org.jgroups.protocols.BARRIER.down(BARRIER.java:91)                                                                                                                     
        at org.jgroups.protocols.pbcast.NAKACK.down(NAKACK.java:639)                                                                                                               
        at org.jgroups.protocols.UNICAST.down(UNICAST.java:444)                                                                                                                    
        at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:297)                                                                                                               
        at org.jgroups.protocols.pbcast.GMS.installView(GMS.java:596)                                                                                                              
        at org.jgroups.protocols.pbcast.GMS.installView(GMS.java:516)                                                                                                              
        at org.jgroups.protocols.pbcast.ClientGmsImpl.becomeSingletonMember(ClientGmsImpl.java:344)                                                                                
        at org.jgroups.protocols.pbcast.ClientGmsImpl.joinInternal(ClientGmsImpl.java:93)                                                                                          
        at org.jgroups.protocols.pbcast.ClientGmsImpl.join(ClientGmsImpl.java:38)                                                                                                  
        at org.jgroups.protocols.pbcast.GMS.down(GMS.java:922)                                                                                                                     
        at org.jgroups.protocols.FC.down(FC.java:431)                                                                                                                              
        at org.jgroups.protocols.FRAG2.down(FRAG2.java:154)                                                                                                                        
        at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:894)                                                                                                            
        at org.jgroups.JChannel.downcall(JChannel.java:1649)                                                                                                                       
        at org.jgroups.JChannel.connect(JChannel.java:420)                                                                                                                         
        ... 136 more

This occurs on EVERY node, and afterwards all communication stops. I could not even find any JGroups threads after that.
New nodes cannot connect either: JChannel.connect() fails for the same reason.
The problem was reproduced today in our production system.

Workaround: 

I would propose the following 2 fixes:
1) When reading files, do not add null/empty/bad entries.
2) [For better reliability] Surround the whole FILE_PING.handleView() with try/catch (maybe for every Discovery protocol?). Even if Discovery fails, the other parts should NOT fail.
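
The two fixes above could look roughly like the sketch below. This is not the FILE_PING source: the class DiscoveryFiles, its method names, and the on-disk format (a single writeUTF() string per file) are all assumptions made up for illustration, since the real file format and internals are not shown in this report. It only demonstrates the idea of skipping unreadable entries and of keeping a discovery failure from propagating up the stack.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a file-based discovery store; NOT the JGroups code.
final class DiscoveryFiles {

    // Assumed format for this sketch only: one writeUTF() string per file.
    static Optional<String> parseEntry(byte[] raw) {
        if (raw == null || raw.length == 0) {
            return Optional.empty();   // empty file, e.g. writer crashed before writing
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            String address = in.readUTF();
            return address.isEmpty() ? Optional.empty() : Optional.of(address);
        } catch (IOException e) {
            return Optional.empty();   // truncated or corrupt file
        }
    }

    // Fix 1: read every file in the directory, keeping only entries that parse.
    static List<String> readAll(Path dir) {
        List<String> entries = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                try {
                    parseEntry(Files.readAllBytes(file)).ifPresent(entries::add);
                } catch (IOException e) {
                    // unreadable file: skip it instead of adding a null entry
                }
            }
        } catch (IOException | DirectoryIteratorException e) {
            // directory unreadable: return whatever was collected so far
        }
        return entries;
    }

    // Fix 2: run a discovery step so that a failure is logged, not propagated.
    static boolean runGuarded(Runnable step) {
        try {
            step.run();
            return true;
        } catch (RuntimeException e) {
            System.err.println("discovery step failed, continuing: " + e);
            return false;
        }
    }
}
```

With something like this, a view handler would iterate over the result of readAll(dir), which never contains null, and the caller would invoke it through runGuarded(...) so that a NullPointerException inside discovery does not abort JChannel.connect().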


        

