[jboss-jira] [JBoss JIRA] Resolved: (JGRP-664) CopyOnWrite collections should be synchronized

Bela Ban (JIRA) jira-events at lists.jboss.org
Tue Jan 22 09:40:22 EST 2008


     [ http://jira.jboss.com/jira/browse/JGRP-664?page=all ]

Bela Ban resolved JGRP-664.
---------------------------

    Fix Version/s:     (was: 2.4.2)
       Resolution: Done

> CopyOnWrite collections should be synchronized
> ----------------------------------------------
>
>                 Key: JGRP-664
>                 URL: http://jira.jboss.com/jira/browse/JGRP-664
>             Project: JGroups
>          Issue Type: Bug
>            Reporter: Bela Ban
>         Assigned To: Bela Ban
>             Fix For: 2.6.2, 2.7
>
>
> [email from Rick Pike]
> When nodes start suspecting each other, we intermittently see:
> 2008-01-14 06:49:37,464 [ERROR] UDP failed handling incoming message
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>         at java.util.concurrent.CopyOnWriteArrayList.rangeCheck(CopyOnWriteArrayList.java:708)
>         at java.util.concurrent.CopyOnWriteArrayList.get(CopyOnWriteArrayList.java:328)
>         at org.jgroups.protocols.FD.getPingDest(FD.java:151)
>         at org.jgroups.protocols.FD.up(FD.java:305)
>         at org.jgroups.protocols.FD_ICMP.up(FD_ICMP.java:108)
>         at org.jgroups.protocols.MERGE3.up(MERGE3.java:126)
>         at org.jgroups.protocols.Discovery.up(Discovery.java:246)
>         at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1535)
>         at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1484)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
>         at java.lang.Thread.run(Thread.java:595)
> It looks like there are four places in FD that modify the pingable_mbrs list:
> 1. up(HEARTBEAT_ACK)
> 2. up(SUSPECT)
> 3. down(VIEW_CHANGE)
> 4. down(UNSUSPECT)
> Only #1 and #3 appear to wrap the edits in a synchronized block, and I believe
> that a concurrent (UN)SUSPECT message is emptying that list while the
> HEARTBEAT_ACK handler is looping and repeatedly reading it inside getPingDest().
> Every so often, it calls get(0) on an empty list.
> We just started seeing this when we started testing with FD_ICMP, but I
> imagine this would happen with any flavor, and we're seeing it due to the
> high volume of SUSPECT messages happening in our tests.
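
The race described above is a check-then-act problem: each individual call on a CopyOnWriteArrayList is thread-safe, but a size check followed by get(0) is two separate reads, and a writer can empty the list in between. The following is a minimal sketch of one way to close that window, assuming every reader and writer takes the same lock. It is not the actual JGroups change; the class PingDestSketch, its methods, and the use of String addresses are all hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for the failure detector's member bookkeeping; not JGroups code.
class PingDestSketch {
    private final List<String> pingableMbrs = new CopyOnWriteArrayList<>();
    private final String localAddr;

    PingDestSketch(String localAddr) {
        this.localAddr = localAddr;
    }

    // Analogous to a view change: replace the whole list as one atomic step.
    void setMembers(List<String> members) {
        synchronized (pingableMbrs) {
            pingableMbrs.clear();
            pingableMbrs.addAll(members);
        }
    }

    // Analogous to SUSPECT: drop a member from the pingable list.
    void suspect(String member) {
        synchronized (pingableMbrs) {
            pingableMbrs.remove(member);
        }
    }

    // Analogous to UNSUSPECT: put a member back if it is not already present.
    void unsuspect(String member) {
        synchronized (pingableMbrs) {
            if (!pingableMbrs.contains(member)) {
                pingableMbrs.add(member);
            }
        }
    }

    // Racy variant, shown only for contrast: another thread may empty the
    // list between isEmpty() and get(0), causing IndexOutOfBoundsException.
    String firstMemberRacy() {
        if (pingableMbrs.isEmpty()) {
            return null;
        }
        return pingableMbrs.get(0);
    }

    // Returns the member after localAddr (wrapping around), or null when there
    // is no other member to ping. The size check, indexOf, and get all run
    // under the same lock the writers use, so the list cannot change between them.
    String getPingDest() {
        synchronized (pingableMbrs) {
            int n = pingableMbrs.size();
            if (n < 2) {
                return null;
            }
            int idx = pingableMbrs.indexOf(localAddr);
            if (idx < 0) {
                return null;
            }
            return pingableMbrs.get((idx + 1) % n);
        }
    }
}
```

Once every access is guarded by the same lock, the copy-on-write behavior is no longer what provides safety, so a plain ArrayList would serve equally well in this sketch. An alternative that avoids locking readers is to take one snapshot with toArray() and compute the destination from that snapshot only.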
