David Hotham created JGRP-1457:
----------------------------------
Summary: TimeScheduler2 loses tasks
Key: JGRP-1457
URL: https://issues.jboss.org/browse/JGRP-1457
Project: JGroups
Issue Type: Bug
Affects Versions: 3.0.9
Reporter: David Hotham
Assignee: Bela Ban
The symptom I sometimes see is that broadcast messages are not delivered to a member.
I've tracked this down to NAKACK2 having gaps in its record of sequence numbers while
its RetransmitTask is not running. I've confirmed that the task is not running by
calling stack.getTransport().dumpTimerTasks() and seeing that it is not among the
scheduled tasks.
So far, so definite. I also have a theory about how this happens.
Suppose thread 1 is in TimeScheduler2._run(), and has got as far as executing some tasks
but has not yet reached the line tasks.keySet().removeAll(keys).
Meanwhile, suppose thread 2 is in TimeScheduler2.schedule(), adding a task that has the
same key as the just-executed task. It can reach the branch tasks.remove(key) ("//
entry has completed; remove it"), go round the loop again, and successfully call
tasks.putIfAbsent(key, task).
Now thread 1 picks up again, calls removeAll(keys), and removes the task that has just
been scheduled. Oops.
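
The interleaving above can be replayed deterministically in a single thread. The following is only a simplified model, not the TimeScheduler2 source: the Entry class, its completed flag, the key value and the use of a ConcurrentSkipListMap for tasks are all assumptions made for illustration. Only the putIfAbsent, remove(key) and keySet().removeAll(keys) steps come from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

public class LostTaskReplay {
    // Hypothetical stand-in for the scheduler's per-key entry.
    static final class Entry {
        final String name;
        volatile boolean completed;
        Entry(String name) { this.name = name; }
    }

    /** Replays the suspected interleaving; returns true if the new task is still scheduled. */
    static boolean replay() {
        ConcurrentSkipListMap<Long, Entry> tasks = new ConcurrentSkipListMap<>();
        long key = 1000L; // assumed execution time shared by both tasks
        Entry old = new Entry("old");
        tasks.put(key, old);

        // Thread 1 (_run): collects the due keys and executes their entries,
        // but has not yet reached tasks.keySet().removeAll(keys).
        List<Long> keys = new ArrayList<>(tasks.headMap(key, true).keySet());
        old.completed = true;

        // Thread 2 (schedule): finds a completed entry under the same key,
        // removes it, loops, and then inserts the new task successfully.
        Entry fresh = new Entry("retransmit");
        while (true) {
            Entry existing = tasks.putIfAbsent(key, fresh);
            if (existing == null)
                break;                 // new task is now in the map
            if (existing.completed)
                tasks.remove(key);     // entry has completed; remove it
            else
                break;                 // appending to a live entry is omitted from this model
        }

        // Thread 1 resumes and removes by key, taking the new task with it.
        tasks.keySet().removeAll(keys);

        return tasks.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println("task still scheduled: " + replay());
    }
}
```

In this model the removal in thread 1 is keyed only on the execution time, so it cannot tell the completed entry from the one thread 2 has just inserted.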
I suggest that a likely fix is to delete the "else tasks.remove(key)" branch
from schedule() altogether. (If we're in that branch then we're blocked by a
completed entry. That entry will be removed shortly by the run() thread, and then
we'll be able to progress).
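
A sketch of the suggested change, under the same assumptions as before (a hypothetical Entry class with a completed flag, and a ConcurrentSkipListMap standing in for tasks). It is not the actual schedule() code, and trySchedule is an illustrative name for one pass of its loop with the "else tasks.remove(key)" branch deleted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

public class ScheduleWithoutRemove {
    // Hypothetical stand-in for the scheduler's per-key entry.
    static final class Entry {
        final String name;
        volatile boolean completed;
        Entry(String name) { this.name = name; }
    }

    /**
     * One pass of the schedule() loop with the remove branch deleted.
     * Returns true only if the task was inserted; a caller would retry on false.
     * Appending to a live (not completed) entry is omitted from this model.
     */
    static boolean trySchedule(ConcurrentSkipListMap<Long, Entry> tasks, long key, Entry task) {
        return tasks.putIfAbsent(key, task) == null;
    }

    /** Replays the same interleaving; returns true if the new task survives. */
    static boolean replayWithFix() {
        ConcurrentSkipListMap<Long, Entry> tasks = new ConcurrentSkipListMap<>();
        long key = 1000L; // assumed execution time shared by both tasks
        Entry old = new Entry("old");
        tasks.put(key, old);

        // Thread 1 (_run): has executed the entry but not yet removed its key.
        List<Long> keys = new ArrayList<>(tasks.headMap(key, true).keySet());
        old.completed = true;

        // Thread 2 (schedule): blocked by the completed entry, so it does nothing.
        Entry fresh = new Entry("retransmit");
        boolean firstAttempt = trySchedule(tasks, key, fresh);

        // Thread 1 resumes and removes only the completed entry.
        tasks.keySet().removeAll(keys);

        // Thread 2 retries and now succeeds.
        boolean secondAttempt = trySchedule(tasks, key, fresh);

        return !firstAttempt && secondAttempt && tasks.get(key) == fresh;
    }

    public static void main(String[] args) {
        System.out.println("task survives: " + replayWithFix());
    }
}
```

The cost of this approach is that schedule() keeps retrying until the run thread has removed the completed entry, so its progress depends on that removal happening promptly.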