[jboss-jira] [JBoss JIRA] (JGRP-1457) TimeScheduler2 loses tasks
David Hotham (JIRA)
jira-events at lists.jboss.org
Sun Apr 22 07:44:17 EDT 2012
David Hotham created JGRP-1457:
----------------------------------
Summary: TimeScheduler2 loses tasks
Key: JGRP-1457
URL: https://issues.jboss.org/browse/JGRP-1457
Project: JGroups
Issue Type: Bug
Affects Versions: 3.0.9
Reporter: David Hotham
Assignee: Bela Ban
The symptom I sometimes see is that broadcast messages are not delivered to a member.
I've tracked this down to NAKACK2 having gaps in its record of sequence numbers while its RetransmitTask is not running. I've confirmed that the task is not running by calling stack.getTransport().dumpTimerTasks() and seeing that it is not among the scheduled tasks.
So far, so definite. I also have a theory about how this happens.
Suppose thread 1 is in TimeScheduler2._run(), and has got as far as executing some tasks but has not yet reached the line tasks.keySet().removeAll(keys).
Meanwhile, suppose thread 2 is in TimeScheduler2.schedule(), adding a task that has the same key as the just-executed task. It can reach the branch tasks.remove(key) ("// entry has completed; remove it"), go round the loop again, and successfully call tasks.putIfAbsent(key, task).
Now thread 1 picks up again, calls removeAll(keys), and removes the task that has just been scheduled. Oops.
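The interleaving above can be replayed deterministically on a single thread. The following is only a sketch under stated assumptions, not the TimeScheduler2 source: the class LostTaskSketch, its Entry type and the completed flag are hypothetical stand-ins, and the two "threads" are simulated by running their steps in the order described.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

public class LostTaskSketch {
    // Hypothetical stand-in for the scheduler's per-key entry.
    static final class Entry {
        volatile boolean completed;
    }

    /** Replays the suspected interleaving; returns true if the new task is still scheduled. */
    static boolean replay() {
        ConcurrentSkipListMap<Long, Entry> tasks = new ConcurrentSkipListMap<>();
        long key = 1000L;
        tasks.put(key, new Entry());

        // Thread 1, in _run(): collect the due keys and execute their entries,
        // but do not yet reach tasks.keySet().removeAll(keys).
        List<Long> keys = new ArrayList<>(tasks.headMap(key, true).keySet());
        for (Long k : keys) {
            tasks.get(k).completed = true;
        }

        // Thread 2, in schedule(): blocked by the completed entry, it removes
        // that entry, goes round the loop again and inserts its own task.
        Entry fresh = new Entry();
        Entry existing = tasks.putIfAbsent(key, fresh);
        if (existing != null && existing.completed) {
            tasks.remove(key);
            tasks.putIfAbsent(key, fresh);
        }

        // Thread 1 resumes and removes by key, which now hits the new task.
        tasks.keySet().removeAll(keys);

        return tasks.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println("new task still scheduled: " + replay());
    }
}
```

In this replay the map is empty at the end, so the freshly scheduled task is gone without ever having run.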
I suggest that a likely fix is to delete the "else tasks.remove(key)" branch from schedule() altogether. (If we're in that branch then we're blocked by a completed entry. That entry will be removed shortly by the run() thread, and then we'll be able to progress).
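A minimal sketch of the suggested change, again with hypothetical names (NoRemoveSketch, Entry, tryScheduleOnce) rather than the real TimeScheduler2 code. It only models the case where the existing entry has already completed: the scheduling side never removes that entry and simply retries after the run side has cleaned up.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

public class NoRemoveSketch {
    // Hypothetical stand-in for the scheduler's per-key entry.
    static final class Entry {
        volatile boolean completed;
    }

    /** One pass of the retry loop in schedule(); it never removes an existing entry. */
    static boolean tryScheduleOnce(ConcurrentSkipListMap<Long, Entry> tasks, long key, Entry task) {
        return tasks.putIfAbsent(key, task) == null;
    }

    /** Replays the same interleaving without the remove branch; true if the new task survives. */
    static boolean replay() {
        ConcurrentSkipListMap<Long, Entry> tasks = new ConcurrentSkipListMap<>();
        long key = 1000L;
        Entry old = new Entry();
        tasks.put(key, old);

        // Thread 1: has executed the entry but not yet called removeAll(keys).
        List<Long> keys = Collections.singletonList(key);
        old.completed = true;

        // Thread 2: blocked by the completed entry, leaves the map untouched.
        Entry fresh = new Entry();
        boolean firstAttempt = tryScheduleOnce(tasks, key, fresh);

        // Thread 1 resumes and removes only the completed entry.
        tasks.keySet().removeAll(keys);

        // Thread 2 goes round the loop again and now succeeds.
        boolean secondAttempt = tryScheduleOnce(tasks, key, fresh);

        return !firstAttempt && secondAttempt && tasks.get(key) == fresh;
    }

    public static void main(String[] args) {
        System.out.println("new task survives: " + replay());
    }
}
```

The trade-off in this sketch is that the scheduling thread keeps retrying until the run thread has removed the completed entry, so it relies on that removal happening promptly.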