[jboss-jira] [JBoss JIRA] Updated: (JGRP-507) CloserThread's attempt to interrupt TimeScheduler on closure could end up being ignored

Galder Zamarreno (JIRA) jira-events at lists.jboss.org
Fri May 11 08:10:52 EDT 2007


     [ http://jira.jboss.com/jira/browse/JGRP-507?page=all ]

Galder Zamarreno updated JGRP-507:
----------------------------------

    Description: 
A race condition in JGroups could cause a channel that should be closed (for example, after being shunned) 
to never be closed.

In order to stop the TimeScheduler thread, CloserThread sets the TimeScheduler thread's status 
to interrupted. If the interruption occurs while TimeScheduler is waiting, there is no problem.

But in TimeScheduler._run(), the task is actually run via task.run(), which happens outside the
synchronized(queue) block. This means that CloserThread could set the TimeScheduler thread's 
status to interrupted while the task is running, for example, while it is sending an FD are-you-alive message.
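
The two cases can be reproduced with a minimal sketch. This is not the JGroups TimeScheduler source: the class
SchedulerRaceSketch, its loop() and the stopsAfterInterrupt() helper are hypothetical names, and the sketch assumes
that the worker exits only when queue.wait() throws InterruptedException, with task.run() called outside the
synchronized block as described above.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;

// Hypothetical, simplified scheduler loop; not the real TimeScheduler.
class SchedulerRaceSketch {
    private final Queue<Runnable> queue = new ArrayDeque<>();
    final Thread worker = new Thread(this::loop);

    void add(Runnable task) {
        synchronized (queue) {
            queue.add(task);
            queue.notifyAll();
        }
    }

    private void loop() {
        while (true) {
            Runnable task;
            synchronized (queue) {
                while (queue.isEmpty()) {
                    try {
                        queue.wait();
                    } catch (InterruptedException e) {
                        return; // interrupted while waiting: clean exit
                    }
                }
                task = queue.remove();
            }
            // Outside the lock: an interrupt arriving now is seen only by the task.
            task.run();
        }
    }

    /** Interrupts the worker once and reports whether it terminated. */
    static boolean stopsAfterInterrupt(boolean duringTask) throws InterruptedException {
        SchedulerRaceSketch s = new SchedulerRaceSketch();
        CountDownLatch inTask = new CountDownLatch(1);
        if (duringTask) {
            s.add(() -> {
                inTask.countDown();
                try {
                    Thread.sleep(10_000); // stands in for a blocking send
                } catch (InterruptedException swallowed) {
                    // swallowed, so the scheduler loop never learns of the interrupt
                }
            });
        }
        s.worker.start();
        if (duringTask) {
            inTask.await(); // make sure the task is running before interrupting
        }
        s.worker.interrupt();
        s.worker.join(500);
        boolean stopped = !s.worker.isAlive();
        s.worker.interrupt(); // cleanup: a second interrupt hits wait() and ends the loop
        s.worker.join();
        return stopped;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("interrupted while waiting, worker stopped: " + stopsAfterInterrupt(false));
        System.out.println("interrupted during task, worker stopped:   " + stopsAfterInterrupt(true));
    }
}
```

In this sketch the first call reports that the worker stopped and the second reports that it did not, because the
only interrupt was consumed inside the task and the loop went back to waiting.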

If all the protocols below the one carrying out the task have their down threads set to false, the TimeScheduler 
thread itself carries the message down the stack. If that thread is interrupted while the task is running, 
the interruption could be caught while sending the message to the network:

TP (UDP and TCP/TCP_NIO's parent):

TP.down(Event evt)
....

try {
    if(use_outgoing_packet_handler)
        outgoing_queue.put(msg);
    else
        send(msg, dest, multicast);
}
catch(QueueClosedException closed_ex) {
}
catch(InterruptedException interruptedEx) {
}
catch(Throwable e) {
    if(log.isErrorEnabled()) log.error("failed sending message", e);
}

Catching InterruptedException and doing nothing loses the interruption: the thread's interrupted status 
is cleared when the exception is thrown and is never set again. If the interruption from CloserThread is 
caught here, the TimeScheduler thread will never finish, leaving the channel blocked, never rejoining the 
cluster and unable to merge back.
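
The effect of the empty catch block can be shown in isolation. The following is a minimal sketch, not a patch for
TP.down(): the class SwallowedInterruptDemo and its send() method are hypothetical, and Thread.sleep() stands in
for the blocking outgoing_queue.put(msg) call. It contrasts swallowing the exception with one common remedy,
re-asserting the flag via Thread.currentThread().interrupt() so that the caller can still observe it.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; not part of JGroups.
class SwallowedInterruptDemo {

    /** Stand-in for a send path that either swallows or restores the interrupt. */
    static void send(boolean restoreInterrupt) {
        try {
            Thread.sleep(10_000); // blocking call, stands in for outgoing_queue.put(msg)
        } catch (InterruptedException e) {
            // The interrupted flag has already been cleared at this point.
            if (restoreInterrupt) {
                Thread.currentThread().interrupt(); // set the flag again for the caller
            }
        }
    }

    /** Returns true if the worker still sees the interrupt after send() returns. */
    static boolean interruptSurvives(boolean restoreInterrupt) throws InterruptedException {
        final boolean[] seen = new boolean[1];
        final CountDownLatch started = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            started.countDown();
            send(restoreInterrupt);
            seen[0] = Thread.currentThread().isInterrupted();
        });
        worker.start();
        started.await();
        worker.interrupt();
        worker.join(); // join() makes the write to seen[0] visible here
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("empty catch block, interrupt still visible: " + interruptSurvives(false));
        System.out.println("flag restored, interrupt still visible:     " + interruptSurvives(true));
    }
}
```

With the empty catch block the worker no longer sees the interrupt once send() returns; with the flag restored it
does, so a scheduler loop that checks the flag or blocks again would get the chance to terminate.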

  was:
A race condition in JGroups could cause a channel that should be closed (for example, after being shunned) 
to never be closed.

In order to stop the TimeScheduler thread, CloserThread sets the TimeScheduler thread's status 
to interrupted. If the interruption occurs while TimeScheduler is waiting, there is no problem.

But in TimeScheduler._run(), the task is actually run via task.run(), which happens outside the
synchronized(queue) block. This means that CloserThread could set the TimeScheduler thread's 
status to interrupted while the task is running, for example, while it is sending an FD are-you-alive message.

If all the protocols below the one carrying out the task have their down threads set to false, the TimeScheduler 
thread itself carries the message down the stack. If that thread is interrupted while the task is running, 
the interruption could be caught while sending the message to the network:

TP (UDP and TCP/TCP_NIO's parent):

TP.down(Event evt)
....

try {
    if(use_outgoing_packet_handler)
        outgoing_queue.put(msg);
    else
        send(msg, dest, multicast);
}
catch(QueueClosedException closed_ex) {
}
catch(InterruptedException interruptedEx) {
}
catch(Throwable e) {
    if(log.isErrorEnabled()) log.error("failed sending message", e);
}

Catching InterruptedException and doing nothing loses the interruption: the thread's interrupted status 
is cleared when the exception is thrown and is never set again. If the interruption from CloserThread is 
caught here, the TimeScheduler thread will never finish, leaving the channel blocked and never rejoining 
the cluster.




> CloserThread's attempt to interrupt TimeScheduler on closure could end up being ignored
> ---------------------------------------------------------------------------------------
>
>                 Key: JGRP-507
>                 URL: http://jira.jboss.com/jira/browse/JGRP-507
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.3, 2.3 SP1
>            Reporter: Galder Zamarreno
>         Assigned To: Bela Ban
