[jboss-jira] [JBoss JIRA] Commented: (JGRP-500) FLUSH bug

Michael Newcomb (JIRA) jira-events at lists.jboss.org
Thu May 10 14:13:52 EDT 2007


    [ http://jira.jboss.com/jira/browse/JGRP-500?page=comments#action_12362001 ] 
            
Michael Newcomb commented on JGRP-500:
--------------------------------------

I'm not sure what you mean by malicious.

For a program to block, there must be some sort of synchronization. How can a program that is using the channel block without acquiring a 'lock' on the channel while it is using it?

IMHO, the flush should simply wait up to the timeout. If it can't complete within the specified timeout, it fails.
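
To make the behaviour I am suggesting concrete, here is a minimal sketch in plain java.util.concurrent, with no JGroups code involved. The class and method names (BoundedFlushAttempt, beginWork, endWork, tryFlush) are made up for illustration and are not part of the JGroups API. It only shows one attempt that waits up to the timeout and then reports failure instead of retrying.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of JGroups: a single, time-bounded flush attempt. */
final class BoundedFlushAttempt {
    // One permit stands in for "the application is currently using the channel".
    private final Semaphore channelInUse = new Semaphore(1);

    /** Application code takes the permit before it starts using the channel. */
    boolean beginWork() {
        return channelInUse.tryAcquire();
    }

    /** Application code returns the permit when it is done. */
    void endWork() {
        channelInUse.release();
    }

    /**
     * Waits at most timeoutMs for in-flight work to finish.
     * Returns false on timeout or interruption; it does not retry.
     */
    boolean tryFlush(long timeoutMs) {
        try {
            if (!channelInUse.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false; // timed out: the flush fails
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
        channelInUse.release(); // flush is done, let the application continue
        return true;
    }
}
```

With this shape the caller decides whether to try again, and each new attempt starts from a clean state.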

If you run the program in two windows, execute 'sleep 10' in the first, and then execute 'flush' in the second, you will see the problem.

The flush will time out and then start retrying. However, it can't complete successfully because, for some reason, the 'done sleeping' response never makes it back to the window that executed 'sleep 10'. Therefore, that window can never complete block() and the flush fails.

I just re-tested (still using JGroups 2.5 beta1): if you execute flush so that its first phase finishes within the 4000 ms timeout, it succeeds. So I think the problem is in how flushes are re-attempted, possibly in how the flush state is reset before a retry.


> FLUSH bug
> ---------
>
>                 Key: JGRP-500
>                 URL: http://jira.jboss.com/jira/browse/JGRP-500
>             Project: JGroups
>          Issue Type: Bug
>            Reporter: Bela Ban
>         Assigned To: Vladimir Blagojevic
>             Fix For: 2.5
>
>         Attachments: JGroupsFlushBug.java
>
>
> [reported by Michael Newcomb]
> Here is a test program to demonstrate the bug.
> Start 2 instances of the JGroupsFlushBug app.
> In window 1, type: 'sleep 5'
> This will send a message to both apps and they will both sleep for 5
> seconds. The app that issued the sleep command will wait until both have
> slept 5 seconds and then return.
> When the sleep command is issued, it grabs a lock on a Semaphore. This
> lock is released when it receives a response to the sleep command.
> When a block() is called it grabs the lock on the Semaphore so that no
> more sleep commands can execute. Likewise, if a sleep command is
> executing, block() will wait until it completes and then grab the lock.
> At least that is what is supposed to happen ;)
> Now, in window 1, type: 'sleep 10'
> Quickly change to window 2 and type: 'flush'
> You will see that the flush does not complete correctly. You will see
> the flush time out and then repeatedly try to start the flush again.
> Even when the sleep completes, the retries fail...
> Can someone confirm my results?
> Thanks,
> Michael
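
The locking described in the report can be modelled roughly as follows. This is only a sketch of the intended protocol, not the attached JGroupsFlushBug.java, and every name in it (SleepGate, sendSleep, onSleepResponse, isHeld) is hypothetical.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical model of the locking in the report; names are illustrative only. */
final class SleepGate {
    // A single permit shared by sleep commands and block().
    private final Semaphore permit = new Semaphore(1);

    /** A sleep command takes the permit before it is sent. */
    void sendSleep() {
        permit.acquireUninterruptibly();
    }

    /** The permit is returned when the response to the sleep command arrives. */
    void onSleepResponse() {
        permit.release();
    }

    /** block() waits for any running sleep command, then holds the permit. */
    void block() {
        permit.acquireUninterruptibly();
    }

    /** unblock() returns the permit so that sleep commands can run again. */
    void unblock() {
        permit.release();
    }

    /** True while a sleep command or a flush holds the permit. */
    boolean isHeld() {
        return permit.availablePermits() == 0;
    }
}
```

In this model, if onSleepResponse() is never called, block() never returns. That is consistent with the symptom described in the comment above, where the response to the sleep command does not arrive.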
