[ http://jira.jboss.com/jira/browse/JGRP-500?page=comments#action_12362001 ]
Michael Newcomb commented on JGRP-500:
--------------------------------------
I'm not sure what you mean by malicious.
For a program to block, there must be some sort of synchronization. How can a program
block while using the channel without acquiring a 'lock' on the channel for as long as
it is using it?
IMHO, the flush message should simply wait up to the timeout. If it can't complete
the flush within the specified timeout, it fails.
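To make the "wait up to the timeout, then fail" behaviour concrete, here is a minimal
sketch using only java.util.concurrent. It is not the JGroups FLUSH implementation; the
class BoundedFlush and its methods are hypothetical names invented for illustration, and
it assumes the flush coordinator only needs to know when every member has returned from
block().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical model of a bounded flush wait; not the JGroups FLUSH code.
public class BoundedFlush {
    // One count per member that still has to return from block().
    private final CountDownLatch pending;

    public BoundedFlush(int members) {
        pending = new CountDownLatch(members);
    }

    /** A member reports that its block() has returned. */
    public void memberBlocked() {
        pending.countDown();
    }

    /** Returns true if every member blocked within timeoutMs, false if the wait timed out. */
    public boolean await(long timeoutMs) throws InterruptedException {
        return pending.await(timeoutMs, TimeUnit.MILLISECONDS);
    }
}
```

With this shape a timed-out attempt simply returns false, and the caller decides whether
to give up or retry.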
If you run the program in 2 windows, execute 'sleep 10' in the first, and then
execute 'flush' in the second, you will see the problem.
Flush will time out, then start retrying. However, it can't complete successfully
because for some reason the 'done sleeping' response never made it back to the
window that executed 'sleep 10'. Therefore, that window can never complete block()
and the flush fails.
I just re-tested (still using JGroups 2.5 beta1), and if you execute flush within the
4000ms timeout for the first phase, it succeeds. So I think the problem is in how
flushes are re-attempted, possibly something to do with resetting the flush.
FLUSH bug
---------
Key: JGRP-500
URL: http://jira.jboss.com/jira/browse/JGRP-500
Project: JGroups
Issue Type: Bug
Reporter: Bela Ban
Assigned To: Vladimir Blagojevic
Fix For: 2.5
Attachments: JGroupsFlushBug.java
[reported by Michael Newcomb]
Here is a test program to demonstrate the bug.
Start 2 instances of the JGroupsFlushBug app.
In window 1, type: 'sleep 5'
This will send a message to both apps and they will both sleep for 5
seconds. The app that issued the sleep command will wait until both have
slept 5 seconds and then return.
When the sleep command is issued, it grabs a lock on a Semaphore. This
lock is released when it receives a response to the sleep command.
When a block() is called it grabs the lock on the Semaphore so that no
more sleep commands can execute. Likewise, if a sleep command is
executing, block() will wait until it completes and then grab the lock.
At least that is what is supposed to happen ;)
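The locking scheme described above can be sketched roughly as follows. This is an
illustration written for this note, not code from the attached JGroupsFlushBug.java:
the class SleepGate and its method names are hypothetical, and releasing the permit
again in unblock() is an assumption about how the test program ends a flush.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch of the Semaphore protocol described above.
// Names are illustrative and are not taken from JGroupsFlushBug.java.
public class SleepGate {
    // A single permit: held either by an in-flight sleep command or by block().
    private final Semaphore gate = new Semaphore(1);

    /** Called before a sleep command is sent; waits while block() holds the permit. */
    public void beginSleep() throws InterruptedException {
        gate.acquire();
    }

    /** Called when the response to the sleep command arrives. */
    public void sleepResponseReceived() {
        gate.release();
    }

    /** Called from the block() callback; waits for any in-flight sleep to finish. */
    public void block() throws InterruptedException {
        gate.acquire();
    }

    /** Assumed counterpart, called when the flush is over so sleeps may run again. */
    public void unblock() {
        gate.release();
    }
}
```

A java.util.concurrent.Semaphore has no notion of ownership, so the thread that delivers
the response may release a permit acquired by the thread that issued the command. It also
means that if the response is never delivered, block() waits indefinitely.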
Now, in window 1, type: 'sleep 10'
Quickly change to window 2 and type: 'flush'
You will see that the flush does not complete correctly: it times out
and then repeatedly tries to start the flush again.
Even after the sleep completes, the retries fail...
Can someone confirm my results?
Thanks,
Michael