[JBoss JIRA] Created: (JGRP-335) Hangs with FLUSH
by Bela Ban (JIRA)
Hangs with FLUSH
----------------
Key: JGRP-335
URL: http://jira.jboss.com/jira/browse/JGRP-335
Project: JGroups
Issue Type: Bug
Affects Versions: 2.3 SP1
Reporter: Bela Ban
Assigned To: Bela Ban
Fix For: 2.4
2 use cases where we can run into the problem (members A and B).
#1 View change
* A is running, B joins
* B is *not* blocking in FLUSH, A is blocking after START_FLUSH
* A starts the flush
* A returns the new view to B in a JOIN_RSP. This causes B's Channel.connect() to return
* B sends a unicast message to A, to which A sends a response *in the same thread* (service STATE_REQ)
* A completes the flush, multicasting a STOP_FLUSH message
* The STATE_REQ at A hangs on FLUSH.down()
* The STOP_FLUSH at A can never unblock FLUSH.down() because it was received *after* the STATE_REQ from B, and so it sits behind the blocked request!
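The hang in use case #1 can be reproduced in miniature without JGroups. This is only a sketch, assuming that a single thread delivers messages at A in arrival order. The names FlushHangSketch and deliverInOrder, and the latch standing in for the FLUSH.down() gate, are made up for illustration and are not JGroups API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model, not JGroups code: one thread delivers messages in arrival order.
class FlushHangSketch {

    // Returns true if the blocked request was released within the timeout.
    static boolean deliverInOrder(long timeoutMillis) throws InterruptedException {
        CountDownLatch flushStopped = new CountDownLatch(1); // stands in for the FLUSH.down() gate
        CountDownLatch requestDone = new CountDownLatch(1);
        ExecutorService deliveryThread = Executors.newSingleThreadExecutor();

        // Message 1: the STATE_REQ from B. The response is sent on the
        // delivery thread itself, so that thread blocks in the gate.
        deliveryThread.execute(() -> {
            try {
                flushStopped.await();
                requestDone.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Message 2: the STOP_FLUSH. It is queued behind message 1 and
        // never runs, because the only delivery thread is blocked.
        deliveryThread.execute(flushStopped::countDown);

        boolean released = requestDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        deliveryThread.shutdownNow(); // interrupt the stuck task so the JVM can exit
        return released;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("request released: " + deliverInOrder(200));
    }
}
```

In this model the request is never released, whatever the timeout, because the message that would open the gate is queued behind the message waiting on it.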
SOLUTION:
1. Make B block in FLUSH.down() as soon as the client sends the JOIN_REQ to A
2. Make STOP_FLUSH *synchronous*. This means we only return from Channel.connect() (for example) once every member has ack'ed the STOP_FLUSH. See next issue (state transfer) for a description of what happens if we don't do this.
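A synchronous STOP_FLUSH could look roughly like the following. This is a sketch under assumptions, not the actual FLUSH implementation: Member, onStopFlush and stopFlushAndWait are hypothetical names, and a thread pool stands in for the network. The point is only that the caller (e.g. Channel.connect()) returns after every member has unblocked and ack'ed.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model, not JGroups code: wait for an ack from every member.
class SyncStopFlushSketch {

    // Hypothetical member: opens its own FLUSH gate, then acks.
    static final class Member {
        final String name;
        volatile boolean blocked = true;

        Member(String name) { this.name = name; }

        void onStopFlush(CountDownLatch acks) {
            blocked = false;   // unblock first ...
            acks.countDown();  // ... then acknowledge
        }
    }

    // Returns true only if every member acked the STOP_FLUSH within the timeout.
    static boolean stopFlushAndWait(List<Member> members, long timeoutMillis)
            throws InterruptedException {
        CountDownLatch acks = new CountDownLatch(members.size());
        ExecutorService network = Executors.newCachedThreadPool(); // stands in for the transport
        try {
            for (Member m : members) {
                network.execute(() -> m.onStopFlush(acks));
            }
            // The caller does not return until all acks are in (or the timeout expires).
            return acks.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            network.shutdown();
        }
    }
}
```

Because each member acks only after opening its gate, a unicast sent after stopFlushAndWait() returns cannot arrive at a member that is still blocked.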
#2 State transfer
* A and B are members of the group
* B calls Channel.getState()
* A and B receive a START_FLUSH, start the block in FLUSH
* State is transferred from A to B
* B multicasts a STOP_FLUSH and *immediately afterwards* sends a *unicast* message (which can 'pass' multicast messages, as they're unrelated)
* A happens to receive the unicast message *before* the STOP_FLUSH. The unicast blocks and the STOP_FLUSH, which would unblock it, cannot be delivered
SOLUTION:
1. Same as solution 2 above. If we make the STOP_FLUSH phase synchronous, connect() or getState() only return once everyone has been unblocked
LONG TERM SOLUTION:
* The much better solution of course is to make the STOP_FLUSH message *out-of-band*, so it can be delivered in parallel to other messages, and is not blocked (e.g.) by the unicast in the queue. So even if the unicast message was blocked waiting for STOP_FLUSH, once STOP_FLUSH has been received, it will be delivered, causing the unicast to unblock
* Once we have this solution in place (2.5, threadless stack and out-of-band messages), we can revert the STOP_FLUSH to only use 1 phase rather than 2
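The effect of an out-of-band STOP_FLUSH can be illustrated with the same kind of toy model. This is a sketch assuming that out-of-band messages are delivered on a thread other than the one handling ordered messages; OobStopFlushSketch and deliverWithOob are hypothetical names and nothing here is JGroups API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model, not JGroups code: STOP_FLUSH bypasses the ordered queue.
class OobStopFlushSketch {

    // Returns true if the blocked unicast was released within the timeout.
    static boolean deliverWithOob(long timeoutMillis) throws InterruptedException {
        CountDownLatch flushStopped = new CountDownLatch(1); // stands in for the FLUSH.down() gate
        CountDownLatch unicastDone = new CountDownLatch(1);
        ExecutorService regular = Executors.newSingleThreadExecutor(); // ordered delivery
        ExecutorService oob = Executors.newCachedThreadPool();         // out-of-band delivery
        try {
            // The unicast arrives first and blocks the ordered delivery thread.
            regular.execute(() -> {
                try {
                    flushStopped.await();
                    unicastDone.countDown();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            // The STOP_FLUSH is delivered in parallel, so it can open the gate
            // even though the ordered thread is stuck.
            oob.execute(flushStopped::countDown);

            return unicastDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            regular.shutdownNow();
            oob.shutdownNow();
        }
    }
}
```

Here the unicast still blocks at first, but the STOP_FLUSH no longer waits behind it, so the gate opens and the unicast completes.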
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.jboss.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] Updated: (JBAS-2957) Replace HA-JNDI AutoDiscovery with JGroups based implementation
by Brian Stansberry (JIRA)
[ http://jira.jboss.com/jira/browse/JBAS-2957?page=all ]
Brian Stansberry updated JBAS-2957:
-----------------------------------
Fix Version/s: JBossAS-5.0.0.CR1
(was: JBossAS-5.0.0.Beta)
I'm changing this to 5.0.0.CR1 on the slight chance a good way to do this comes out of Remoting work. But it's on the knife edge of being 'No Release' (i.e. someday....) or 'Won't Fix'.
> Replace HA-JNDI AutoDiscovery with JGroups based implementation
> ---------------------------------------------------------------
>
> Key: JBAS-2957
> URL: http://jira.jboss.com/jira/browse/JBAS-2957
> Project: JBoss Application Server
> Issue Type: Feature Request
> Security Level: Public(Everyone can see)
> Components: Clustering
> Reporter: Jerry Gauthier
> Assigned To: Jerry Gauthier
> Priority: Minor
> Fix For: JBossAS-5.0.0.CR1
>
>
> HA-JNDI AutoDiscovery uses its own IP multicasting-based discovery mechanism. This is error-prone code and should be replaced with the well-tested JGroups framework.
> See JIRA issue JBCLUSTER-43 for further details.
[JBoss JIRA] Updated: (JBAS-1483) Black box testing for http session (with and without replication)
by Brian Stansberry (JIRA)
[ http://jira.jboss.com/jira/browse/JBAS-1483?page=all ]
Brian Stansberry updated JBAS-1483:
-----------------------------------
Fix Version/s: JBossAS-5.0.1.CR1
(was: JBossAS-5.0.0.Beta)
Since this issue was created, a large number of unit tests of failover session behavior have been added. These include, perhaps accidentally, quite a bit of simple testing of non-failover behavior. We still need to beef up the testing of the non-failover behavior. The good news is that much of this behavior is inherited from the Tomcat code.
> Black box testing for http session (with and without replication)
> -----------------------------------------------------------------
>
> Key: JBAS-1483
> URL: http://jira.jboss.com/jira/browse/JBAS-1483
> Project: JBoss Application Server
> Issue Type: Feature Request
> Security Level: Public(Everyone can see)
> Components: Clustering
> Reporter: Ben Wang
> Fix For: JBossAS-5.0.1.CR1
>
> Original Estimate: 4 weeks
> Remaining Estimate: 4 weeks
>
> While talking with Ivelin, I think we need to have more black box testing for http requests with and without replication. Every feature in single node should also be supported by clustering, plus some additional use cases for clustering failover. But currently, I only have a limited set of unit test cases for clustering failover.
> So here are the main steps:
> 1. Create an automated black box testing suite for http session requests in a single node. We can use some tool to do this. The open question is which tests we need to define.
> 2. Currently, Tomcat has a whole suite of unit testing on http session. We should look into that to see if we can re-use some of it in our own Tomcat module.
> 3. Create test cases for fail-over. I can help to define the set here.