[mod_cluster-dev] Session temporarily unavailable

Bela Ban bban at redhat.com
Wed Jul 22 11:14:58 EDT 2009



Paul Ferraro wrote:
>>> When I have 2 nodes (node1, node2), and a session which is created on
>>> node1 (and replicated to node2), the following can happen:
>>>
>>> * I shut down node1 (CTRL-C)
>>> * node1 terminates cleanly
>>> * I access the webapp, but get a "Session temporarily unavailable"
>>> failure message (I guess a 500)
>>> * When I check mod-cluster-manager, information about node1 is still
>>> there !
>>> * About 5 seconds later, mod-cluster-manager doesn't show node1 anymore
>>> * When I now access the webapp, I get the proper failed over session
>>> on node2 and all my data is still there
>>>
>>>
>>> So my question is: why doesn't node2 immediately tell httpd/mod-cluster
>>> that node1 is gone ? It seems that httpd *itself* only learns about
>>> this when it pings the socket to node1...
> Bela Ban wrote:
>
> It should be. Do you see the appropriate STOP_APP, REMOVE_APP messages
> in your httpd access log?

I do, right when I kill the node. I think the delay I've been seeing 
could have been caused by having both ModClusterListener *and* the 
mod-cluster integration bean in server.xml...

I don't see this issue anymore (I've also upgraded to mod-cluster 1.0.1 
in the meantime)... I'll keep trying to reproduce this.

>>> I recall Paul once telling me we hadn't implemented that functionality,
>>> but I guess by now this is surely implemented ?
>
> I never said that, but I think I know of the conversation to which you 
> refer...
> If you are running mod_cluster via HAModClusterService and node1 crashed
> (instead of a clean shutdown), node2 will not send any MCMP messages to
> httpd on behalf of node1 when it receives a view change.

Ah, yes, I remember now ! I just killed -9 node1 and mod-cluster-manager 
didn't remove node1 from its worker list.

Why don't we do this ? We cannot rely on a node being shut down gracefully !

This could be very simple logic, executed by the singleton:

On becoming singleton:
- Get the current view and remove all nodes from httpd which are not in 
the current view

On view change:
- Same as above, using the view passed as argument to the view change callback

We have to think about whether this would make sense for node additions, but 
then again every new node registers itself, so maybe this is not necessary.
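
A rough sketch of the removal logic described above, in Java. Note that 
ProxyRegistry, StaleNodeCleaner, InMemoryProxyRegistry and reconcile() are 
made-up names for illustration only, not existing mod_cluster or JGroups 
classes. A real implementation would presumably send the corresponding MCMP 
messages to httpd instead of mutating an in-memory set, and would work with 
cluster addresses rather than plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical abstraction over "what httpd currently knows about";
// NOT part of the real mod_cluster API.
interface ProxyRegistry {
    Set<String> registeredNodes();
    void removeNode(String node);
}

// Meant to run only on the node that currently holds the singleton role.
final class StaleNodeCleaner {
    private final ProxyRegistry proxy;

    StaleNodeCleaner(ProxyRegistry proxy) {
        this.proxy = proxy;
    }

    // Call this on becoming singleton (with the current view) and on every
    // view change (with the new view). Returns the nodes that were removed.
    List<String> reconcile(Set<String> view) {
        List<String> removed = new ArrayList<>();
        // Iterate over a copy so removeNode() may modify the underlying set.
        for (String node : new ArrayList<>(proxy.registeredNodes())) {
            if (!view.contains(node)) {
                proxy.removeNode(node);
                removed.add(node);
            }
        }
        return removed;
    }
}

// In-memory stand-in for httpd, only here to make the sketch runnable.
final class InMemoryProxyRegistry implements ProxyRegistry {
    private final Set<String> nodes = new LinkedHashSet<>();

    InMemoryProxyRegistry(String... initial) {
        for (String n : initial) {
            nodes.add(n);
        }
    }

    @Override
    public Set<String> registeredNodes() {
        return nodes;
    }

    @Override
    public void removeNode(String node) {
        nodes.remove(node);
    }
}
```

With node1 and node2 registered and a new view containing only node2, 
reconcile() removes node1 and leaves node2 alone. Calling it again with the 
same view is a no-op, so it should be safe to run both on becoming singleton 
and on each view change.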

WDYT ?

-- 
Bela Ban
Lead JGroups / Clustering Team
JBoss

