Brian Stansberry wrote:
I don't see why a hard kill would cause 5xx responses to the client
(except perhaps for a few requests that were being handled by the killed
server when it was killed.) The httpd side should try to route future
requests to the failed node and, when it can't connect, fail them over.
"Very long timeouts" on subsequent requests sounds like something that
should be solvable via configuration as well. (Paul, JFC please comment.)
The backend node is dead; there's no reason it should be taking the
httpd side a "very long" time to detect that when a new request comes in.
(I'm not saying MODCLUSTER-66 isn't important; just that the symptoms
you are describing seem more severe than is warranted. If people aren't
using HAModClusterService, the MODCLUSTER-66 fix won't help them, but
they still shouldn't be seeing what you describe.)
Right. The errors I've been seeing are probably caused by a long timeout
value for ping/pong.
When the socket connection to worker W is closed, doesn't
httpd/mod-cluster automatically remove W from its worker list? Can this
be configured?
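
To make the behaviour being asked about concrete, here is a minimal Python sketch of a proxy that uses a short connect timeout, drops a worker from its list when the connection fails, and fails over to the next one. This is only an illustration and not how httpd/mod-cluster is implemented; the function name `pick_worker` and the `connect_timeout` parameter are hypothetical.

```python
import socket


def pick_worker(workers, connect_timeout=1.0):
    """Return (worker, sock) for the first reachable worker.

    `workers` is a mutable list of (host, port) tuples. Any worker that
    refuses the connection or does not answer within `connect_timeout`
    seconds is removed from the list in place, and the next one is tried.
    Returns (None, None) if no worker is reachable.
    """
    for worker in list(workers):  # iterate over a copy so we can prune safely
        try:
            sock = socket.create_connection(worker, timeout=connect_timeout)
            return worker, sock
        except OSError:
            # Covers connection refused as well as connect timeouts:
            # treat the worker as dead and fail over to the next one.
            workers.remove(worker)
    return None, None
```

With a short connect timeout, a dead node costs a new request at most about `connect_timeout` seconds before it is routed elsewhere, instead of a "very long" wait.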
--
Bela Ban
Lead JGroups / Clustering Team
JBoss