>> 2. Can httpd detect hung nodes? A hung node will not affect the
>> connected state of the AJP/HTTP/S connector - it could only detect this
>> by sending data to the connector and timing out on the response.
>
> The hung node will be detected and marked as broken but the
> corresponding request(s) may be delayed or lost due to time-out.
>
How long does this take, say in a typical case where the hung node was
up and running with a pool of AJP connections open? Is it the 10 secs,
the default value of the "ping" property listed at
https://www.jboss.org/mod_cluster/java/properties.html#proxy ?
cping/cpong is done in the Connector; I was thinking of nodeTimeout.
Also, if a request is being handled by a hung node and the
HAModClusterService tells httpd to stop that node, the request will
fail, yes? It shouldn't just fail over, as it may have already caused
the transfer of my $1,000,000 to my secret account at UBS. Failing over
would cause transfer of a second $1,000,000 and sadly I don't have that
much.
maxAttempts = 0 controls that.
>>
>> And some questions for open discussion:
>> What does HAModClusterService really buy us over the normal
>> ModClusterService? Do the benefits outweigh the complexity?
>> * Maintains a uniform view of proxy status across each AS node
>> * Can detect and send STOP-APP/REMOVE-APP messages on behalf of
>> hung/crashed nodes (if httpd cannot already do this) (not yet
>> implemented)
>> + Requires special handling of network partitions
>> * Potentially improve scalability by minimizing network traffic for
>> very large clusters.
Assume a near-term goal is to run a 150-node cluster with, say, 10 httpd
servers. Assume the background thread runs every 10 seconds. That comes
to 150 connections per second across the cluster (150 nodes x 10 httpd
servers / 10 seconds) being opened/closed to handle STATUS. Each httpd
server handles 15 connections per second.
That is very little :-)
With HAModClusterService the way it is now, you get the same, because
besides STATUS each node also checks its ability to communicate w/ each
httpd in order to validate its ability to become master. But let's
assume we add some complexity to allow that health check to become much
more infrequent. So ignore those ping checks. So, w/ HAModClusterService
you get 1 connection/sec being opened/closed across the cluster for
status, 0.1 connection/sec per httpd. But the STATUS request sent
across each connection has a much bigger payload.
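
For reference, the arithmetic in the two scenarios above can be reproduced
with a short sketch. The function name and its parameters are made up for
illustration. It assumes each sending node opens one connection per httpd
server per background period, and that with HAModClusterService only the
master sends STATUS (ignoring the ping checks, as stated above).

```python
def status_connection_rates(nodes, proxies, period_s, senders=None):
    """Return (cluster-wide, per-httpd) STATUS connections per second.

    Hypothetical helper: assumes one connection per sender per httpd
    server per period. `senders` defaults to every node in the cluster.
    """
    senders = nodes if senders is None else senders
    total = senders * proxies / period_s
    return total, total / proxies


# Plain ModClusterService: all 150 nodes send STATUS to all 10 httpd servers.
plain = status_connection_rates(nodes=150, proxies=10, period_s=10)

# HAModClusterService: only the master sends STATUS on behalf of the cluster.
ha = status_connection_rates(nodes=150, proxies=10, period_s=10, senders=1)

print(plain)  # (150.0, 15.0)
print(ha)     # (1.0, 0.1)
```

This only counts connections; it says nothing about the larger STATUS
payload in the HA case.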
How significant is the cost of opening/closing all those connections?
They are keep-alive HTTP connections, so it is only a receive/send on
the socket.
Cheers
Jean-Frederic