[
https://issues.jboss.org/browse/MODCLUSTER-427?page=com.atlassian.jira.pl...
]
Aaron Ogburn commented on MODCLUSTER-427:
-----------------------------------------
Ah, thanks, JF, you are right. I hadn't looked into proxy_cluster_child_init() yet.
Checking it now, I see that it creates the proxy_cluster_watchdog thread but does
nothing itself to fill the balancer. The proxy_cluster_watchdog thread it creates
only fills the balancer after its first two-second sleep.
Further testing matched this. If I waited 2+ seconds before sending a subsequent request,
sticky sessions worked, since proxy_cluster_watchdog had executed and filled the balancer
in the new process. If the next request came in less than 2 seconds later,
proxy_cluster_watchdog had not executed yet, so the balancer was not yet filled in the
new process and sticky sessions were not maintained, as mentioned in my original
description.
I modified my PRs to address proxy_cluster_child_init(), since it's the origin of the
issue. I essentially added the update_workers_node calls there so that the balancer is
filled during init rather than lazily 2 seconds later, when the created
proxy_cluster_watchdog first executes. This fixes my reproduction as well. What do you think?
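The difference between filling the balancer lazily and filling it during child init can be sketched roughly as below. This is a simplified model, not the actual patch: the model_* types and functions are hypothetical stand-ins, model_update_workers only imitates the role update_workers_node plays, and the watchdog thread is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_WORKERS 8

/* Hypothetical stand-in for a per-process balancer; this is NOT the
 * httpd proxy_balancer type. */
struct model_balancer {
    const char *routes[MODEL_MAX_WORKERS]; /* one route name per worker */
    size_t nelts;                          /* workers currently known */
};

/* Hypothetical stand-in for the shared node table that workers are
 * built from. */
struct model_node_table {
    const char *routes[MODEL_MAX_WORKERS];
    size_t count;
};

/* Imitates the role of update_workers_node: copy the known nodes into
 * the per-process balancer. */
static void model_update_workers(struct model_balancer *b,
                                 const struct model_node_table *t)
{
    size_t n = t->count < MODEL_MAX_WORKERS ? t->count : MODEL_MAX_WORKERS;
    for (size_t i = 0; i < n; i++)
        b->routes[i] = t->routes[i];
    b->nelts = n;
}

/* Child init. With eager_fill == false the balancer stays empty until
 * something else (in the report, the watchdog's first pass) fills it.
 * With eager_fill == true it is usable as soon as init returns. */
static void model_child_init(struct model_balancer *b,
                             const struct model_node_table *t,
                             bool eager_fill)
{
    b->nelts = 0;
    /* The real init also creates the watchdog thread; omitted here. */
    if (eager_fill)
        model_update_workers(b, t);
}
```

In this model, a child initialized with eager_fill set to false reports nelts == 0 to any request that arrives before the next fill, while one initialized with eager_fill set to true already has every known route available.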
mod_cluster can break stickiness for the first request on new child processes
-----------------------------------------------------------------------------
Key: MODCLUSTER-427
URL: https://issues.jboss.org/browse/MODCLUSTER-427
Project: mod_cluster
Issue Type: Bug
Security Level: Public (Everyone can see)
Components: Native (httpd modules)
Affects Versions: 1.2.9.Final, 1.3.1.Alpha1
Environment: JBoss EAP 6.3.0
Reporter: Aaron Ogburn
Assignee: Jean-Frederic Clere
Fix For: 1.2.10.Final, 1.3.1.Final
mod_cluster can break stickiness for the first request on new child processes. This
appears to occur specifically when "CreateBalancers 1" is used. The prefork MPM
typically makes it much worse as well.
My debugging showed that the proxy_balancer existed but was essentially empty in the
new child process. find_session_route/find_route_worker were called as expected, but
the for loop in find_route_worker never iterated because balancer->workers->nelts
was 0. The balancer was only populated in the new child once the first request
reached internal_find_best_byrequests, which calls update_workers_node.
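To illustrate why an empty balancer silently loses the route, here is a minimal sketch of a route lookup over a worker array. It is not the httpd source: model_worker, model_workers and model_find_route_worker are hypothetical names, and the loop only mirrors the general shape of the lookup described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the httpd proxy_worker or
 * apr_array_header_t types. */
struct model_worker {
    const char *route;          /* route name carried by the session id */
};

struct model_workers {
    struct model_worker *elts;  /* worker array */
    int nelts;                  /* number of valid entries in elts */
};

/* Return the worker whose route matches, or NULL if none does.
 * When nelts is 0 the loop body never runs, so every sticky route
 * falls through to NULL and the caller has to pick some other worker. */
static const struct model_worker *
model_find_route_worker(const struct model_workers *workers,
                        const char *route)
{
    for (int i = 0; i < workers->nelts; i++) {
        if (strcmp(workers->elts[i].route, route) == 0)
            return &workers->elts[i];
    }
    return NULL;
}
```

With nelts at 0, a lookup for a route that exists in the cluster is indistinguishable from a lookup for an unknown route: both return NULL without any error being raised.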
--
This message was sent by Atlassian JIRA
(v6.3.1#6329)