Jean-Frederic Clere closed MODCLUSTER-527.
------------------------------------------
Resolution: Won't Fix
Load Balancing logic can fail production environments
-----------------------------------------------------
Key: MODCLUSTER-527
URL:
https://issues.redhat.com/browse/MODCLUSTER-527
Project: mod_cluster
Issue Type: Bug
Components: Native (httpd modules)
Environment:
Reporter: Sean Cavanagh
Assignee: Jean-Frederic Clere
Priority: Major
We suffered the same problem as MODCLUSTER-100 in our production environment.
If any balancer pool sees 0 requests during a 2*LBstatusRecalTime window, the next
request will go to the first worker in the pool. If an application consistently sends
fewer than one request per 2*LBstatusRecalTime, then all requests will be sent to the
same node, and _balancing will be completely broken_.
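To make the failure mode concrete, here is a minimal standalone sketch of the selection
pass (assumptions on my part: simplified field names, and a strict less-than comparison
so ties go to the earliest worker; this is not the actual mod_proxy_cluster source):
{code:c}
#include <stdio.h>

/* Hypothetical, simplified worker record. */
struct worker {
    const char *name;
    int lbstatus;   /* recalculated every LBstatusRecalTime   */
    int elected;    /* elections so far                       */
    int oldelected; /* elections as of the last recalculation */
    int lbfactor;   /* configured load factor, 1..100         */
};

/* Lowest score wins; the strict comparison means a tie is
 * resolved in favour of whichever worker comes first. */
static struct worker *find_best(struct worker *pool, int n)
{
    struct worker *best = NULL;
    int best_status = 0;
    for (int i = 0; i < n; i++) {
        int status = pool[i].lbstatus
            + ((pool[i].elected - pool[i].oldelected) * 1000) / pool[i].lbfactor;
        if (best == NULL || status < best_status) {
            best = &pool[i];
            best_status = status;
        }
    }
    return best;
}

int main(void)
{
    /* After an idle 2*LBstatusRecalTime window every input to the
     * score has been recalculated to 0, so all three workers score
     * 0 and worker "a" wins every election. */
    struct worker pool[] = {
        { "a", 0, 0, 0, 50 },
        { "b", 0, 0, 0, 50 },
        { "c", 0, 0, 0, 50 },
    };
    printf("winner: %s\n", find_best(pool, 3)->name);
    return 0;
}
{code}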
Less worrisome, but still not ideal: if one worker is deliberately weighted below the
others (e.g. its load factor has been manually set to '1' in order to offload it, a
somewhat common practice), that worker will still get a session once every
2*LBstatusRecalTime.
I know it seems odd to hear about an application that can load a server while sending
fewer than 1 request every 10 seconds; however, if one is using sticky sessions, one is
not balancing _requests_, one is balancing _logins_. The usage pattern for our enterprise
application sees our users log in relatively slowly, and once a user has their JSESSIONID,
those HTTP requests never see the mod_proxy_cluster balancing algorithm again.
It's nice that there is now a config parameter which will allow us to balance
properly again, but I would offer that since the behaviour of external users can
completely break the algorithm, this bug requires a more sophisticated fix.
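For anyone else hitting this in the meantime: the parameter in question is, as far as I
can tell, LBstatusRecalTime, and the workaround is to stretch the recalculation window
past your slowest request interval. A hedged illustration, with an arbitrary value:
{code}
# LBstatusRecalTime is in seconds (the default is 5); 60 here is
# just an example value chosen to outlast a slow trickle of logins.
LBstatusRecalTime 60
{code}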
If authoritative parties agree it's a good idea, I'm happy to write a patch that
does the following:
1) Improves the documentation, more clearly explaining how the algorithm functions and
alerting users to this parameter
2) Cleans up a couple of straight-up documentation errors
3) Modifies the internal_find_best_byrequests formula from:
{code:c}
status = lbstatus + ((elected - oldelected) * 1000) / lbfactor;
{code}
to:
{code:c}
status = lbstatus + ((elected - oldelected + 1) * 1000) / lbfactor;
{code}
It seems to me that the current formula assumes that each node wins at least some
elections during the LBstatusRecalTime window; by adding 1 we make the formula behave as
if that assumption always held, so no worker ever gets a score of 0.
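As a quick illustration (a standalone sketch with made-up scenario values, not the module
source), here is how the two formulas score an idle pool whose load factors differ:
{code:c}
#include <stdio.h>

int main(void)
{
    /* Three workers after an idle window: no elections since the
     * last recalculation and lbstatus already recalculated to 0.
     * Worker 2 has been offloaded with lbfactor 1. */
    int lbfactor[] = { 100, 50, 1 };
    int lbstatus = 0, elected = 0, oldelected = 0;

    for (int i = 0; i < 3; i++) {
        int s_old = lbstatus + ((elected - oldelected) * 1000) / lbfactor[i];
        int s_new = lbstatus + ((elected - oldelected + 1) * 1000) / lbfactor[i];
        printf("worker %d: old=%d new=%d\n", i, s_old, s_new);
    }
    /* Old formula: every worker scores 0, so the first in the pool
     * wins by tie-break. New formula: scores 10, 20, 1000 -- the
     * lowest still wins, but now the configured weights decide, and
     * the offloaded lbfactor=1 worker no longer ties with the rest. */
    return 0;
}
{code}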
I did a bit of algebra and I'm pretty confident that this change won't influence
the outcome of any elections. That could only happen if:
{noformat}
lfc - lfw > lfc*lsw - lfw*lsc
{noformat}
where lfc is the client load factor, lsc is the client's lbstatus, and similarly for lfw
and lsw.
Since the election numbers are multiplied by 1000, and the max difference between two
load factors is 99, I conclude it is impossible for the addition of a single election to
both workers to change the outcome of the election.
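If anyone would rather not take my algebra on faith, the claim is cheap to probe by brute
force. Here is a sketch that enumerates a grid of states (assumptions on my part: both
workers enter the election with the same incremental election count, load factors run
1..100, and the lowest score wins) and counts any state where granting the extra election
to both workers changes the winner:
{code:c}
#include <stdio.h>

/* Score with an optional +1 bonus election; lowest score wins. */
static int score(int ls, int e, int lf, int bonus)
{
    return ls + ((e + bonus) * 1000) / lf;
}

int main(void)
{
    long flips = 0;
    /* Probe a grid of states and report any where adding one
     * election to BOTH workers changes which of the two wins. */
    for (int lfc = 1; lfc <= 100; lfc++)
        for (int lfw = 1; lfw <= 100; lfw++)
            for (int lsc = 0; lsc <= 2000; lsc += 100)
                for (int lsw = 0; lsw <= 2000; lsw += 100) {
                    int before = score(lsc, 0, lfc, 0) < score(lsw, 0, lfw, 0);
                    int after  = score(lsc, 0, lfc, 1) < score(lsw, 0, lfw, 1);
                    if (before != after)
                        flips++;
                }
    printf("states where the winner flips: %ld\n", flips);
    return 0;
}
{code}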
So yeah, let me know.