cos

You have two web servers, and one load balancer. Every connection comes into the load balancer, which then decides which of the two web servers to send that connection to; the web server handles it, and the connection is closed. Connections are coming in at a rate of a few hundred a minute, and each of them takes a few seconds to complete, so each web server typically has a few connections open at any given time. The load balancer knows how many connections each server has open.

If, for each new connection, the load balancer always picked the server with the fewest current connections (or picked at random if both have the same number), then load would be very evenly balanced - each web server would have the same number of open connections, or one would have 1 more than the other.
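As a rough illustration of that selection rule, here is a minimal sketch. The function name `pick_least_connections` and the list-of-counts representation are my own assumptions for the example, not anything a particular load balancer exposes.

```python
import random


def pick_least_connections(open_connections, rng=random):
    """Return the index of the server with the fewest open connections.

    `open_connections` is a list of current connection counts, one per
    server. Ties are broken at random, as described in the text.
    """
    fewest = min(open_connections)
    candidates = [i for i, n in enumerate(open_connections) if n == fewest]
    return rng.choice(candidates)


# Server 0 has 3 open connections, server 1 has 2, so server 1 is chosen.
print(pick_least_connections([3, 2]))  # prints 1
```

When both counts are equal, either index may come back, so repeated calls split new connections roughly evenly between the two servers.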

However, it may be desirable to avoid sending the same user's connections to different web servers on the same visit to the site. Each user typically makes many connections, seconds or minutes apart. So we change the load balancer algorithm a little bit:

A "place" is a /28 IP range, but if you're not an Internet geek you can get away with just assuming that a "place" is a physical location - a house, an office, a wireless cafe. Multiple people may be browsing from the same place, but the load balancer can't tell the difference.
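For the Internet geeks: a /28 covers 16 consecutive IPv4 addresses. A small sketch of how an address could be mapped to its "place", using Python's standard `ipaddress` module (the helper name `place_of` and the sample addresses are just for illustration):

```python
import ipaddress


def place_of(ip):
    """Map an IPv4 address to its enclosing /28 network (16 addresses)."""
    # strict=False lets us pass a host address and have the host bits masked off.
    return ipaddress.ip_network(f"{ip}/28", strict=False)


print(place_of("203.0.113.77"))                              # 203.0.113.64/28
print(place_of("203.0.113.78") == place_of("203.0.113.77"))  # True
print(place_of("203.0.113.80") == place_of("203.0.113.77"))  # False
```

Two addresses in the same block of 16 look like the same place to the load balancer, while the next address past the block boundary counts as a different one.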

At first blush, it seems like if you forget any place that hasn't connected in the past 20 minutes, and you don't have a significant percentage of connections coming from the same place (or the same few places), this should still distribute load fairly evenly. However, I recently observed a pattern like this:
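To make the modified algorithm concrete, here is a minimal sketch of how I read it. The details are assumptions on my part: a place that connected within the last 20 minutes goes back to the server it used last time, the 20-minute clock restarts on every connection, and a new or forgotten place goes to the server with the fewest open connections (random on ties). The class name `StickyBalancer` and its methods are hypothetical.

```python
import random

FORGET_AFTER = 20 * 60  # seconds; the 20-minute window from the text


class StickyBalancer:
    """Sketch of a 'sticky by place' balancer (assumed behavior, see above)."""

    def __init__(self, n_servers=2, rng=random):
        self.open_connections = [0] * n_servers
        self.assignments = {}  # place -> (server index, time of last connection)
        self.rng = rng

    def connect(self, place, now):
        """Route a new connection from `place` arriving at time `now` (seconds)."""
        remembered = self.assignments.get(place)
        if remembered is not None and now - remembered[1] <= FORGET_AFTER:
            # Place seen recently: stick with the same server.
            server = remembered[0]
        else:
            # New or forgotten place: fall back to fewest open connections.
            fewest = min(self.open_connections)
            server = self.rng.choice(
                [i for i, n in enumerate(self.open_connections) if n == fewest]
            )
        self.assignments[place] = (server, now)
        self.open_connections[server] += 1
        return server

    def close(self, server):
        """Record that one connection on `server` has finished."""
        self.open_connections[server] -= 1


lb = StickyBalancer()
first = lb.connect("place-A", now=0)
lb.close(first)
again = lb.connect("place-A", now=600)  # 10 minutes later, still remembered
print(again == first)  # True
```

Note that in this sketch only brand-new (or forgotten) places are steered by the open-connection counts; every returning place bypasses the balancing step entirely.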

A much larger number of people than usual visited the site during a half hour period.

Web server #1 saw a sudden spike from about 2.5 connections per second to about 6-7 connections per second, in less than a minute. The high rate continued for about 20 minutes, then sharply dropped back to the normal rate of about 2.5 connections per second.

Web server #2 saw a gradual climb, over the course of about five minutes, from 2.5 conn/s to about 5 conn/s. After 5 more minutes it peaked at around 5.5 conn/s, then gradually came back down to about 2.5 conn/s.

Over the course of the highest-traffic 20 minutes, Web server #1 received a total of 35% more connections than Web server #2.

Under what circumstances would the load balancing algorithm I describe behave like this?

Assumptions (aka observed facts):
- Connections were coming in from a wide range of places, with no one place accounting for 1% or more

Variables (things which define the "circumstances" under which the algorithm behaves differently):
- Time to complete a connection can vary between under 1 second and as many as 30 seconds.
- Time to complete a connection could partially depend on number of current connections
- Distribution of places that make few connections vs. places that make more connections can vary widely. Maybe every place that connects connects 100-400 times; or maybe 50% connect just once or twice each, while the other 50% connect many times each.


From:

tisiphone.livejournal.com

That's easy - Slashdot's front page refreshed :)

(Or someone died and New York Times tossed about a million or so breaking news alerts, or something else which generated a brief burst of interest in a particular site.)

Unrelated question: does the overhead of remembering the previous load outweigh the performance benefits from it?

cos

Nonono, the question "why were a lot more people than usual accessing the site at that time" is not interesting (and in this case, the exact cause was obvious and unsurprising). The interesting question is about the uneven distribution of load between the two servers.

Oh... because load balancers with conditions don't work properly to actually balance loads, basically. Consider if you had two different points of interest - but one was more interesting than the other. Say, a breaking news alert about Anna Nicole's death versus a really awesome, but long, new review for something shiny and technophilic (please excuse my lousy metaphorical setup, I'm really, really exhausted). In this case, Anna Nicole kicking it is likely to generate a spike of general interest, which will result in the Server #1 scenario - a high spike, with traffic remaining high. The second would result in the Server #2 scenario - steady, but less high, traffic (evened out by the fact that there are words of more than one syllable on the page, leading to lower numbers of page refreshes because things actually have to be read). If traffic is referred to one server if it has handled a request for the same site within the last 20 minutes, it's obvious that two separate events, one not as widespread as the other, would create an uneven load balance.

That's still not the question I'm posing, which is a *math* question not a sysadmin question. However, I do want to try to understand what you're saying - I'm not sure that I do (though I understand enough to know it's addressing a different question :).

You seem to be saying that two different kinds of traffic would somehow be sent to different servers, and I don't understand why you suggest that. The load balancer doesn't care what people are looking at. If there's an event with sudden spikey interest, it would still get balanced between the two servers, using the same algorithm. Some people would read about Anna Nicole from server #1, some would read about it from #2 (both serve identical content, of course - that's the whole point of a load balancer). So I don't understand how your last statement follows from the existence of different kinds of events that you describe.

OK, I think I was misreading your question then - it appeared you were balancing from the server side, not the client side. Thanks for the further explanation.

or to reword what I'm asking:

In this case, Anna Nicole kicking it is likely to generate a spike of general interest, which will result in the Server #1 scenario - a high spike, with traffic remaining high.

True, but that high spike, with traffic remaining high, would be distributed among both servers, because it's traffic from a large number of new locations, most of which have not visited in the past 20 minutes. Theoretically, you could then get different behavior from the set that got sent to #1 vs. the set that got sent to #2. Probabilistically, though, that'd be weird - there's no reason I can think of for such a bias to be likely; usually, people sent to #1 and #2 should show the same average behavior afterward.

Practical Puzzle for Math Geeks
