The markets did great this week, with the Dow hitting record highs and closing in on 13,000; however, almost no one here at StockCharts.com was paying much attention. As most ChartWatchers know, we spent much of the week wrestling with technical glitches. I thought I'd take some time to explain what we've learned about the problems and the steps we are taking to prevent them from happening again. If you are not interested in computers and networks, now might be a good time to skip down to the other articles.
About a year ago we started upgrading all of the equipment here at StockCharts.com from the slower 100 Megabit networking speed to the newer 1 Gigabit speed. (Most home networking equipment works at 100 Megabits although - like us - you can upgrade your stuff to 1 Gigabit relatively inexpensively these days.) Upgrading our network to the faster speed has many benefits for all of our users: our servers send stock price data around faster, the charts we create get sent out faster, we can back up our server data faster, etc. In order to upgrade a network, you have to replace (or upgrade) both the computers and the switches on the network. (A switch is a device that connects all of the wires from all the different computers. Most home networks have a switch built into the router/firewall device that the broadband modem plugs into.)
Now, there is a hidden problem with upgrading the speed of any network - a problem that most of the network equipment people don't tell you about. With few exceptions, there is always a point where your high speed network meets a slower speed device. In our case, our three connections to the Internet work at 45 Megabits and so, at some point, all of our outbound traffic has to slow dramatically in order to get out one of those wires.
The situation is analogous to a sink with a slow drain and a big faucet. The slow drain represents the slow connections to the Internet, the big faucet represents the fast connections to our charting servers, and the water represents all of the bits that make up our charts. The overall goal of the network is to keep the sink from overflowing.
If the water is able to go down the drain as fast as it is coming out of the faucet, everything is fine. The sink remains almost completely empty. Even if there are occasional high-speed bursts of water from the faucet, things are probably fine also. The extra water just stays in the sink until the drain has a chance to "catch up." The sink "buffers" the extra water for the drain.
Problems happen when the amount of water coming out of the faucet exceeds the amount of water going down the drain for a "long time" and the sink becomes completely full. At that point, any additional water that comes out of the faucet will get spilled (i.e., lost).
Coming back to the world of networking, this process of "buffering" (i.e., the sink) happens inside whichever device is connecting the high-speed network to the slower speed network - typically the switch (or the router/modem in most homes).
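To make the sink picture concrete, here is a minimal Python sketch of a fixed-size buffer sitting in front of a slower link. The function name, the tick-based timing, and all of the numbers are made up for illustration; real switches manage their buffers in more complicated ways.

```python
def simulate_buffer(arrivals, drain_per_tick, capacity):
    """Feed per-tick arrival amounts into a finite buffer (the "sink").

    arrivals       -- units of data arriving on each tick (the "faucet")
    drain_per_tick -- units the slow link can send per tick (the "drain")
    capacity       -- the most the buffer can hold before it overflows

    Returns (peak_level, total_dropped).
    """
    level = 0
    peak = 0
    dropped = 0
    for incoming in arrivals:
        level += incoming
        if level > capacity:
            # The sink is full: anything above the rim is spilled (lost).
            dropped += level - capacity
            level = capacity
        peak = max(peak, level)
        # The drain removes what it can before the next tick.
        level = max(0, level - drain_per_tick)
    return peak, dropped


# Hypothetical numbers: the drain handles 10 units per tick, the sink holds 50.
steady = simulate_buffer([10] * 10, drain_per_tick=10, capacity=50)
one_burst = simulate_buffer([40, 0, 0, 0], drain_per_tick=10, capacity=50)
sustained = simulate_buffer([20] * 10, drain_per_tick=10, capacity=50)
print(steady, one_burst, sustained)
```

With these made-up numbers, the steady stream and the single burst both finish with nothing dropped, while the sustained overload fills the buffer after a few ticks and then loses 10 units on every tick that follows.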
Now, when we started to upgrade our network to gigabit speed, the first thing we did was go out and buy some very nice, high-speed switches from a very well-known network equipment manufacturer. Where a consumer-level gigabit switch might cost $50 these days, the ones we got cost several thousand dollars (which is typical for enterprise networking). In return for that money, we supposedly got three things - long-lasting hardware, big sinks, and software that would tell us if the sinks ever overflowed. (See where I'm going with this?)
Ultimately, most of last week's problems were caused by a buffer overflowing inside one of those new switches. That is no surprise to any of us - it was one of the first things we looked for. The bigger problem was that everyone was confused by four facts:
- The switch didn't give us any indication that it was having problems.
- Everything had been working great up until last Wednesday.
- Even at our busiest times, we were only sending out about 70 megabits of data - much less than the 100 megabits that our "drain" allows.
- When data went out through our slower ISP, everything worked fine.
Ironically, the answer to the mystery lay in the article that I wrote in the last newsletter - the one where I sort of bragged about how much faster we were able to generate charts these days. By increasing the speed at which we create our charts, we metaphorically increased the speed at which water was bursting into the sink from the faucet. The result was an overwhelmed sink and thus, data loss.
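The effect of burstiness can be sketched with a few lines of Python. The 70 and 100 megabit figures come from the numbers above; the 10-millisecond tick, the 4-megabit buffer, and the two traffic patterns are assumptions chosen only to illustrate the idea, not measurements from our switches.

```python
def dropped_kbits(arrivals, drain_per_tick, capacity):
    """Return how many kilobits overflow a finite buffer drained at a fixed rate."""
    level = 0
    dropped = 0
    for incoming in arrivals:
        level += incoming
        if level > capacity:
            dropped += level - capacity  # overflow is lost
            level = capacity
        level = max(0, level - drain_per_tick)
    return dropped


TICKS = 100        # one second, split into 10 ms ticks (assumed)
DRAIN = 1_000      # a 100 megabit/s link sends 1,000 kilobits per tick
CAPACITY = 4_000   # hypothetical 4-megabit buffer

# Slower chart generation: 70 megabits spread evenly across the second.
smooth = [700] * TICKS
# Faster chart generation: the same 70 megabits, but arriving as ten
# 7,000-kilobit bursts with idle ticks in between.
bursty = [7_000 if t % 10 == 0 else 0 for t in range(TICKS)]

print(sum(smooth), sum(bursty))                 # same total traffic
print(dropped_kbits(smooth, DRAIN, CAPACITY))   # loss with smooth traffic
print(dropped_kbits(bursty, DRAIN, CAPACITY))   # loss with bursty traffic
```

Both patterns average 70 megabits per second, well under what the link allows, yet in this toy model only the bursty one overflows the buffer. That is why an average-traffic number alone can look perfectly healthy while data is still being lost.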
The immediate solution was to slow our network back down to 100 megabits. That smoothed out the flow of data and stopped the data loss at the switch. Obviously, that is not the right long-term solution, because we would lose all of the other advantages of gigabit networking. The long-term solution is to upgrade our switches to ones with HUGE sinks (i.e., memory buffers), which we will be doing this weekend. Once that work is complete, you can expect our site to be faster than ever.
In case you missed the announcements on the website, we have credited ALL subscribers with an additional two free weeks of service to make up for last week's problems. Thanks for continuing to support StockCharts.com.