Global Web Site Performance Improvement One source of frustration shared by every administrator is
varied perception. One user may deem performance slow while another finds it
fantastic. To further complicate the situation, both users may be right. As
we’ll see later, your website may be slow and fast. Since few customers contact
your company with thank you notes after a wonderful internet experience, your
inboxes are more likely to fill with complaints rather than compliments. Those
gripes may incorrectly skew the perception of your website’s performance as
they darken your boss's view of your administration skills. A comprehensive picture is necessary to
clarify the situation and provide benchmarks against which we can measure
future improvements. In order to gauge the validity of customer complaints and
judge the extent to which users experience latency, it is necessary to measure
your website’s performance. To ensure an accurate portrait, multiple
benchmarking agents should be positioned in a manner that reflects your
customer base. If you have customers scattered throughout the globe and a
single perl script takes measurements, then your data hardly reflects a random
customer’s experience. If your company has numerous access points you may be
able to leverage its existing infrastructure in order to build a comprehensive
model. For administrators lacking multiple access points, there are other
options. You can rent server accounts and collect your own data from various
access points or you can hire a web monitoring service that provides data
points that reflect your user base. Either way, it’s important to get a comprehensive picture of your site’s
performance. Many administrators are lulled to complacency by either
incomplete or inaccurate measurements. Your website may be very responsive
inside corporate headquarters, but how does it play in Peoria? One of the more
telling views of your data is performance by geographic area. [See Figure 1]
The most striking thing about this view is the variance from city to city. Now
we can see different perceptions for the same website. For a customer in San
Jose, performance is exceptional. For the unlucky Frankfurt resident,
performance is “too slow.” How can two people have such very different
experiences on the same site? Certainly they may have different hardware, but
this data was collected with identical configurations. The answer is distance.
The customer in San Jose sits less than thirty miles from the website while the
one in Frankfurt is half a world away. Despite our best efforts, geographic
discrepancies continue to add latency. Thomas Friedman’s world may be flat, but
its continents are separated by thousands of miles. As a result, “your site is
still slow.” Before we continue, it’s important to note what we mean by
“latency.” For our purposes, it is the lapse between an HTTP request and
content download. It is commonly defined it as the lag between request and
response, but that doesn’t serve us well for a number reasons. For one thing,
humans generally don’t read HTTP response headers; they read web content. A
website customer continues to wait after the headers come over the wire. As far
as they’re concerned, the page didn’t load until it rendered inside their
browser. Our goal is to improve the experience of customers, not Internet
crawlers. The definition was chosen for another reason. It matches our
measurement tool. The mean times on Figure 1 represent the time it took to
download the web page and all it’s elements. Latency is added as a result of performance degradation in
each of three major segments of an HTTP transaction. They are listed in order
from the web server to the customer: · First mile – page generation to Internet
access point. · Middle mile – the Internet backbone. · Last mile – from Internet backbone
through the user’s ISP. To understand how latency is accumulated in each of these
areas, it’s important to examine them in greater detail. Most developers and administrators concentrate their
efforts on the first mile. It is, after all, the area in which they maintain
the greatest control. Any reasonably competent team with a respectable budget
can build a site that can render a page in under a second. Given slightly
deeper pockets, our reasonably competent team should be able to launch that
site inside a quality data center with fast internet access. While there is
always room to improve first mile performance, a time will come where you can
no longer bleed the stone. The return on investment is not worth the cost. The middle mile spans the internet “backbone” which is an
archaic term for a series of very fast networks linked globally from city to
city by a series of high-speed lines. Internet service providers connect to the
“backbone” through a series of TCP/IP routers. Throughout the middle mile,
latency is added by several factors. The main contributing factor is distance.
The further a packet must travel, the longer it takes to complete its round
trip time (RTT). Lengthy round trips usually occur through a series of routers.
The store and forward nature of routers contributes latency. After the entire
packet is read into memory, the device must parse its header then determine
where next to send it. Another problem is TCP/IP itself. Each packet must be
verified. If delivery verification fails, then the packet is resent after a
mandatory timeout. If packet loss is especially high, the middle mile will be
characterized by high latency. The third major area in which we accrue latency is the
final mile, from the ISP’s peering point to the end user’s computer. For a
website whose clientele consists of thousands or millions of unique visitors
per month, it isn’t practical or cost-effective to upgrade each user’s
connection speed. While there are actions we can take to improve final mile
performance, most latency accrued there is beyond our capacity to improve. Performance Degradation Now that we understand where latency is amassed, let’s
revisit the data displayed in Figure 1. The website whose performance was
measured resides in the San Francisco Bay area. The page we monitored was
relatively light; it included just fifteen elements. Given what we know about
performance degradation, it is not shocking that customers in California
experience the best performance while those on other continents experience the
worst. Our unfortunate customers in China were forced to wait six times longer
for the same content. According to Jupiter Research, if a page requires more
than eight seconds to load, customers will leave the site. As our Asian
customers click their way through meatier portions of the site, they will have
little trouble reaching that threshold. Since the business sponsor was
committed to delivering content to a global audience, it was necessary to get
international performance in line with North America. While global performance improvement was deemed a priority,
most latency was added in areas outside of our control. If we were granted the
resources and opportunity to stock the middle mile with unlimited bandwidth and
the most robust routers available, latency would still be a problem. As the
crow flies, the distance between Frankfurt and San Francisco is 5693 miles. The
speed of light is capped at 186,000 miles per second. In perfect conditions, a
signal could follow our crow’s flight pattern in 32 milliseconds. The round
trip is complete in 64 milliseconds. Under such conditions, the multithreaded
agent in Frankfurt should complete its task in 1.15 seconds (101 TCP packets
handled by four threads). In reality, the task took 5-1/4 seconds. Where did we
amass the extra time? As detailed earlier, packets traverse the middle mile
through a series of cables and routers. Each router adds latency. In this
scenario, each TCP packet must traverse 12 routers – 6 per leg. If each router
adds 2 milliseconds latency, then another 2.2 seconds are added to our total.
The final discrepancy is explained with packet loss. TCP is designed to avoid
packet loss. The sender must receive acknowledgement for every packet sent. If
a sender does not receive verification before a set timeout (200 ms in most
cases), then it resends the packet and every one after it. If twenty packets
were sent and packet thirteen is lost, then packets 13 through 20 are
retransmitted. Solutions Products and services that correct Internet latency tend
to address it in one of the three areas we described, the first, middle or
final mile. While there are products that address final mile solutions, most
are beyond our control to implement. We can’t ask customers to upgrade their
Internet connection speeds nor increase the sizes of their local cache. For the
most part, we will focus on first and middle mile solutions. In some cases,
first mile solutions will improve final mile performance. First Mile Solutions The products in this category all have one thing in
common. They are implemented in the datacenter. The idea is to improve overall
performance by greatly improving the first mile. The faster you can get it out
the door and the lighter you make it, the quicker you can deliver it. The competitors in this area provide similar offerings
based on similar technologies. There is really only so much you can do to bleed
a stone. Appliance devices by F5, NetScaler, Red Line Networks and FineGround
provide inline compression and TCP optimization. Some offer HTTP protocol
optimization and caching. According to Gartner Research, you can expect a 2 to 1
performance enhancement provided by inline compression and TCP optimization.
With a product such one from FineGround with its added HTTP optimization, you
can expect a 3 to 6 times performance improvement. On the surface, those
numbers sound impressive but most administrators won’t experience this type of
performance enhancement. Why? If you care enough to read articles such as this
one, then you are probably already doing many of the same things the appliances
are doing. Good web systems administrators are already using mod_gzip
to compress everything that is compressible. They already use mod_expires or
mod_headers to set explicit cache directives on heavily requested elements.
Their content is quick to generate and light over the wire. If you’ve already
implemented many of the features offered by optimization appliances, then your
performance enhancement will fall short of the promises. This is not to say
these solutions are without merit. With compression offline, you could free CPU
cycles and extend the life of your servers. But at the same time, you could add
at least one high-end UNIX server for the price of these devices. Middle Mile Solutions As mentioned above, latency accrues with distance under
perfect conditions. Middle mile solutions that address this matter take one of
two forms. One is to move content closer to customers. Another is to correct
inherent problems associated with TCP/IP and Internet traffic routing. In this
section, we’ll consider several options that allow us to improve performance on
the middle mile. Distance delays can be corrected by simply distributing
content closer to users. If our Frankfurt customers pull content from London
rather than San Francisco, then 10,598 miles are shaved off their round
trip. The theoretical download time is
reduced from 1.15 seconds to seven tenths of a second. Caching reverse proxies can be employed to move content
closer to customers. To implement such a solution, you will need to position
servers regionally. Geographic DNS will allow you to map host names by region.
If a user in Melbourne requests your server in San Francisco, geographic DNS
could send them instead to a caching server on the Pacific Rim. The proxy
serves static content from its cache and dynamic content directly from San
Francisco. The middle mile is dramatically reduced along with its latency.
Unfortunately this solution is not inexpensive. If your company has a global
infrastructure, you could leverage it to position redundant inexpensive caching
servers on all major continents. Geographic DNS servers are not inexpensive.
Indeed, their price tag may kill the project before it gets started.
Fortunately, there is an open source player in this field, PowerDNS. While it lacks it lacks some features offered
by its commercial counterparts, it may be more than adequate for your needs.
You may also consider outsourcing CNAMES to a Geographic DNS provider. Rather than implement a geographic caching system or
distributing load over regional web servers, you could acquire the services of
a specialist. Two of the biggest players in this field are Netli and Akamai.
Both rely on geographic caching to move content closer to end users, but Netli
offers a proprietary protocol that reduces some of the inherent weaknesses of
TCP/IP. Akamai offers its own networking
refinements. It employs routing optimization to improve its network
performance. In Figure 2, we see the same web server from Figure 1 but this time
its content is delivered through Netli’s web accelerator. The curve appears
similar, but now we the average download time from Beijing dropped from over
six seconds to little more than two. Final Mile Solutions For all practical purposes, final mile solutions will be
implemented in the first mile. Customer performance can be improved in the
final mile by compressing data in the first. We can reduce the number of server
requests with explicit caching directives on the server. The lighter our pages,
the faster they’ll move through a customer’s ISP. While you may have
implemented many of these solutions, there may still be room for improvement.
The savvy administrator will recognize the point of diminished return. Conclusion The first step toward global performance improvement is
comprehensive monitoring. As indicated earlier, it is important that monitoring
agents are dispersed so that they mirror your customer base. Each refinement
should be evaluated by its effect on performance data. A comprehensive picture
that reflects your customer base will help you make the business case for
additional hardware or services. A competent bean counter will never cut a
check based on an administrator’s hunches. In the real world, you need data. With adequate monitoring in place, you are ready to make
refinements to the area in which you have the most control. First mile tuning
can go a long way toward improving final mile performance. Make sure your
systems have adequate resources and they are tuned specifically for delivering
web content. O’Reilly’s Web Performance Tuning offers a great place to
start. Compress all the content you can. The smaller it is, the faster it
travels over the wire. You can decrease load and improve performance by setting
explicit caching directives. (See Sys Admin March 2005 • Volume 14 • Number 3) It
requires less time to pull an element from a local cache then it does to pull
it off a server one half a world away. As you hone your first mile configuration, check it against your monitoring data. If performance measurements meet expectations, then you may not need further refinement. Unfortunately, many administrators provide content to a global audience. Unless your content is very lean, chances are you will need to boost performance for customers half a world away.
By Jeffrey Fulmer (published in Sys Admin 2006)
For more than a decade, programmers have been refining the
art of web development while systems administrators have been honing
configurations, adding hardware and fattening network pipes. Upgrades at
peering points and transport backbones have dramatically improved network
performance. Over the last five years, most companies and homeowners have
improved their access speed. Despite all these enhancements, many users still
register the same complaint. “Your site is too slow!”
In most cases, distance is a primary culprit in the conspiracy to slow down the web. Performance could be greatly enhanced if we simply moved customers closer to the web site. In the real world, they rarely agree to such things. It’s better to move content closer to them. Perhaps you could serve European content in Europe and Asian content in Asia. Such a move creates logistical problems; it requires multiple content replications and it necessitates redundant hardware and data center expenses. If your primary site was expensive, how much will it cost to bring up another two? For this reason, it is often better to rely on geographically distributed caching proxies or accelerator services such as Netli or Akamai.
The world is getting flatter, but it doesn’t have to be slow. The tips in this article will help you deliver content quickly to all your customers, not just the ones next to your data center.
