in 2006

July 13, 2008

In late 2006, I had to prepare a report assessing a recommendation made to a large services firm by Microsoft Consulting. One of the questions I had to try to answer was, “How does Microsoft set up its online system?” I had the Jim Gray diagram which I referenced in my earlier Web log essay “in 1999.” To be forthright, I had not paid much attention to Microsoft because I was immersed in my Google research.

I poked around on various search systems, MSDN, and eventually found a diagram that purported to explain the layout of Microsoft’s online system. The information appeared in a PowerPoint presentation by Sunjeev Pandey, Senior Director, Operations, and Paul Wright, Technology Architect Manager, Operations. On July 13, 2008, the presentation was available here. The PowerPoint itself does not appear in the index. I cannot guarantee that this link will remain valid. Important documents about Microsoft’s own architecture are disappearing from MSDN and other Microsoft Web sites. I am reluctant to post the entire presentation even though it does not carry a Microsoft copyright.

I want to spell out the caveats. Some new readers of this Web log assume that I am writing news. I am not. The information in this essay is from June 2006, possibly a few months earlier. Furthermore, as I get new information, I reserve the right to change my mind. This means that I am not asserting absolutes. I am capturing my ideas as if I were Samuel Pepys writing in the 17th century. You want real news? Navigate elsewhere.

My notes suggest that Messrs. Pandey and Wright prepared a PowerPoint deck for use in a Web cast about Microsoft’s own infrastructure. These Web casts are available, but my Verizon wireless service times out when I try to view them. You may have better luck.

Here is a diagram from the presentation “Design for Resilience: The Infrastructure of, Microsoft Update, and the Download Center.” The title is important because the focus is narrow compared to the bundle of services explained in Mr. Gray’s Three Talks PowerPoint deck and in Steven Levi and Galen Hunt’s “Challenges to Building Scalable Services.” In a future essay, I will comment on this shift. For now, let’s look at what the architecture may have been in mid-2006.

Mid-2006 architecture

This architecture represents a more robust approach. Between 1995 and 2006, the number of users rose from 30,000 per day to about 17 million per day. In 2001, the baseline operating system was Windows 2000. The shift to Microsoft’s 64-bit operating system took place in 2005, a year in which (if Messrs. Pandey and Wright are correct) the system experienced some interesting challenges. For example, international network service was disrupted in May and September of 2005. More tellingly, Microsoft was subject to denial of service attacks and experienced network failures in April and May of 2005. Presumably, the mid-2006 architecture was designed to address these challenges.

The block diagram makes it clear that Microsoft wanted to deploy an architecture in 2006 that provided excellent availability and better performance via caching. The drawbacks are those that were part of the DNA of the original 1999 design: higher costs due to the scale up and out model, its use of name brand, top quality hardware, and the complexity of the system. You can see four distinct tiers in the architecture.

Information has to move from the Microsoft Corp. network to the back end network tier. Then the information must move from the back end to the content delivery tier. Due to the “islands” approach that now includes distributed data centers, the information must propagate across data centers. Finally, the most accessed data or the highest priority information must be made available to the Akamai and Savvis “edge of network” systems. Microsoft, presumably to get engineering expertise and exercise better control of costs, purchased two adjoining data centers from Savvis in mid-2007 for about $200 million. (Note: for comparison purposes, keep in mind that Microsoft’s San Antonio data center cost about $600 to $650 million.)

Thus, Microsoft is retaining the “islands” approach. You can see these in the back end network tier in the green rectangle in the diagram. In the seven years between the 1999 diagram and this 2006 diagram, Microsoft added a buffer, the content delivery tier, and expanded its relationship with vendors offering content delivery networks.

The system, therefore, makes its 1999 elements a core part of the 2006 architecture. To reduce bottlenecks, Microsoft has embraced caching on a large scale. The result is a four tier set up which illustrates scale up and out on a very large scale.

The architecture requires a patchwork quilt of technologies. In the information I have gathered, Microsoft in 2006 was using spanning trees, clustering, broad peering, hot standby routing, virtual local area networks, DNS global load balancing, open shortest path first routing, and other advanced methods in its four tier set up.
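To make one of these methods concrete, here is a minimal sketch of the idea behind DNS global load balancing: the resolver steers a user to the healthy data center with the lowest latency. The data center names, latencies, and hostname are my own illustrative inventions, not values from the presentation.

```python
# Hypothetical sketch of DNS global load balancing, one of the
# techniques listed above. All names and numbers are illustrative.

DATA_CENTERS = {
    "us-west": {"healthy": True, "latency_ms": 40},
    "us-east": {"healthy": True, "latency_ms": 55},
    "europe": {"healthy": False, "latency_ms": 30},  # failed health check
}

def resolve(hostname):
    """Return the healthy data center with the lowest latency."""
    healthy = {name: dc for name, dc in DATA_CENTERS.items() if dc["healthy"]}
    if not healthy:
        raise RuntimeError("no healthy data centers for " + hostname)
    return min(healthy, key=lambda name: healthy[name]["latency_ms"])

print(resolve("update.example.com"))  # -> us-west
```

The point of the sketch is the design choice: failover and proximity routing happen at name-resolution time, before a request ever touches a data center.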

The drawbacks to the approach are its cost and complexity. By the time this diagram was prepared, Microsoft had not embraced the type of approach that Google was following. In fact, Google’s architecture has had little impact on the approach Microsoft took in the period between 1999 and 2006. What’s interesting is that Jim Gray in his Three Talks makes clear the benefits of the Google approach: lower costs via commodity hardware, designing so that fewer engineers are needed to maintain the system, and finding solutions to known bottlenecks and hot spots without recourse to exotic, expensive hardware.

Google in the period between 1999 and 2006 emerged as the dominant player in Web search, and it began expanding its services to consumers and the enterprise without altering its basic engineering approach ably described by Jim Gray in 1999.

It is difficult for me to grasp these documented facts. Microsoft followed a path that increased its costs for online services over a period of seven years without narrowing the gap with Google. In fact, in this pivotal 84 months, Google moved from start up to disrupter using technical systems and methods known to Microsoft.

I remember watching my grandfather cut plywood using a U-shaped saw with a thin ribbon blade. I asked him, “Grandpa, why don’t you use a power tool?” He said, “I have always cut plywood this way, so there’s no reason to change.”

In my opinion, Microsoft’s approach between 1999 and 2006 was like my grandfather’s. The company knew about options in online architecture. Microsoft’s managers chose to keep the DNA from 1999 intact. The problem for Microsoft is that Google used a laser cutting tool, not a manual U-shaped plywood saw. My grandfather was not competing with anyone. His ways were correct for a craftsperson in 1950. Microsoft, as we now know, faces a different challenge in its data center architecture, and it is not 1950.

What Was Running on This Architecture in 2006

In mid-2006, Microsoft had six Internet data centers and three content delivery partnerships. The company supported more than 120 Web sites, more than 2000 databases, and thousands of applications. In mid-2006, Microsoft delivered about 70 million page views per day. The system supported 10,000 requests per second.
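The two traffic figures above are worth relating. Seventy million page views per day works out to roughly 810 requests per second on average, so a stated capacity of 10,000 requests per second implies headroom of about 12x over the average rate, presumably to absorb peaks. The arithmetic:

```python
# Back-of-envelope check relating the two figures quoted above.
page_views_per_day = 70_000_000
capacity_rps = 10_000

avg_rps = page_views_per_day / 86_400   # seconds in a day
headroom = capacity_rps / avg_rps

print(round(avg_rps))       # ~810 requests/second on average
print(round(headroom, 1))   # ~12.3x capacity over the average rate
```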

If my research is correct, Microsoft’s Web site availability improved from 99.7% in 2003 to 99.87% in 2006. There was an outage in February 2006, but the system was stable for most of 2006. The errors that eroded uptime were mostly due to content errors, connection time outs, and page time outs. In my experience this type of problem can be tough to troubleshoot. Bloated code in Web pages may be a problem. However, one must investigate the SQL Server database as another likely cause of the errors. Hardware and server issues account for only 1.3% of the total errors, which makes the troubleshooting task somewhat more difficult.
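Those availability percentages translate into concrete downtime. At 99.7%, the sites were unavailable roughly 26 hours per year; at 99.87%, roughly 11 hours, so the improvement cut annual downtime by more than half:

```python
# Downtime implied by the availability figures cited above.
HOURS_PER_YEAR = 365 * 24  # 8,760

def downtime_hours(availability):
    return (1 - availability) * HOURS_PER_YEAR

print(round(downtime_hours(0.997), 1))    # 26.3 hours/year in 2003
print(round(downtime_hours(0.9987), 1))   # 11.4 hours/year in 2006
```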

The diagram below shows the basic “cookie cutter” set up for high availability databases. This diagram comes from Messrs. Pandey’s and Wright’s presentation, so the image quality is what it is.

database set up

If you compare this approach with the 1999 diagram, little has changed. A couple of differences are interesting. First, note that log files must be copied from data center to data center. Log files ideally must be concatenated and then processed so that analyses can be both cross data center and specific to particular data centers. To me, this log shipping suggests that Microsoft wants to concatenate log files. Because log files can grow to a significant size, Microsoft’s approach seems a traditional data warehouse approach, which again is more expensive than the distributed approach and just in time querying gaining favor among some companies today.
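The concatenation step is simple in principle. Assuming each data center ships a time-ordered log file (the records and data center names below are my own invention), the shipped files can be merged into one global, timestamp-ordered stream that still tags each record with its origin, so analysis can run cross data center or be filtered back down to one:

```python
# Minimal sketch of cross data center log concatenation. Each record is
# (timestamp, data_center, request); the data below is illustrative only.
import heapq

dc_east = [(1, "east", "GET /update"), (4, "east", "GET /download")]
dc_west = [(2, "west", "GET /update"), (3, "west", "GET /catalog")]

# heapq.merge interleaves the already-sorted per-data-center streams
# into one global stream keyed on timestamp.
merged = list(heapq.merge(dc_east, dc_west, key=lambda rec: rec[0]))
print([rec[1] for rec in merged])  # -> ['east', 'west', 'west', 'east']
```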

Second, note that peer to peer replication is used across data centers. Peer to peer is reliable and it can be tuned to minimize bandwidth impact. However, other options exist. Google, for example, uses a combination of innovations in data management without sacrificing speed or adding computational cost to its method.

Third, note that redundancy is ensured by having a hot standby. Again, this approach delivers reliability in the event of failure, but it is expensive. Google, in contrast, replicates data. When a failure occurs, another copy of the needed file is obtained from a server with the data. Google’s approach works around the cost issues of Microsoft’s approach.
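The cost difference between the two recovery models can be seen in a toy sketch. In the hot standby model, one idle duplicate sits behind each primary, hardware that earns nothing until a failure. In the replication model, every copy of the data lives on an active server, and recovery is simply reading from a surviving copy. Server names and the data layout below are illustrative only:

```python
# Hedged sketch contrasting the two recovery models described above.

# Hot standby: one idle duplicate per primary (hardware sits unused).
standby_for = {"sql01": "sql01-standby"}

# Replication: each file lives on several servers that all serve traffic.
replicas = {"catalog.db": ["node-a", "node-b", "node-c"]}

def recover_standby(failed):
    return standby_for[failed]  # promote the idle duplicate

def recover_replica(filename, failed):
    # Serve the file from any surviving server that holds a copy.
    return [s for s in replicas[filename] if s != failed][0]

print(recover_standby("sql01"))                 # -> sql01-standby
print(recover_replica("catalog.db", "node-a"))  # -> node-b
```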

Finally, note that Microsoft uses its own network load balancing technology. For each cluster, a minimum of three to eight servers were deployed. When a hot spot becomes evident, the solution is to add hardware, thus reducing latency. The solution works, but it can make budgeting difficult. Money must be available to acquire the needed hardware, or additional hardware must be on site and ready to deploy when a hot spot occurs. The advantages of this approach include familiarity and the fact that NLB management is built into Windows servers. The downsides include switch overhead, latency in layer switching, and connection overheads.
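The “add hardware when a hot spot appears” rule can be modeled in a few lines. Clusters start at the three-server floor and grow toward the eight-server figure as load per server crosses a comfort threshold; the per-server capacity number below is my own invented placeholder, not a figure from the presentation:

```python
# Illustrative model of the cluster-sizing rule described above.
MAX_PER_SERVER = 1_000  # req/sec one server handles comfortably (invented)

def servers_needed(cluster_rps, minimum=3, maximum=8):
    needed = -(-cluster_rps // MAX_PER_SERVER)  # ceiling division
    return max(minimum, min(maximum, needed))

print(servers_needed(1_500))  # -> 3 (minimum floor applies)
print(servers_needed(5_200))  # -> 6 (hot spot: add hardware)
```

The budgeting problem the essay mentions is visible here: the trigger is demand, so the spend arrives on the traffic’s schedule, not the finance department’s.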

Other interesting aspects of this architecture include:

  • Microsoft wanted to achieve a lights out operation with a minimal staff assigned to a data center
  • Improving uptime
  • Use of Microsoft’s own products or what Redmonians describe as “eating their own dog food.”

In my opinion, the approach to database services is largely unchanged between the time of Jim Gray’s presentation of the data centers in 1999 and the 2006 date of these diagrams.


Let me wrap up this essay with several observations:

  1. Microsoft’s approach is an expensive one. Expensive in this context means that [a] Microsoft uses brand name, top quality gear and lots of it; [b] Microsoft consumes bandwidth within the architecture to keep data centers and repositories synchronized and makes extra investments in content delivery in order to reduce latency for routine requests; and [c] lots of engineers, administrators, and technical staff are needed. Each “island” has to have at least one person who manages the servers dedicated to a specific function. A SQL Server engineer must babysit SQL Server at each database. That engineer may lack the knowledge to troubleshoot virtual LANs.
  2. The architecture is really complex. Keep in mind that data center architecture is a tough problem for engineers. But Microsoft has decided to extend the 1999 model, retain scale up and scale out, and use hardware to resolve known bottlenecks such as flooding SQL Server with transactions. The solution is not to rethink data management and databases. The solution is to front end each cluster with more hardware, then add hardware to each of the SQL Server clusters AND keep spare servers on standby. This is engineering over a problem, not engineering around a problem.
  3. The message traffic across and within each “island” is significant. These “islands” must communicate with other data centers’ islands. Plus, information has to propagate upwards through the four tiers in the architecture. Performance problems at peak load are, in my opinion, going to be an issue as traffic increases. In this model, the fix is more hardware, more caching, and more data centers. Obviously, adding a data center increases the message flow, so the net benefits of a new data center may be lost as more data centers come on line. At some point the message traffic will exceed the system’s ability to do any work other than move data and exchange messages, log files, and updates.
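A back-of-envelope model shows why the message traffic point bites. If every data center must synchronize with every other, the number of cross data center links grows quadratically: doubling the data center count roughly quadruples the synchronization paths, each carrying log shipping, replication, and update traffic.

```python
# Pairwise synchronization links among n data centers: n * (n - 1) / 2.
def sync_links(n_data_centers):
    return n_data_centers * (n_data_centers - 1) // 2

for n in (2, 4, 6, 8):
    print(n, sync_links(n))  # 2->1, 4->6, 6->15, 8->28 links
```

Going from six data centers (the mid-2006 count) to eight moves the system from 15 to 28 pairwise paths, which is the “net benefits may be lost” effect in numeric form.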

The 2006 architecture warrants further analysis. I invite my two or three Web log readers to agree or disagree with my high level discussion of this subject. Also, if a reader has the missing DNABlueprint document from 1999, please let me know at seaky2000 at yahoo dot com.

Stephen Arnold, July 13, 2008

    Scale up — Add resources to a single node

    Scale out — Add more nodes


