Microsoft: 1999 to 2008

July 14, 2008

I have written one short post and two longer posts about Microsoft.com’s architecture for its online services. You can read each of these essays by clicking on the titles of the stories:

I want to urge each of my two or three Web log readers to validate my assertions. Not only am I an addled goose, I am an old goose. I make errors as young wizards delight in reminding me. On Friday, July 11, 2008, two of my engineers filled some gaps in my knowledge about X++, one of Microsoft’s less well-known programming languages.

The perils of complexity

The diagram shows how complexity increases when systems are designed to support solutions that do not simplify the design. Source: http://www.epmbook.com/complexity.gif

Stepping Back

As I reflected upon the information I reviewed pertaining to Microsoft.com’s online architecture, several thoughts bubbled to the surface of my consciousness:

First, I believe Microsoft’s new data centers and online architecture share DNA with those 1999 data centers. Microsoft is not embracing the systems and methods in use at Amazon, Google, and even the hapless Yahoo. Microsoft is eating its own “dog food”. While commendable, the bottlenecks have not been fully resolved. Microsoft uses scale up and scale out to make its systems keep pace with user expectations of response time. One engineer who works at a company competing with Microsoft told me: “Run a query on Live.com. The response times in many cases are faster than ours. The reason is that Microsoft caches everything. It works, but it is expensive.”
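The engineer’s “caches everything” remark describes a simple trade: spend memory to avoid recomputing answers. A minimal sketch of the idea in Python, with a hypothetical `run_query_uncached` standing in for a full index lookup (the function names are mine, not Microsoft’s):

```python
from functools import lru_cache

def run_query_uncached(term: str) -> str:
    # Hypothetical stand-in for an expensive backend search.
    return f"results for {term}"

# Caching trades memory for latency: repeat queries skip the
# backend entirely, which is fast but costly at scale.
@lru_cache(maxsize=None)
def run_query(term: str) -> str:
    return run_query_uncached(term)

print(run_query("live search"))  # first call hits the backend
print(run_query("live search"))  # second call is served from cache
```

The expense the engineer flags is the `maxsize=None`: caching “everything” means the memory bill grows with the query stream.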

Second, Microsoft lacks a cohesive code base, and it lacks a new one. With each upgrade, legacy code and baked-in features and functions are dragged along. A good example is SQL Server. Although rewritten since the good old days with Sybase, SQL Server is not the right tool for peta-scale data manipulation chores. Alternatives exist, and Amazon and Yahoo are using them. Microsoft is sticking with its RDBMS engine, which is very expensive to replicate, cluster, back up with standby hardware, and keep in sync. The performance challenge remains even though the user experience seems as good as, if not better than, the competition’s. In my opinion, the reliance on this particular “dog food” is akin to building a wooden power boat with unseasoned wood.
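The alternatives I have in mind (Amazon’s Dynamo-style stores are the published example) shard keys across many commodity nodes instead of replicating one big database engine. A minimal sketch of hash-based sharding, with hypothetical node names of my own invention:

```python
import hashlib

# Hypothetical commodity machines -- the point is many cheap boxes,
# not one expensive, replicated RDBMS cluster.
NODES = ["node-a", "node-b", "node-c"]

def node_for_key(key: str) -> str:
    # Hash the key and map it to a node deterministically; every
    # client computes the same placement without coordination.
    digest = hashlib.md5(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

print(node_for_key("user:1001"))
```

This is a sketch, not Dynamo: real systems use consistent hashing so that adding a node moves only a fraction of the keys, but the placement idea is the same.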

Third, in each of the essays, Microsoft’s own engineers emphasize the cost of the engineering approaches. There is no emphasis on slashing costs. The emphasis is on spending money to get the job done. In my opinion, spending money to solve problems via the scale up and scale out approach is okay as long as there are barrels of cash to throw at the problem. The better approach, in my opinion, is to engineer solutions that make scaling and performance as economical as possible and to direct investment at leapfrogging the well-known, long-standing problems: the Codd database model, inefficient and latency-inducing message passing, and dedicated hardware for specific functions and applications that is then replicated across clusters. And, finally, there is the extra hardware that sits, in effect, like an idle railroad car until needed. What happens when the money for these expensive approaches becomes less available?

Options

One of the thought exercises my team and I performed last week was to brainstorm a list of ways Microsoft could get off this expensive and complex approach to its online architecture. Let me offer a selected list of the ideas from our brainstorming session. Please don’t write to tell me some of these ideas are stupid, technically impossible, or already proven unworkable by folks smarter than my group of four. If you have a better idea or an alternative one, please use the comments section of this Web log to offer it up for discussion. I am trying to learn from this exercise, not write a dissertation for Harvard’s MBAs to take as Delphic prophecies.

  1. Buy or partner with Amazon. Dr. Werner Vogels has demonstrated that he can put lipstick on a pig. Amazon on his watch has become a player in Web services. Mr. Bezos, the world’s smartest man, a person told me, is near Redmond and inclined toward commercial success. Micro-zon is what my engineer called this tie up.
  2. Buy Yandex. The Russian search engine is better than Google’s Russian service. The Yandex engineers are as smart as Google’s engineers. Instead of trying to scale Powerset or sort out the issues with Fast Search & Transfer’s approach to Web-scale systems, go East. The engineer who came up with this idea called the merged operations Yan Soft. I thought this was an interesting idea.
  3. Clone Google. The GOOG has generated more technical information than Wall Street mavens and search system pundits realize. Microsoft can sit down, read 400 technical papers, read 340 patent applications, hire some former DEC engineers to get the bare metal information, and duplicate Google. This means BigTable, MapReduce, Google File System, Sawzall, dataspaces, programmable search engine, containers, janitors, and I’m feeling doubly lucky. Let Google sue. My reading of Google’s patent applications suggested one thing to me: heavy-duty borrowing from the research literature. This means there’s some slop in the mooring lines.
  4. Focus and fix. Instead of trying to make a pig fly, solve the problems with messaging, intra-process messaging, database and data management, and latency introduced at each of the many interfaces in applications and servers. In my opinion, this is impossible for Microsoft because of its product manager approach to software. If I still worked at Booz, Allen & Hamilton, I would be dumb enough to say, “I know how to fix this.” I’m not 32 any more, and I don’t have a clue what to do. The DNA from 1999 is probably still pumping through the containers in Microsoft’s new Chicago data center.
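For readers who have not waded through the papers named in option 3, MapReduce is the easiest to show in miniature. A single-machine word count, my own toy illustration of the pattern rather than Google’s distributed implementation:

```python
from collections import defaultdict
from itertools import chain

def map_phase(doc: str):
    # Mapper: emit a (word, 1) pair for every word in the document.
    return [(word, 1) for word in doc.split()]

def reduce_phase(pairs):
    # Reducer: group pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["big table map reduce", "map reduce at scale"]
pairs = chain.from_iterable(map_phase(d) for d in docs)
print(reduce_phase(pairs))
```

The published system’s value is not this logic; it is running the map and reduce phases across thousands of commodity machines with failures handled automatically, which is exactly the engineering Microsoft would have to duplicate.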

In both of my Google studies here, I referenced Microsoft as lagging behind Google technically. What this series of essays forced me to do was revisit these assertions. Well, I was right in 2005 when The Google Legacy came out, and I was right last year when Google Version 2.0 came out. Google and Microsoft correctly identified the problems with peta-scale operation. Microsoft solved the problem using traditional data architectures and relied on improvements in CPU performance, falling storage costs, and speed-ups in top-drawer gizmos from Cisco, Hewlett Packard, and other vendors.

Google, on the other hand, approached the known problems as math problems. Google “cheated” because it asked the AltaVista.com and Digital Equipment Corp. engineers for their insights. Google built on what was learned from the AltaVista.com search service. Then, Messrs. Brin and Page took a clean sheet of paper and provided the solutions: commodity hardware, radical data management techniques (not RDBMS techniques) to jump over RDBMS issues, and a search for ways to solve heat, file read, and scaling problems via research, testing, and data analysis. Instead of calling a meeting, Google let engineers try stuff and let data prove the merits of an innovation. Data, not politics, influenced the early decisions of Google’s engineers and management.

Google used the past. Microsoft was bound to the past. Google engineered without the burden of supporting legacy functions. Microsoft worked to support legacy functions. Google used Occam’s Razor as a guiding principle. Microsoft embraced complexity. Google focused on cost control even though it spends billions on data centers. Microsoft focused on buying the best that money could buy.

The net net for me is obvious: Microsoft has to leapfrog Google, and quickly. To continue on its present path only allows Google to move forward without significant competition. Agree? Disagree? Help me learn. Oh, include facts. Attacking me is okay because I have developed a reasonably thick skin. I do prefer solid information, and I think my two or three readers like information as well.

Stephen Arnold, July 14, 2008
