Scaling SharePoint Could Be Easy

September 24, 2009

Back in the wonderful city of Washington, DC, I participated in a news briefing at the National Press Club today (September 23, 2009). The video summary of the presentations will be online next week. During the post-briefing discussion, the topic of scaling SharePoint came up. The person with whom I was speaking sent me a link when she returned to her office. I read “Plan for Software Boundaries (Office SharePoint Server)” and realized that this Microsoft Certified Professional was jumping through hoops created by careless system design. I don’t think the Google enterprise applications are perfect, but Google has eliminated the egregious engineering calisthenics that Microsoft SharePoint delivers as part of the standard software.

I can deal with procedures. What made me uncomfortable right off the bat was this segment in the TechNet document:

    • In most circumstances, to enhance the performance of Office SharePoint Server 2007, we discourage the use of content databases larger than 100 GB. If your design requires a database larger than 100 GB, follow the guidance below:
      • Use a single site collection for the data.
      • Use a differential backup solution, such as SQL Server 2005 or Microsoft System Center Data Protection Manager, rather than the built-in backup and recovery tools.
      • Test the server running SQL Server 2005 and the I/O subsystem before moving to a solution that depends on a 100 GB content database.
    • Whenever possible, we strongly advise that you split content from a site collection that is approaching 100 GB into a new site collection in a separate content database to avoid performance or manageability issues.
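
For context, here is the kind of recurring housekeeping that guidance implies: a periodic check of each content database against the 100 GB ceiling. The snippet below is a minimal sketch, not Microsoft tooling. It assumes the pyodbc library and a SQL Server 2005 instance reachable under the hypothetical name SQLHOST, and it leans on the convention that SharePoint content databases carry the WSS_Content prefix.

    # Minimal sketch: flag SharePoint content databases nearing the 100 GB
    # guidance. SQLHOST is a hypothetical server name.
    import pyodbc

    THRESHOLD_GB = 100  # the ceiling cited in the TechNet document

    conn = pyodbc.connect(
        "DRIVER={SQL Server};SERVER=SQLHOST;DATABASE=master;Trusted_Connection=yes"
    )
    cursor = conn.cursor()

    # sys.master_files reports file sizes in 8 KB pages.
    cursor.execute("""
        SELECT DB_NAME(database_id) AS db_name,
               SUM(CAST(size AS bigint)) * 8 / 1024 / 1024 AS size_gb
        FROM sys.master_files
        WHERE type_desc = 'ROWS'
        GROUP BY database_id
    """)

    for db_name, size_gb in cursor.fetchall():
        if db_name.startswith("WSS_Content") and size_gb >= THRESHOLD_GB * 0.9:
            print(f"{db_name}: {size_gb} GB -- approaching the 100 GB guidance")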

Why did I react strongly to these dot points? Easy. Most of the datasets with which we wrestle are big, orders of magnitude larger than 100 GB. Heck, this cheap netbook I am using to write this essay has a 120 GB solid state drive. My test corpus on my desktop computer weighs in at 500 GB. Creating 100 GB subsets is not hard, but in today’s petascale data environment, these chunks seem to reflect what I would call architectural limitations.
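
To be concrete, the chunking exercise itself is a few lines of scripting. Here is a rough sketch; the corpus path is hypothetical, and a real split would also have to preserve site collections, permissions, and metadata, which this ignores. The point is that the effort is administrative overhead rather than engineering.

    # Rough sketch: walk a corpus and assign files to bins that each stay
    # under the 100 GB ceiling. D:\test_corpus is a hypothetical path.
    import os

    LIMIT_BYTES = 100 * 1024 ** 3  # 100 GB

    def partition_corpus(root):
        bins, bin_sizes = [[]], [0]
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                size = os.path.getsize(path)
                if bin_sizes[-1] + size > LIMIT_BYTES:
                    bins.append([])
                    bin_sizes.append(0)
                bins[-1].append(path)
                bin_sizes[-1] += size
        return bins

    chunks = partition_corpus(r"D:\test_corpus")  # hypothetical 500 GB corpus
    print(f"{len(chunks)} chunks needed to stay under 100 GB each")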

As I worked my way through the write-up, I found numerous references to hard limits. One example was this statement from a table:

Office SharePoint Server 2007 supports 50 million documents per index server. This could be divided up into multiple content indexes based on the number of SSPs associated with an index server.

I like the “could be.” That type of guidance is useful, but my question is, “Why not address the problem instead of giving me the old ‘could be’?” We have found limits in the Google Search Appliance, but the fix is pretty easy and does not require any “could be” engineering. Just license another GSA and the system has been scaled. No caveats.
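
Compare the planning arithmetic the TechNet number forces. The back-of-the-envelope sketch below plugs a hypothetical corpus size into the 50 million document ceiling quoted above; only the ceiling comes from the Microsoft document.

    # Back-of-the-envelope sketch: how many index servers the 50 million
    # document ceiling implies. The corpus size is hypothetical.
    import math

    DOCS_PER_INDEX_SERVER = 50_000_000   # Office SharePoint Server 2007 limit
    corpus_docs = 250_000_000            # hypothetical enterprise corpus

    servers = math.ceil(corpus_docs / DOCS_PER_INDEX_SERVER)
    print(f"{servers} index servers, plus the SSP planning that goes with them")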

I hope that the Fast ESP enterprise search system tackles engineering issues, not interface (what Microsoft calls user experience). In order to provide information access, the system has to be able to process the data the organization needs to index. Asking my team to work around what seem to be low ceilings is extra work for us. The search system needs to make it easy to deliver what the users require. This document makes clear that the burden of making SharePoint search work falls on me and my team. Wrong. I want the system to lighten my load, not increase it with “could be” solutions.

Stephen Arnold, September 24, 2009

