The Goose Quacks: Arnold Endnote at Enterprise Search Summit

October 4, 2008

Editor’s Note: This is a file with a number of screen shots. If you are on a slow connection, skip this document.

Once again I was batting last. I arrived the day before my talk from Europe, and I wasn’t sure what time it was or what day it was. In short, the addled goose was more off kilter than I had been in the Netherlands for my keynote at the Hartmann Utrecht conference and my meetings in Paris squished around the Utrecht gig.

I poked my head into about half of the sessions. I heard about managing search, taxonomies, business intelligence, and product pitches disguised as analyses. I’m going to be 65; I was tired; and I had heard similar talks a few days earlier in Europe. The challenges facing those involved with search are reaching a boiling point.

After dipping into the presentations, including the remarkable Ahead in the Clouds talk by Dr. Werner Vogels, top technical gun at Amazon, and some business process management razzle dazzle, I went back to the drawing board for my talk. I had just reviewed usage data that revealed that Google’s lead in Web search was nosing towards 70 percent of the search traffic. I also had some earlier cuts at the traffic data for the Top 50 Web sites. In the two hours before my talk, I fiddled with these data and produced an interesting graph of the Web usage. I did not use it in my talk, sticking with my big images snagged from Flickr. I don’t put many words on PowerPoint slides. In fact, I use them because conference organizers want a “paper”. I just send them the PowerPoint deck and give my talk using a note card which I hold in my hand or put on the podium in front of me. I hate PowerPoints.

Here’s the chart I made to see how the GOOG was doing relative to Microsoft and Yahoo.

[Chart: Web traffic for the top 50 sites, August 2008. Source: http://blogs.zdnet.com/ITFacts/]

The top six sites are where the action is. The other 44 sites are in the “long tail”. In this case, the sites outside the top 50 have few options for getting traffic. The 44 sites accounted in August 2008 for a big chunk of the calculated traffic, but no single site is likely to make it into the top six quickly. Google sits on top of the pile and seems to be increasing its traffic each month. Google monetizes its traffic reasonably well, so it has generated $18 billion or so in the last 12 months.

In the enterprise search arena, I have only “off the record” sources. These ghostly people tell me that Google has:

  • Shipped 24,600 Google Search Appliances. For comparison, Fast Search & Transfer, prior to its purchase by Microsoft, had somewhere in the neighborhood of 2,500 enterprise search platform licensees. Now, of course, Fast Search has access to the 100 million happy SharePoint customers. Who knows what the Fast Search customer count is now? Not me.
  • Become the standard for mapping in numerous government agencies, including those that don’t have signs on their buildings.
  • Been signing up as many as 3,000 Google Docs users per day, excluding the 1.5 million school children who will be using Google services in New South Wales, Australia.

I debated about how to spin these data. I decided to declare, “Google has won the search battle in 2008 and probably in 2009.” Not surprisingly, the audience was disturbed by my assertion. Remember, I did not parade these data. I used pictures like this one to make my point. This illustration shows a frustrated enterprise search customer setting fire to the vendor’s software disks, documentation, and one surly consultant:

How did I build up to the conclusion that Google has won the 2008-2009 search season? Here are the main points and some of the illustrations I used in my talk.


Microsoft: Frequent Searcher Points and Free Enterprise Search

October 1, 2008

Ina Fried’s “Microsoft Still Paying People to Search” is a useful reminder that Microsoft is “still paying people to search”. You can read various wizards’ comments at Search Engine Journal, LiveSide, and others. I quite liked the approach taken by Nathania Johnson, Search Engine Watch, in her “Microsoft Launches SearchPerks; Like Credit Card Rewards, Except for Search” here. For me, the most interesting point in her write up was this passage:

Microsoft’s Frederick Savoye, senior director at Live Search, assured me that this is an incentive program that fits into their three overall pillars of search: [a] Delivering the best search results [b] Simplifying key tasks such as booking airline travel, shopping, finding user opinions, etc. [c] Innovating the business model. (Note: I did a bit of format tweaking to keep the passage from becoming hard to read.)

The announcement comes hard on the heels of the news that Microsoft will be hurt by the financial problems sweeping through the US and threatening the European markets (more information here) and that Microsoft will make Oslo, Norway, the pivot point for its search research (more information on that here).

I wanted to offer several observations before my addled goose brain forgets them.

  1. Frequent flier blues. Earning points for search is a good idea. I have quite a few air miles, but the airlines change the threshold for an award or retire the miles before I can use them. I am, therefore, deeply indifferent to usage credits because of how other customer reward programs have tricked me.
  2. Can’t buy me love. I am a rental. I sell time. When someone buys my time, I love them. When that someone doesn’t pay me, I don’t love them. As long as the pay is commensurate with the work, I go along with my rent-my-time approach to business. I don’t think the dough on offer is enough to change my habits, my automated scripts, or the free Google crawls I run every couple of hours. As for a critical mass of Web users, I am skeptical. I don’t think payola will work in search, though it worked for a while in radio, someone told me.
  3. Business model silliness. Google’s business model is that someone pays Google to give away services. Users of Google expect free or low cost services to avoid the ads. Giving away free services without a third party paying or just paying people to use a service is not a business model. The tactic is marketing. I see these ploys as a type of discount coupon for tires, “Buy three and get one free”. The cost of the fourth tire is covered in the markup on the first three tires, the extra charge for balancing, or the labor cost to undo the lug nuts and put the new tires on.

In the consumer Web space, Google maintains and may be incrementally increasing its market share. I think that some of the research outfits tracking Web search share report that Google is north of 65 percent of the search traffic now. I have some first hand and anecdotal data that indicate the 65 percent figure may be low. From where I sit in my Kentucky hollow with my geese, Google’s market share in Web search is close enough to two-thirds for me. With Ask.com, Microsoft, and Yahoo chopping up the remainder, paying users probably won’t have a significant impact. Users choose what to search. Once habits form online, those habits can be tough to change. The malarkey about search being a one click easy decision does not reflect the fact that “habits, like a soft bed, are easy to fall into and hard to get out of.” That’s a quote from Miss Costello’s sixth grade classroom poster. Miss Costello was my teacher in the 1950s. Pretty accurate statement for Web search, I believe, even 50 years after I first read the message.

A quick horizon scan reveals that in enterprise search, the “give away” approach to market share is keeping Microsoft in the enterprise search game. Yet vendors tell me that their SharePoint search plug ins continue to sell. Which vendors are reaping the rewards of the SharePoint search opportunity? I can’t include the dozens who play in this space, but Coveo, Endeca, ISYS Search Software, and Vivisimo have told me or hinted that SharePoint represents a good market. In fact, one vendor told me that the SharePoint market is stronger since Microsoft rolled out free SharePoint, enhanced MOSS, and bought the complex Fast Search & Transfer Enterprise Search System. In one engagement, the vendor was hired quickly, replacing the incumbent Microsoft system with minimal red tape. In the enterprise search sector, where user annoyances are commonplace, Microsoft is not yet paying people to use its search system, but Microsoft may have to take more aggressive steps to keep third party vendors out of the SharePoint members-only club.

To sum up, the Web search game and the enterprise search game are quite different. Microsoft will have to find a way to leapfrog in both markets. That will take some doing. I am excited to learn what Redmond will do on both search fronts. Google, of course, has a more integrated approach to search, which I think may present both technical and cost challenges to Microsoft.

Stephen Arnold, October 1, 2008

An Exceptional Rumor: MSFT to Buy Yahoo AOL Combo

September 26, 2008

I saw this post on Venture Beat here. Then I saw a follow on story in Peter Kafka’s write up for Silicon Alley Insider here. I am delighted to point out that these write ups do not a done deal make. I find the notion fascinating, and I hope it comes to pass. Google will probably buy another dinosaur skeleton, reinstate day care, and design more lavish housing for the NASA Moffett Field Google Housing Units to celebrate. Please, read these two posts. The plan, as I understand this speculation, is that Yahoo gobbles up the wheezing AOL. I presume Yahoo will be able to work its technical magic on AOL’s infrastructure just as it did Delicious.com’s. Yahoo took two years to rewrite Delicious.com’s code, thus allowing other social sites and bookmarking services to flourish. Once the dust settles from that MBA fueled explosion, the Bain consultants will shape the package so that Microsoft can swoop in and snap up two hot properties, solve its search and portal problems, and catch up with Googzilla and chop off its tail.

When I worked at Booz, Allen & Hamilton, we called the Bain consultants Bainies. I can’t recall if we used this as a term of affection or derision. I like Bain and the work it did for Guinness just about 20 years ago. You can refresh your memory of that project here.

Let’s walk through the search and content processing implications of this hypothetical deal. I promise that I will not comment about SharePoint search, Live.com’s search, Outlook search, SQL Server search, Powerset search, or Fast Search & Transfer search.

  1. AOL has search plus some special sauce. At one time Fast Search & Transfer was laboring in the AOL vineyards. Teragram, prior to its acquisition by SAS, was also a vendor. Two vendors are enough for Yahoo to rationalize. Heck, Yahoo is relying on Fast Search technology for its AllTheWeb.com service, last I heard. The Teragram technology might be a stretch, but the Yahoo technical team will be up to the challenge. The notion of becoming part of Microsoft will put a fire in the engineers’ bellies.
  2. AOL has its portal services. Granted these overlap with Yahoo’s. There’s the issue of AOL mail, AOL messenger, and AOL’s ad deals with various third parties. Google may still have a claw in the AOL operation as well. I haven’t followed Google’s tie up with AOL since word came to me that Google thought it made a bad decision when it pumped a billion into the company.
  3. AOL has a cracker jack customer service operation. Yahoo has a pretty interesting customer service operation as well. I am not sure how one might merge the two units and bring both of them under the Yahoo natural language search system, which doesn’t seem to know how to provide guidance when I want to cancel one of my very few Yahoo for fee services. Give this a try on your own and let me know how you navigate the system.

I am delighted that I don’t have to figure out how to mesh Yahoo and AOL and then integrate the Yahoo AOL entity with Microsoft. Overlapping services are trivial for these three firms’ engineers. No big deal. If the fix is to operate each much as they now are, I anticipate some cost control problems. Economies of scale are tough to achieve operating three separate systems and their overlapping features.

I think that when I read the stories in my newsreader on Monday, September 29, 2008, I will know more about this rumor. I am still struggling with how disparate systems and the number of search systems can be made to work better, faster, and cheaper. Maybe the owner of the Yahoo AOL property will outsource search to Google. Google is relatively homogeneous, and it works pretty well for quite a few Web users, Web advertisers, and Web watchers. Watch this Web log for clarification of this rumor. For now, the word that comes to mind is a Vista “wow”.

Stephen Arnold, September 26, 2008

Virtual Servers: It Is Recrawl and Reindex Time

September 22, 2008

The malarkey about virtualization has many information technology professionals courting chimeras. Some virtualization is good. For example, we have a couple of quad core, four gigabyte servers that are four to five times faster on our benchmark tests than the aged NetFinity 5500s we retired. The new servers have the moxie to run virtualization software. No problems so far. In fact, chopping boxes into separate virtual servers makes sense and is tame compared to some of the technologies that arrive at our office door.

Virtual storage, however, is another kettle of fish. Our experience has been that complex directory structures such as those spawned by SharePoint and certain enterprise applications are complicated. When these complex structures are mixed with virtual storage, we have encountered some excitement. We test software, so our trashed files provide us with useful data, not long weekends and sleepless nights.

InfoWorld on September 19, 2008, here called attention to some of the issues virtual storage drags along with the snappy marketing messages and rah rahs for cheaper administration. “Virtual Server Backups Prone to Failure, Survey Finds” makes clear that virtual solutions are not without some problems. The InfoWorld write up reports on a survey that asserts more than half the virtual server backups don’t restore. The article has some other data but I want to focus only on the backups not restoring.

Here’s the problem. Search is a storage intensive application. The indexes can be big. If an index doesn’t start out big, in a matter of months the index gets big. Logs get big. When a search or content processing system crashes or an index update corrupts the master index, an administrator turns to the backup system. If the search system is using a whizzy new virtual storage system, the backup may not restore. The problem is that rebuilding the index is not always a five minute or even a five hour job.
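To put some rough numbers behind that claim, here is a back of the envelope sketch in Python. The document counts and throughput figures are hypothetical, not measurements from any vendor’s system, so treat the output as an order of magnitude guess only.

```python
# Rough estimate of how long a full recrawl and reindex takes.
# All numbers below are hypothetical; plug in your own measurements.

def reindex_hours(documents, docs_per_second, enrichment_overhead=2.0):
    """Hours to rebuild an index from scratch.

    documents: number of items to reprocess
    docs_per_second: sustained indexing throughput of the system
    enrichment_overhead: multiplier for entity extraction, metadata
        generation, and other content processing beyond basic indexing
    """
    seconds = (documents / docs_per_second) * enrichment_overhead
    return seconds / 3600.0

# A modest 10 million document collection at 200 documents per second,
# with heavy content processing, lands near a full day, not five minutes.
print(f"{reindex_hours(10_000_000, 200):.1f} hours")
```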

Recrawling and reindexing can be tricky. Systems that perform significant content processing can crunch for a day, maybe more, generating metadata. Our suggestion is to skip virtual storage for search and content processing systems. Already have one? You may want to devirtualize, and quickly.

Stephen Arnold, September 22, 2008

Autonomy: Quicker than Microsoft Fast Search Yet Again

September 19, 2008

Autonomy continues to outthink Microsoft Fast Search. The nimbleness of Autonomy cannot be overlooked by the Redmond giant. Microsoft has revenues of $65 billion or so. Autonomy weighs in with $400 million or $500 million in revenues. Microsoft spent $1.2 billion for an enterprise search vendor which stunned the content processing world with a Web part to integrate SharePoint (a hugely complex content management system) with Fast ESP (an equally complex content processing system). Now Autonomy rolls out “its information processing technology [that] extends Microsoft Office SharePoint Server (MOSS) to meet customer requirements for scalability, connectivity and conceptual search.” You can read the details of this new Autonomy product here.

Now if I were involved with Microsoft Fast Search, I would be tempted to say, “Those Autonomy folks have hit on a very good idea.” I might even be tempted to suggest that we buy Autonomy just to get the company’s marketing team. When you have billions in the bank and are fighting to out-Google Google, why not buy Autonomy? It makes more sense than trying to weld together Powerset and Ciao.com technology in my opinion.

I don’t know who is running the Microsoft enterprise search operation. There have been too many executive changes and too few substantive announcements to hold this addled goose’s short attention span. What’s clear is that Autonomy is able to pinpoint cracks in the Microsoft Fast Search armor and exploit them. Anyone who has any hands on experience with SharePoint knows that it’s easy to get a finger crushed in SharePoint’s moving parts. So Autonomy asserts:

Autonomy further extends global MOSS scalability through its distributed, brokered architecture and “geo-efficient” design which allows data to be automatically replicated in the most sensible location based on bandwidth, lag time, availability and demand. This enables high performance and gives users aggregated access to all enterprise information in a unified view in globally dispersed environments while reducing bandwidth overhead. Because IDOL creates a stub, or shortcut, to the data and supports tiered storage rather than requiring that data be stored in SQL Server, organizations using IDOL with SharePoint can further benefit from dramatically reduced SQL Server licenses and associated scalability limitations.

Microsoft may want to pay close attention to how Autonomy deftly points out the finger mashing gears and levers in SharePoint. Next Microsoft may want to put safety covers on the more dangerous bits. If SharePoint and Fast Search continue to dog paddle along, Autonomy and maybe other vendors will find the 100 million SharePoint users easy pickings.

A happy quack to Autonomy for this deft marketing move. A goose gift for the Redmond behemoth, which seems unable to organize its parade and get it marching toward the objective of delivering a solution to the scaling problems in SharePoint. Agree? Disagree? Help keep me informed via the Comments function on this Web log.

Stephen Arnold, September 19, 2008

Why Dataspaces Matter

August 30, 2008

My posts have been whipping super-wizards into action. I don’t want to disappoint anyone over the long American “end of summer” holiday. Let’s consider a problem in information retrieval and then answer in a very brief way why dataspaces matter. No, this is not a typographical error.

Set Up

A dataspace is somewhat different from a database. Databases can sit within a dataspace, but a dataspace also encompasses other information objects, garden variety metadata, and new types of metadata which I like to call meta metadata, among other things. These are represented in an index. For our purpose, we don’t have to worry about the type of index. We’re going to look up something in any of the indexes that represent our dataspace. You can learn more about dataspaces in the IDC report #213562, published on August 28, 2008. It’s a for fee write up, and I don’t have a copy. I just contribute; I don’t own these analyses published by blue chip firms.

Now let’s consider an interesting problem. We want to index people, figure out what those people know about, and then generate results to a query such as “Who’s an expert on Google?” If you run this query on Google, you get a list of hits like this.

[Screen shot: Google results for the query “Google expert”]

This is not what I want. I require a list of people who are experts on Google. Does Live.com deliver this type of output? Here’s the same query on the Microsoft system:

[Screen shot: Live.com results for the same query]

Same problem.

Now let’s try the query on Cluuz.com, a system that I have written about a couple of times. Run the query “Jayant Madhavan” and I get this:

[Screen shot: Cluuz.com results for the query “Jayant Madhavan”]

I don’t have an expert result list, but I have a wizard and direct links to people Dr. Madhavan knows. I can make the assumption that some of these people will be experts.

If I work in a company, the firm may have the Tacit system. This commercial vendor makes it possible to search for a person with expertise. I can get some of this functionality in the baked in search system provided with SharePoint. The Microsoft method relies on the number of documents a person known to the system writes on a topic, but that’s better than nothing. I could, if I were working in a certain US government agency, use the MITRE system that delivers a list of experts. The MITRE system is not one whose screen shots I can show, but if you have a friend in a certain government agency, maybe you can take a peek.
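To make the document counting method concrete, here is a minimal sketch in Python of the kind of scoring such a system implies. The corpus, author names, and topic are invented for the example; a production system layers relevance ranking, security trimming, and other signals on top of this crude count.

```python
from collections import Counter

# Hypothetical corpus: (author, document text) pairs.
documents = [
    ("j.smith", "Google Search Appliance deployment notes"),
    ("j.smith", "Scaling the Google index for the intranet"),
    ("a.jones", "Quarterly budget spreadsheet"),
    ("a.jones", "Google Apps rollout plan"),
]

def experts_on(topic, docs):
    """Rank authors by how many of their documents mention the topic.

    This is the crude 'count the documents a person wrote on a subject'
    heuristic; it surfaces prolific writers, not necessarily real experts.
    """
    counts = Counter(
        author for author, text in docs if topic.lower() in text.lower()
    )
    return counts.most_common()

print(experts_on("google", documents))
# [('j.smith', 2), ('a.jones', 1)]
```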

None of these systems really do what I want.

Enter Dataspaces

The idea for a dataspace is to process the available information. Some folks call this transformation, and it really helps to have systems and methods to transform, normalize, parse, tag, and crunch the source information. It also helps to monitor the message traffic for some of that meta metadata goodness. An example of meta metadata is an email. I want to index who received the email, who forwarded the email to whom and when, and any cutting or copying of the information in the email into other documents, along with the people who have access to said information. You get the idea. Meta metadata is where the rubber meets the road in determining what’s important regarding information in a dataspace.
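Here is a minimal sketch, in Python, of what meta metadata for a single email message might look like. The field names and structure are my own invention for illustration, not a published schema from any vendor.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record types; field names are illustrative only.

@dataclass
class ForwardEvent:
    forwarded_by: str
    forwarded_to: List[str]
    timestamp: str           # ISO 8601, e.g. "2008-08-30T14:05:00Z"

@dataclass
class ReuseEvent:
    copied_to_document: str  # where the email text was pasted
    readers: List[str]       # who has access to that document

@dataclass
class EmailMetaMetadata:
    message_id: str
    recipients: List[str]
    forwards: List[ForwardEvent] = field(default_factory=list)
    reuse: List[ReuseEvent] = field(default_factory=list)

# A dataspace index would store these events alongside the message body
# so a query can ask "who touched this information, and where did it go?"
record = EmailMetaMetadata(
    message_id="msg-001",
    recipients=["analyst@example.com"],
    forwards=[ForwardEvent("analyst@example.com", ["manager@example.com"],
                           "2008-08-30T14:05:00Z")],
    reuse=[ReuseEvent("briefing.doc",
                      ["manager@example.com", "legal@example.com"])],
)
print(record.forwards[0].forwarded_to)
```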


The Future of Search Layer Cake

August 14, 2008

Yesterday I contributed a short essay about the future of search. I thought I was being realistic for the readers of AltSearchEngines.com, a darn good Web log in my opinion. I wanted to be more frisky than the contributions from SearchEngineLand.com and Hakia.com too. I’m not an academic, and I’m not in the search engine business. I do competitive technical analysis for a living. Search is a side interest, and prior to my writing the Enterprise Search Report, no one had taken a comprehensive look at a couple dozen of the major vendors. I now have profiles on 52 companies, and I’m adding a new one in the next few days. I don’t pay much attention to the university information retrieval community because I’m not smart enough to figure out the equations any more.

From the number of positive and negative responses that have flowed to me, I know I wasn’t clear about my focus on behind the firewall search and Google’s enterprise activities. This short post is designed to put my “layer cake” image into context. If you want to read the original essay on AltSearchEngines.com, click here. To refresh your memory, here’s the diagram, which in one form or another I have been using in my lectures for more than a decade. I’m a lousy teacher, and I make mistakes. But I have a wealth of hands on experience, and I have the research under my belt from creating and maintaining the 52 profiles of companies that are engaged in commercial search, content processing, and text analytics.

[Diagram: the search “layer cake”]

I’ve been through many search revolutions, and this diagram explains how I perceive those innovations. Furthermore, the diagram makes clear a point that many people do not fully understand until the bills come in the mail. Over time search gets more expensive. A lot more expensive. The reason is that each “layer” is not necessarily a system from a single vendor. The layers show that an organization rarely rips and replaces existing search technology. So, no matter how lousy a system, there will be two or three or maybe a thousand people who love the old system. But there may be one person or 10,000 who want different functionality. The easy path for most organizations is to buy another search solution or buy an “add in” or “add on” that in theory brings the old system closer to the needs of new users or different business needs.


Google Search Appliance: Showing Some Fangs

August 6, 2008

Assorted wizards have hit the replay button for Google’s official description of the Google Search Appliance (GSA).

If you missed the official highlights film, here’s a recap:

  • $30,000 starting price, good for two years, “support” and 500,000 document capacity. The bigger gizmos each can handle 10 million documents. These work like Christmas tree lights. When you need more, just buy more GSAs and plug them in. This is the same type of connectivity “big Google” enjoys when it scales.
  • Group personalization; for example, marketing wizards see brochures-type information and engineers see documents with equations
  • Metadata extraction so you can search by author, department, and other discovered index points.

If you want to jump right into Google’s official description, just click here. You can even watch a video about Universal Search, which is Google’s way of dancing away from the far more significant semantic functionality that will be described in a forthcoming white paper from a big consulting firm. This forthcoming report, alas, costs money, and it even contains my name in very small type as a contributor. Universal Search was the PR flash created for Google’s rushed Searchology conference not long after an investment bank published a detailed report of a far larger technical search initiative (Programmable Search Engine) within the Googleplex. For true Google watchers, you will enjoy Google’s analysis of complexity. The title of the video is a bit of Googley humor because when it comes to enterprise or behind the firewall search, complexity is really not that helpful. Somewhere between 50 and 75 percent of the users of a search system are dissatisfied with the search system. Complexity is one of the “problems” that Google wants to resolve with its GSA.

When you buy the upscale versions of the GSA, you can implement fail over to another GSA. GSAs can be distributed geographically as well. The GSA comes with support for various repositories such as EMC Documentum. This means that the GSA can index the Documentum content without custom coding. The GSAs support the OneBox API, which is an important component in Google’s enterprise strategy. With the GSA, a clever programmer can create Vivisimo-style federated search results, display live data from a Microsoft Exchange server so a “hit” on a person shows that person’s calendar, integrate Web and third-party commercial content with the behind-the-firewall information, and perform other important content processing tasks.
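For readers who want to see what a metadata constrained query against the appliance looks like, here is a rough Python sketch. The appliance host name is invented, and the parameter names reflect my reading of the GSA XML search protocol; check them against the documentation for your appliance’s version before relying on them.

```python
import urllib.parse
import urllib.request

# Hypothetical appliance host; parameter names follow my understanding of
# the GSA XML search protocol and may differ by appliance version.
GSA_HOST = "http://gsa.example.com"

def gsa_search(terms, author=None, department=None):
    """Query the appliance for XML results, optionally filtered on
    extracted metadata such as author or department."""
    required = ".".join(
        f"{name}:{value}"
        for name, value in (("author", author), ("department", department))
        if value
    )
    params = {
        "q": terms,
        "output": "xml_no_dtd",
        "site": "default_collection",
        "client": "default_frontend",
    }
    if required:
        params["requiredfields"] = required
    url = f"{GSA_HOST}/search?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url) as response:
        return response.read()  # raw XML to parse downstream

# Example: documents about budgets written by a particular author.
# xml = gsa_search("budget forecast", author="smith")
```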

Google happily names some of its larger customers, including Adobe Systems, Kimberly-Clark, and Sunnybrook Health. The company does not, however, mention the deep penetration of the GSA into government agencies, police organizations, and universities.

Good “run the game plan” write ups are available from CNet here, my favorite TechCrunch with Eric Schonfeld’s readable touch here, and the “still hanging in there” eWeek write up here.

[Screen shot: splash page for the Google enterprise videos]

After registering for the enterprise videos, you will see this splash page. You can get more information about the upgrade to Version 5 of the GSA.

My Take

Now, here’s my take on this upgrade:

First, Google is responding to demands for better connectivity, more administrative control, and better security. With each upgrade to the GSA, Google has added features that have been available for a quarter century from outfits like Verity (now part of the Autonomy holdings). The changes are important because Google is often bad mouthed for offering a poor enterprise search solution. With this release, I am not so sure that the negatives competitors heap on these cheerful yellow boxes are warranted. This version of the GSA is better than most of the enterprise search appliances with which I am familiar and a worthy competitor where administrative and engineering resources are scarce.


Microsoft: What Now for Search?

July 24, 2008

Googzilla twitches its tail and Microsoft goes into convulsions. When I was in the management consulting game, my boss, Dr. William Sommers, talked about “hyper-actions”. The idea was that a single event or a minor event would trigger excessive reactions.

[Image: brain scan of a person undergoing excessive “excitement” and “over reaction”]

When I read the flows-like-water prose of Kara Swisher’s “Microsoft’s Latest Web Stumble: Kevin Johnson Out” and then her brief introduction to Mr. Steve Ballmer’s “Full Memo to the Troops about New Reorg”, I thought about Dr. Sommers’s “hyper-action” neologism. In my opinion, we are watching the twitch in Mountain View triggering via management string theory the convulsions in Redmond.

First, let me identify for you the points that jumped from screen to neurons in Ms. Swisher’s write ups.

  1. Ms. Swisher reports that Mr. Kevin Johnson was the architect behind the Yahoo buy out. I thought that the idea was cooked in Mr. Chris Liddell’s lamb-roasting pit. Obviously my sources were off base. Mr. Johnson moves to Juniper and Mr. Liddell continues to get a Microsoft paycheck. Mr. Liddell’s remarks at the March 2008 Morgan Stanley Technology Conference left me with the impression that he was being “systematic” in his analysis. Here’s one take on his remarks.
  2. Ms. Swisher’s run down of Microsoft’s actions so far in 2008 is excellent, and she reminded me that Microsoft bought aQuantive, a fact which had slipped off my radar. What has happened to aQuantive, for which Microsoft paid $6 billion, more than Microsoft paid for Fast Search & Transfer and Powerset combined? Her mentioning aQuantive reminded me of those wealthy car collectors on the Speed Channel’s exotic automobile auctions. What do you do with a $1.2 million Corvette? You put it in a garage. You don’t run down to the Speedway in Harrods Creek, Kentucky, to buy a pack of chewing tobacco.
  3. Ms. Swisher turns a great phrase; specifically, “Microsoft has succeeded in burnishing its image as a Web also-ran and still has an uncertain path to change that.” I quite like the notion that a large company takes one action and succeeds in producing an opposite reaction. I think the Google folks would peg that as one of the Laws of Google Dynamics applied to Microsoft. For every action, there is a greater, opposite reaction that persists through time. (Ms. Swisher’s statement that Yahoo looks stable brought a smile to my face as well.)

Next, let me comment on the Mr. Steve Ballmer reorg memo, which will be a classic in business schools for years to come. The opening line will probably read, “Mr. Steve Ballmer, firmly in control of Microsoft, sat at his desk and looked across the Microsoft campus. He knew a bold strategic action was needed to deal with the increasing threat of Google, etc. etc.”

After the razzle dazzle about goals, the memo gets down to business:

We will out-innovate Google in key areas—we’re already seeing this in our maps and news search. Third, we are going to reinvent the search category through user experience and business model innovation. We’ll introduce new approaches that move beyond a white page with 10 blue links to provide customers with a customized view of their world. This is a long-term battle for our company—and it’s one we’ll continue to fight with persistence and tenacity.


Enterprise Search: It’s Easy but Work Is Never Done

July 17, 2008

The Burton Group caught my attention with its report describing Microsoft a couple of years ago as a superplatform. I liked the term, but the report struck me as overly enthusiastic in favor of Microsoft’s server products.

I was surprised when I saw part one of Margie Semilof’s interview with two Burton Group consultants, Guy Creese and Larry Cannell. These folks were described as experts in content management, a discipline with a somewhat checkered history in the pantheon of enterprise software applications. You can read the first part of the interview here. The interview carries a July 15, 2008, date, and I am capturing my personal thoughts on July 16, 2008. That’s my mode of operation, a euro short and a day late. Also, I am not enthusiastic about CMS experts making the jump to enterprise search expertise. The leap can be made, but it’s like jumping from the frying pan into the fire.

The interview contains a rich vein of intellectual gold or what appears to me to be sort of gold. I jotted down two points made by the Burton experts, and I wanted to offer some color around selected points. When you read the interview, your conclusions and takeaways will probably differ from mine. I am an opinionated goose, so if that bothers you, quit reading now.

Let me address two points.

First, this question and answer surprised me:

Question: How much development work is required with search technology?

Answer by Guy Creese, Burton Group expert in content management: It’s pretty easy… Usually a company is up and running and can see most of its documents without trouble.

Yikes. Enterprise search dissatisfies anywhere from half to two thirds of a system’s users. Enterprise search systems are among the most troublesome enterprise applications to set up, optimize, and maintain. Even the Google Search Appliance, one of the most toaster-like search solutions, takes some effort to get into fighting shape. Customization requires expertise with the OneBox API. “Seeing documents” and finding information are two quite different functions in my experience.

Second, this question and answer ran counter to the research I conducted for the first three editions of Enterprise Search Report (2004-2006) and my most recent study Beyond Search (2008).

Question: Search technology has some care and feeding involved. How do companies organize the various tasks?

Answer by Guy Creese, Burton Group expert in content management: This is not onerous. Companies don’t have huge armies [to do this work], but someone has to know the formats, whether to index, how quickly they refresh. If no one worries about this, then search becomes less effective. So beyond the eye candy, you have to know how to maintain and adjust your search.

“Not onerous” runs counter to the data I have gathered in surveys and focus groups. “Formats” invoke transformation. Transformation can be difficult and expensive. Hooking search into work processes requires analysis and then customization of search functions. Search that processes content in content management systems often requires specialized set up, particularly when the search system indexes duplicate or versioned documents. Rich text processing, a highly desirable function, can wander off the beaten path unless customization and tuning are performed.
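As one small example of why the care and feeding is not trivial, here is a hedged Python sketch of the duplicate detection step that content pulled from a CMS often needs before indexing. Real deployments typically use fuzzier, shingle based matching; this exact hash version is just to show why versioned documents complicate the job.

```python
import hashlib

def content_fingerprint(text):
    """Hash the normalized body text so trivially different copies
    (extra whitespace, case changes) collapse to the same fingerprint."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

def deduplicate(documents):
    """Keep one representative per fingerprint.

    documents: iterable of (doc_id, text) pairs pulled from the CMS.
    Returns the doc_ids to send to the indexer. A versioned document
    that differs by even one word still passes through, which is
    exactly why real systems need fuzzier matching than this.
    """
    seen = {}
    for doc_id, text in documents:
        seen.setdefault(content_fingerprint(text), doc_id)
    return list(seen.values())

docs = [
    ("v1", "Enterprise search deployment plan"),
    ("v1-copy", "Enterprise  search deployment plan "),   # exact duplicate
    ("v2", "Enterprise search deployment plan, revised"), # a new version
]
print(deduplicate(docs))   # ['v1', 'v2']
```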

Observations

There are a handful of people who have a solid understanding of enterprise search. Miles Kehoe, one of the Verity wizards, is the subject of a Search Wizards Speak interview that will be published on ArnoldIT.com on July 21, 2008. His company, New Idea Engineering, has considerable expertise in search, and you can read his views on what must be done to ensure a satisfactory deployment. Another expert is my son, Erik Arnold, whose company, Adhere Solutions, specializes in customizing and integrating the Google Search Appliance into enterprise environments. To my knowledge, neither Mr. Kehoe nor Mr. Arnold characterizes search as a “pretty easy” task. In fact, I can’t recall anyone in my circle of professional acquaintances describing enterprise search as “pretty easy.”

Second, I am concerned that content management systems are expanding into applications and functions that are not germane to these systems’ capabilities. For example, CMS needs search. Interwoven has struck a deal with Vivisimo to provide search that “just works” to Interwoven customers. Vivisimo has worked hard to create a seamless experience, but, based on my sources, the initial work was not “pretty easy”. In fact, Interwoven had a mixed track record in delivering search before hooking up with Vivisimo. But CMS vendors are also asserting that their systems are social. Well, CMS allows different people to index a document. I think that’s a social and collaborative function. But social software to me suggests Digg, Twitter, and Mahalo type functionality. Implementing these technologies in a Broadvision (if it is still paddling upstream) or Vignette might take some doing.

Third, SharePoint (a favorite of Burton if I recall the superplatform document) is a polymorphic software system. Once it was a CMS. Now it is a collaboration platform just like Exchange. I think these are marketing words slapped on servers which are positioned to make sales, not solve problems. SharePoint includes a search function, which is improving. But deploying a robust search system within SharePoint is hard in my experience. I prefer using third-party software from companies such as ISYS Search Software. ISYS, along with Coveo, offers systems that are indeed much easier to deploy, configure, and maintain than SharePoint. But planning and experience with SharePoint are necessary.

I look forward to the second part of this interesting interview with CMS experts about enterprise search. Agree? Disagree? Quack back.

Stephen Arnold, July 17, 2008
