Google Search Appliance: Showing Some Fangs

August 6, 2008

Assorted wizards have hit the replay button for Google’s official description of the Google Search Appliance (GSA).

If you missed the official highlights film, here’s a recap:

  • $30,000 starting price, good for two years, “support” and 500,000 document capacity. The bigger gizmos each can handle 10 million documents. These work like Christmas tree lights. When you need more, just buy more GSAs and plug them in. This is the same type of connectivity “big Google” enjoys when it scales.
  • Group personalization; for example, marketing wizards see brochure-type information and engineers see documents with equations.
  • Metadata extraction so you can search by author, department, and other discovered index points.
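
The “plug in more boxes” arithmetic in the first bullet can be sketched in a few lines. This is a back-of-the-envelope illustration, not anything from Google: the function name `appliances_needed` is hypothetical, the capacities are the rough figures quoted above, and the sketch ignores spares for fail over, licensing terms, and any indexing overhead.

```python
# Rough per-appliance capacities quoted in the post (assumed, not official specs).
ENTRY_CAPACITY = 500_000       # documents for the starting configuration
LARGE_CAPACITY = 10_000_000    # documents for each of the bigger boxes


def appliances_needed(total_documents: int, capacity: int = LARGE_CAPACITY) -> int:
    """Return how many appliances of the given capacity cover the corpus.

    Hypothetical helper: plain ceiling division, nothing more.
    """
    if total_documents < 0:
        raise ValueError("total_documents must be non-negative")
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    # Ceiling division with integers only, so there is no float rounding.
    return (total_documents + capacity - 1) // capacity


if __name__ == "__main__":
    # A 25 million document corpus on the bigger boxes.
    print(appliances_needed(25_000_000))
    # A 500,000 document corpus on the starting configuration.
    print(appliances_needed(500_000, ENTRY_CAPACITY))
```

With these assumed figures, a 25 million document corpus works out to three of the bigger boxes. Real sizing would also have to account for growth and redundancy, which this sketch leaves out.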

If you want to jump right into Google’s official description, just click here. You can even watch a video about Universal Search, which is Google’s way of dancing away from the far more significant semantic functionality that will be described in a forthcoming white paper from a big consulting firm. This forthcoming report–alas–costs money and it even contains my name in very small type as a contributor. Universal Search was the PR flash created for Google’s rush Searchology conference not long after an investment bank published a detailed report of a far larger technical search initiative (Programmable Search Engine) within the Googleplex. If you are a true Google watcher, you will enjoy Google’s analysis of complexity. The title of the video is a bit of Googley humor because when it comes to enterprise or behind the firewall search, complexity is really not that helpful. Somewhere between 50 and 75 percent of the users of a search system are dissatisfied with the search system. Complexity is one of the “problems” that Google wants to resolve with its GSA.

When you buy the upscale versions of the GSA, you can implement fail over to another GSA. GSAs can be distributed geographically as well. The GSA comes with support for various repositories such as EMC Documentum. This means that the GSA can index the Documentum content without custom coding. The GSAs support the OneBox API, which is an important component in Google’s enterprise strategy. A clever programmer can use the GSA to create Vivisimo-style federated search results, display live data from a Microsoft Exchange server so a “hit” on a person shows that person’s calendar, integrate Web and third-party commercial content with the behind-the-firewall information, and perform other important content processing tasks.

Google happily names some of its larger customers, including Adobe Systems, Kimberly-Clark, and Sunnybrook Health. The company does not, however, mention the deep penetration of the GSA into government agencies, police organizations, and universities.

Good “run the game plan” write ups are available from CNet here, my favorite TechCrunch with Erick Schonfeld’s readable touch here, and the “still hanging in there” eWeek write up here.

[Image: splash page for the enterprise videos]

After registering for the enterprise videos, you will see this splash page. You can get more information about the upgrade to Version 5 of the GSA.

My Take

Now, here’s my take on this upgrade:

First, Google is responding to demands for better connectivity, more administrative control, and better security. With each upgrade to the GSA, Google has added features that have been available for a quarter century from outfits like Verity (now part of the Autonomy holdings). The changes are important because Google is often bad mouthed for offering a poor enterprise search solution. With this release, I am not so sure that the negatives competitors heap on these cheerful yellow boxes are warranted. This version of the GSA is better than most of the enterprise search appliances with which I am familiar and a worthy competitor where administrative and engineering resources are scarce.


Microsoft: What Now for Search?

July 24, 2008

Googzilla twitches its tail and Microsoft goes into convulsions. When I was in the management consulting game, my boss, Dr. William Sommers, talked about “hyper-actions”. The idea was that a single event or a minor event would trigger excessive reactions.

[Image: convulsions]

Brain scan of a person undergoing excessive “excitement” and “over reaction”.

When I read the flows-like-water prose of Kara Swisher’s “Microsoft’s Latest Web Stumble: Kevin Johnson Out” and then her brief introduction to Mr. Steve Ballmer’s “Full Memo to the Troops about New Reorg”, I thought about Dr. Sommers’s “hyper-action” neologism. In my opinion, we are watching the twitch in Mountain View triggering via management string theory the convulsions in Redmond.

First, let me identify for you the points that jumped from screen to neurons in Ms. Swisher’s write ups.

  1. Ms. Swisher reports that Mr. Kevin Johnson was the architect behind the Yahoo buy out. I thought that the idea was cooked in Mr. Chris Liddell’s lamb-roasting pit. Obviously my sources were off base. Mr. Johnson moves to Juniper and Mr. Liddell continues to get a Microsoft paycheck. Mr. Liddell’s remarks at the March 2008 Morgan Stanley Technology Conference left me with the impression that he was being “systematic” in his analysis. Here’s one take on his remarks.
  2. Ms. Swisher’s run down of Microsoft’s actions so far in 2008 is excellent, and she reminded me that Microsoft bought aQuantive, a fact which had slipped off my radar. What has happened to aQuantive, for which Microsoft paid $6 billion, more than what Microsoft paid for Fast Search & Transfer and Powerset combined? Her mentioning aQuantive reminded me of those wealthy car collectors on the Speed Channel’s exotic automobile auctions. What do you do with a $1.2 million Corvette? You put it in a garage. You don’t run down to the Speedway in Harrods Creek, Kentucky, to buy a pack of chewing tobacco.
  3. Ms. Swisher turns a great phrase; specifically, “Microsoft has succeeded in burnishing its image as a Web also-ran and still has an uncertain path to change that.” I quite like the notion that a large company takes one action and succeeds in producing an opposite reaction. I think the Google folks would peg that as one of the Laws of Google Dynamics applied to Microsoft. For every action, there is a greater, opposite reaction that persists through time. (Ms. Swisher’s statement that Yahoo looks stable brought a smile to my face as well.)

Next, let me comment on the Mr. Steve Ballmer reorg memo, which will be a classic in business schools for years to come. The opening line will probably read, “Mr. Steve Ballmer, firmly in control of Microsoft, sat at his desk and looked across the Microsoft campus. He knew a bold strategic action was needed to deal with the increasing threat of Google, etc. etc.”

After the razzle dazzle about goals, the memo gets down to business:

We will out-innovate Google in key areas—we’re already seeing this in our maps and news search. Third, we are going to reinvent the search category through user experience and business model innovation. We’ll introduce new approaches that move beyond a white page with 10 blue links to provide customers with a customized view of their world. This is a long-term battle for our company—and it’s one we’ll continue to fight with persistence and tenacity.


Enterprise Search: It’s Easy but Work Is Never Done

July 17, 2008

The Burton Group caught my attention with its report describing Microsoft a couple of years ago as a superplatform. I liked the term, but the report struck me as overly enthusiastic in favor of Microsoft’s server products.

I was surprised when I saw part one of Margie Semilof’s interview with two Burton Group consultants, Guy Creese and Larry Cannell. These folks were described as experts in content management, a discipline with a somewhat checkered history in the pantheon of enterprise software applications. You can read the first part of the interview here. The interview carries a July 15, 2008, date, and I am capturing my personal thoughts on July 16, 2008. That’s my mode of operation, a euro short and a day late. Also, I am not enthusiastic about CMS experts making the jump to enterprise search expertise. The leap can be made, but it’s like jumping from the frying pan into the fire.

The interview contains a rich vein of intellectual gold or what appears to me to be sort of gold. I jotted down two points made by the Burton experts, and I wanted to offer some color around selected points. When you read the interview, your conclusions and take aways will probably differ from mine. I am an opinionated goose, so if that bothers you, quit reading now.

Let me address two points.

First, this question and answer surprised me:

Question: How much development work is required with search technology?

Answer by Guy Creese, Burton Group expert in content management: It’s pretty easy… Usually a company is up and running and can see most of its documents without trouble.

Yikes. Enterprise search dissatisfies anywhere from half to two thirds of a system’s users. Enterprise search systems are among the most troublesome enterprise applications to set up, optimize, and maintain. Even the Google Search Appliance, one of the most toaster-like search solutions, takes some effort to get into fighting shape. Customization requires expertise with the OneBox API. “Seeing documents” and finding information are two quite different functions in my experience.

Second, this question and answer ran counter to the research I conducted for the first three editions of Enterprise Search Report (2004-2006) and my most recent study Beyond Search (2008).

Question: Search technology has some care and feeding involved. How do companies organize the various tasks?

Answer by Guy Creese, Burton Group expert in content management: This is not onerous. Companies don’t have huge armies [to do this work], but someone has to know the formats, whether to index, how quickly they refresh. If no one worries about this, then search becomes less effective. So beyond the eye candy, you have to know how to maintain and adjust your search.

“Not onerous” runs counter to the data I have gathered in surveys and focus groups. “Formats” invoke transformation. Transformation can be difficult and expensive. Hooking search into work processes requires analysis and then customization of search functions. Search that processes content in content management systems often requires specialized set up, particularly when the search system indexes duplicate or versioned documents. Rich text processing, a highly desirable function, can wander off the beaten path unless customization and tuning are performed.

Observations

First, there are a handful of people who have a solid understanding of enterprise search. Miles Kehoe, one of the Verity wizards, is the subject of a Search Wizards Speak interview that will be published on ArnoldIT.com on July 21, 2008. His company, New Idea Engineering, has considerable expertise in search, and you can read his views on what must be done to ensure a satisfactory deployment. Another expert is my son, Erik Arnold, whose company, Adhere Solutions, specializes in customizing and integrating the Google Search Appliance into enterprise environments. To my knowledge, neither Mr. Kehoe nor Mr. Arnold characterizes search as a “pretty easy” task. In fact, I can’t recall anyone in my circle of professional acquaintances describing enterprise search as “pretty easy.”

Second, I am concerned that content management systems are expanding into applications and functions that are not germane to these systems’ capabilities. For example, CMS needs search. Interwoven has struck a deal with Vivisimo to provide search that “just works” to Interwoven customers. Vivisimo has worked hard to create a seamless experience, but, based on my sources, the initial work was not “pretty easy”. In fact, Interwoven had a mixed track record in delivering search before hooking up with Vivisimo. But CMS vendors are also asserting that their system is social. Well, CMS allows different people to index a document. I think that’s a social and collaborative function. But social software to me suggests Digg, Twitter, and Mahalo type functionality. Implementing these technologies in a Broadvision (if it is still paddling upstream) or Vignette might take some doing.

Third, SharePoint (a favorite of Burton if I recall the superplatform document) is a polymorphic software system. Once it was a CMS. Now it is a collaboration platform just like Exchange. I think these are marketing words slapped on servers which are positioned to make sales, not solve problems. SharePoint includes a search function, which is improving. But deploying a robust search system within SharePoint is hard in my experience. I prefer using third-party software from such companies as ISYS Search Software. ISYS, along with Coveo, offers systems that are indeed much easier to deploy, configure, and maintain than SharePoint. But planning and experience with SharePoint are necessary.

I look forward to the second part of this interesting interview with CMS experts about enterprise search. Agree? Disagree? Quack back.

Stephen Arnold, July 17, 2008

The Jab-Google Bandwagon Rolls On

June 30, 2008

Phil Wainewright, whose writing I enjoy, wrote “Google’s Culture Not Fit for Enterprise Apps.” The essay appeared on June 30, 2008. You can find the full text here. Xooglers have been picking on the search dominator, and I posted a link to a story that I thought might be a spoof here. Apparently it wasn’t, based on some email feedback I received this weekend, but I remain skeptical. What I am sure about is that criticism about Google seems to be on the uptick, and I am not sure why. The company has been consistent in its behavior for years. The biggest change is the company’s increased “transparency”. Googlers are everywhere: at conferences, in the news, on Web casts. Everywhere I look, there’s the GOOG.

Now, to Mr. Wainewright. He is actually picking up on the theme that Xooglers–that is, employees who cash out, quit, or get fired–are revealing that Mother Google has some idiosyncrasies. The key paragraph for me in Mr. Wainewright’s well-written essay was:

It’s a damning indictment, and one that casts a long shadow over Google’s attempts to replace Microsoft’s pre-eminence in the office collaboration software market with its Google Apps suite. As a disruptive competitor, it doesn’t have to match Microsoft Office feature-for-feature. But if it really is unreliable and buggy as Solyanik claims — and the current outage of Feedburner’s Web analytics service lends further weight to this view — then Google doesn’t even make the grade as a business-class SaaS provider.

Let me offer three observations.

First, demographics will help Google. As Google’s push in the educational institutions increases, future graduates will be comfortable with Google and its characteristics. When these graduates enter the work force, I think some of them will continue using Google or take steps to get Google products and services into the organization. I am not sure quality will have much to do with this sell through. I think habit, loyalty, and the notion that Google is pretty good will have some impact. Ergo, the short term and today’s expectations don’t matter so much.

Second, existing enterprise applications are clunky, disappointing, and costly to deploy, customize and maintain. I think that in a deteriorating economy, Google’s approach or that of its surrogates like Salesforce.com will be good enough. If the price is right, Google has a great opportunity to be pulled into an organization. Sure, traditional outfits and information technology departments may balk. But when money is tight, Google can cut a great deal, and maybe some of those “old fashioned IT professionals” could be rationalized. The systems associated with that crowd may strike some youthful chief financial officers as problems, not solutions.

Third, the competitors in the enterprise space are struggling. Oracle is boosting prices. Microsoft is betting the farm on a polymorphic software solution that is really complicated. (If you have not seen the SharePoint placemat, take a gander. You can find it here.) IBM is a consulting firm with loyal customers who so far have been content to write huge checks for solutions, but in a lousy quarter, IBM could face some pressure from upstarts like Google and its partners.

In short, Google can be baffling. I think that as people learn more about Google, more warts and blotches will become visible. Nevertheless, the GOOG is following its own path. By definition, those who are not Googley cannot be expected to understand how the company works, what it is doing, and when it will take certain actions. The “why” is clear: To make money. The “how” is a baffler, but I think the approach the firm is taking is interesting and more disruptive than many think.

Agree? Disagree? Let me know.

Stephen Arnold, June 30, 2008

Update: July 1, 2008, 9:50 pm Eastern: A round up of Google’s woes is here.

Gilbane Chats Up a Silly Goose: The Arnold Interview

June 18, 2008

On Wednesday, June 18, 2008, I will be interviewed in front of an audience completely unaware of why a fellow from Harrods Creek, Kentucky, is sitting on a stage answering questions. No one is more baffled than I. Based on my knowledge of the big city, I anticipate confusion, torpor, and indifference to my comments.

In this essay, which will become available on June 18, 2008, the curious will have a reference document that summarizes my thoughts on issues about which I may be asked. There has been no dry run for this interview. The last one in which I participated–the Associated Press’s invitation-only gathering last year–left the audience with little appetite for food. Some found the beverage table a more welcome destination.

Anticipated Question 1: What’s “beyond search” mean?

In research conducted by me and others, about two-thirds of the users of an enterprise search system are dissatisfied with that system. “Beyond search” implies that we have to move to another approach because what is now available in the organizations that I and the other researchers have investigated is not well liked. Due to the cost of some systems, annoying two-thirds of the users is tantamount to getting a D or an F on a report card.

Anticipated Question 2: What’s “behind the firewall search” mean?

I wrote about the search elephant here. Many different functions involving information access are made available to an employee, contractor, or authorized user. The idea is that “behind the firewall search” is not public; it is made available by an organization to a select group of users. The “search elephant” refers to the many different ways in which search is understood and perceived within an organization.

Anticipated Question 3: Why are there so many search vendors and more coming each day?

There is a belief that existing systems are not tapping into what I have estimated to be a $2.5 billion market for information access in the enterprise. Entrepreneurs and people with money look at Google and think, “We should be able to make gains like that in the enterprise market.” I also think that the market itself is trying to figure out the search elephant. Buyers don’t know what is needed. When entrepreneurs, money, and confused customers with severe information access problems come together, we have the type of market place that exists today.

Anticipated Question 4: What about Microsoft and Fast Search & Transfer?

I understand that it is business as usual at Microsoft and Fast Search. For Microsoft, this means trying to get 10,000 motorboats to go in roughly the same direction. For Fast Search, the company continues to license its Enterprise Search Platform and service customers. There are many bits of grit in the working parts where Microsoft and Fast Search mesh. It is too soon to tell if these inhibitors are trivial or whether the machine will sputter, maybe stop. What I tell people is to ignore the Microsoft-Fast Search tie up, and get a solution for a SharePoint environment that works. There are good choices ranging from a lower cost solution like dtSearch to a competitively priced system from Coveo, Exalead, ISYS Search Software, or another Microsoft Certified vendor.

Anticipated Question 5: What’s the impact of the Google Search Appliance?

Many vendors will tell you that Google has delivered a second-class system. That’s not exactly true. With the OneBox API, Google has a very solid solution. The impact is that Google has about 10,000 enterprise customers. These are sales made, in many cases, under the noses of incumbent vendors. Google’s a player in the enterprise market and a serious one. I have uncovered one impactful bit of research at Google that could–note, I said, could–change the search landscape. I have tried to ask Google about this development, but the GOOG thinks I do not merit its attention. Too bad for me, I guess.

Anticipated Question 6: What’s the impact of text processing, semantic search, and other new technologies on enterprise search?

These are hot terms that will open doors. Some vendors will make sales because of their ability to mesh trendy concepts with more traditional search.

Stephen Arnold, June 18, 2008

Microsoft’s Web Search Strategy Revealed: The Scoble Goldberg Interview

June 16, 2008

Online video does not match my mode of learning. Robert Scoble, a laurel leaf wearer in the new world of video and text Web logs, conducted an interview with Brad Goldberg.

The interview is part of the Fast Company videos, and it is available here. The interview is remarkable, and I urge you to spend 31 minutes and listen to Brad Goldberg, General Manager of Microsoft Search Business Group.

The interview reveals useful information about the time line for Microsoft to capture market share from Google and Microsoft’s ideas for differentiating itself from Google in Web search.

Surprisingly, there were no references that I could pick up to enterprise search, nor was there any indication that Mr. Goldberg was aware of the Fast Search & Transfer Web search technology which was quite good. As you may know, Fast Search withdrew from Web search in 2003, selling its AllTheWeb.com Web index to Overture. Yahoo gobbled Overture and used bits and pieces of the Fast Search technology recently. The “auto suggest” feature is still available from Yahoo’s AllTheWeb.com site. My tests suggest that today’s AllTheWeb.com uses the Yahoo Search index built by the Slurp crawler and the Fast Search technology for some of the bells and whistles on the site. The news search function is actually quite useful. If you are not familiar with it, you can try it here.

During the interview, Mr. Goldberg uses some sample queries to illustrate his claims about Live.com’s search performance, precision, and recall. I ran the “Paris” query on each of these systems, and I ran comparative queries on this Web log as well. After the interview, I took a look at the 2005 analysis of mainstream Web search systems here so I could gauge how much change has taken place in the last three years. Quick impression: Not much. You may want to perform similar as-you-listen tests. It is easy to see what search system responds most quickly, how the search results differ, and the features that each system makes available.

Three points in Mr. Goldberg’s remarks stuck in my mind. I want to mention each of these and then offer a few observations. Judging from the edgy comments to some of my essays, I want you to know that you may not agree with me. That’s okay with me. Please, use the comments section to set me straight. Providing some facts to go along with your push back is helpful to me.

Key Points for Me

1. Parity or Microsoft’s Relevancy Is As Good as Google’s

Mr. Goldberg asserted that the major search services were at parity in terms of relevance and coverage. I found this notion somewhat difficult to comprehend. The data about Web search market share undermine any argument about parity, which means, according to my understanding of the word, “equality” or “equivalence”. I have had difficulty interpreting comments by whiz kids before, so I may be off base. My thought was that Google continues to gain market share at the expense of both Microsoft and Yahoo. The dis-parity is significant because Google, according to data mavens, accounts for 60 percent or more of user queries in the US. In Europe, the market share is higher. US search systems do not hold commanding leads in China, Korea, and other Eastern markets.

Should parity mean visual appearance, yes, Microsoft is looking more like Google. Here is the result of one of my test queries: “real estate baltimore maryland”.

[Images: Google search results and Live Search results for the test query]

On the surface these look alike. Closer inspection reveals that Google includes a canned form so I can narrow my result by location and property type. Google eliminates a step in looking for real estate in Baltimore. Microsoft’s result does not offer this feature, preferring to show “related searches”. I like the Google approach. I don’t make much use of machine-generated related queries. I have specialized tools to discern relationships in result sets.


Enterprise Information’s Missing Pieces

June 5, 2008

In 2001, I found myself on a panel talking about electronic information and enterprise search. The venue was Internet World. That’s right, the once dominant trade show for the brave new world of online.

I’m not sure how I ended up on the program, but I recall I was there, facing an audience of 250 people. Put the word “Internet” on a hand lettered sign in a diner’s window and a crowd would gather. The Internet has evolved but the missing pieces in the information puzzle are still with us.

Here’s an image from my PowerPoint deck.

[Image: puzzle pieces]

Web log graphics are “crunched” and the result is difficult for me to read. Let me highlight each of these nine pieces of the enterprise information puzzle.

  1. Graphical editor
  2. Database engine
  3. Version controls
  4. Site manipulation tools (that is, publishing tools)
  5. Personalization tools
  6. Search engine
  7. Administrative interface
  8. Usage tracking
  9. Security services

Nothing is missing. The nine elements are identified in the graphic, and in your own organization you have each of these functions up and running. Some puzzle pieces work better than others. These are complex sub systems and functions. Variability and unevenness are to be expected.

My point in 2001 was that each of these pieces was not fitted to the others. The parts are there, but until there is integration across the different sub systems and functions, the puzzle is incomplete. In fact, you don’t even have a decent picture of what the integrated results will look like.


Microsoft to ‘Innovate and Disrupt in Search’–Again

May 19, 2008

My newsreader popped this info tart in front of me this morning: “Kevin Johnson’s Memo On Yahoo & Their Strategy”. The focus of Gigaom’s Web log post is a memo, allegedly by Kevin Johnson. By the time you read this, my pathetic posting will be very old news. You need to read the memo and determine for yourself if it’s the real deal.

I’m commenting because of a series of emails I exchanged this morning about Microsoft’s search strategy. Among the points I made to the eager journalist who was, as my mother used to say, an empty vessel:

  1. Microsoft is implementing reactions, not a strategy. The cause of these knee-jerk reactions: mostly the Google and a business model challenge. Cloud services are coming round the mountain, and Microsoft can hear the whistle blowing.
  2. Yahoo has some sharp people and a truck load of search systems–Inktomi, Stata Labs, AllTheWeb.com (provided by Fast Search & Transfer), Flickr’s system, Overture’s search, and more. I’ve been told the company is rushing to be more like Google, which is not perfect, obviously. But Yahoo is grossly heterogeneous, and Google is more homogeneous in architecture.
  3. Google keeps on grinding forward. In Israel a day ago, Mr. Brin referenced Google’s multi dimensional database progress. My sources tell me that it is not progress; it is a leap frog play.

So “innovate and disrupt in search” is going to boil down to tackling these problems, forcefully, squarely, and well.

First, how many search platforms will Microsoft support? SharePoint, whizzy technology from Microsoft Research, Fast Search & Transfer’s ESP, and the legacy systems that just won’t die. Each search platform is a money hog. Get too many of these critters chomping on the cash, and you will be one poor data farmer.

Second, if–and this is a big if–Microsoft cuts a deal with Yahoo, exactly how will two shot up World War I biplanes contend with Google’s F-35? Time is running out because the GOOG keeps gobbling market and mind share. It is the number one site on the Internet and the world’s top brand. Quite a one-two punch for piston powered aircraft to shoot down.

Third, Google’s business model is based on advertising. Google wants to diversify, and Mr. Brin’s comments in Israel a day ago suggest that he wants to put a rocket booster on Google Apps. Interest in cloud-based services continues to creep up, and Google is in a good position to innovate and disrupt in that sector. The company already is innovating and disrupting in search.

We’re watching a clash of cultures and business models. When Microsoft swizzled IBM in the 1980s, it was clever. Google’s not just clever; Google has the technical platform to redefine search and enterprise applications.

Mr. Johnson’s memo does little to convince me that Microsoft–with or without Yahoo–can do much to stop Googzilla from doing Googzilla-type things.

Stephen Arnold, May 19, 2008

Enterprise Search and Train Wrecks

May 7, 2008

After I completed my interview with the Intelligenx executives, I thought about one of their comments. Iqbal Talib said, “We have many clients who want a point solution, not an enterprise solution”. An executive at Avalon Consulting wrote me today and echoed the Intelligenx comment.

Enterprise search may be a train wreck for more than half of the people who use today’s most popular systems. The Big Name vendors can grouse, stomp, and sneer at this assertion. Reality: Most of these systems disappoint their licensees. When a search system “goes off the rails”, the consequences can be unexpected.

[Image: train wreck at Montparnasse, 1895]

When an enterprise system goes off the rails, the damage is considerable. Even worse, moving the wreckage out of the way is real work. But even more difficult is earning back the confidence of the passengers.

A Case Example

A major European news organization licensed a Big Name system. The company ponied up a down payment and asked for a fast-cycle installation. After six months of dithering, the Big Name admitted that it did not have an engineer available who could perform the installation and customization the paying customer wanted.

The news organization pulled the plug. The company then licensed one of the up-and-coming systems profiled in Beyond Search. The revamped system was available in less than three weeks at a fraction of the cost for the Big Name system.

The new system works, and it has become a showcase for the news organization. For the Big Name, the loss of the account eroded already shaky finances and became the talk of cocktail parties at industry functions.

Ever wonder how much churn Big Name enterprise search vendors experience in a year? You can get a good idea by comparing the customer lists of the best-known enterprise search vendors. The overlap is remarkable because large companies work their way through the systems. Now more are turning to up-and-coming vendors’ systems. The Big Names are facing some sales push back. Take a look at the financials for publicly traded search vendors. Look for days-sales-outstanding data. Look at the cash reserves. Look at the footnotes about restating financials.

What you may find is that fancy dancing is endemic.

How Many Search Systems Does One Company Need?

What haunts me is the overlap among vendors. Early in 2003, I conducted a poll of Fortune 1000 companies. The methodology was simple: I sent an email with several basic questions to people whom I knew at 150 different large organizations. I received a response rate of about 70 percent, which was remarkable. One question I asked five years ago was, “How many enterprise search systems do you have?”


Searchenstein: Pensée d’escalier

May 1, 2008

At the Boston Search Engine Meeting, I spoke with a certified search wizard-ette. As you know, my legal eagle discourages me from proper noun extraction in my Web log essays. This means I can’t name the person, nor can I provide you with the name of her employer. You will have to conjure a faceless wizard-ette from your imagination. But she’s real, very real.

Set up: the wizard-ette wanted to ask me about Lucene as an enterprise search system. But that was a nerd gambit. The real question was, “Will I be able to graft an add-on semantic processing or text mining system on top of Lucene and make the hybrid work?”

The answer is, “Yes but”. Most search and content processing systems are monsters. Some are tame; others are fierce. Only a handful of enterprise search systems have been engineered to be homogeneous.

I knew this wizard-ette wasn’t enthralled with a “yes but”. She wanted a definitive, simple answer. I stumbled and fumbled. Off she drifted. This short essay, then, contains my belated pensée d’escalier.

What Is a Searchenstein?

A searchenstein is a content processing or information access system that contains a great many separate pieces. These different systems, functions, and sub systems are held together with scripts; that is, digital glue or what the code jockeys call middleware. The word middleware sounds more patrician than scripts. (In my experience, a big part of the search and retrieval business reduces to word smithing.)

Searchenstein is a search and content processing system cobbled together from different parts. There are several degrees of searchensteinism. There’s a core system built to a strict engineering plan and then swaddled in bastard code. Instead of working to the original engineering plan, the MBAs running the company take the easier, cheaper, and faster path. Systems from the Big Three of enterprise search are made up of different parts, often from sources that have little knowledge of or interest in the system onto which the extras will be bolted. Other vendors have an engineering plan, and the third-party components are more tastefully integrated. This is the difference between a car customization by a cash-strapped teen and the work of Los Angeles after market specialists who build specialized automobiles for the super rich.

[Image: searchenstein]

This illustration shows the body parts of a searchenstein. In this type of system, it’s easy to get lost in the finger pointing when a problem occurs. Not only are the dependencies tough to figure out, it’s almost impossible to get one’s hand on the single throat to choke.

Another variant is to use many different components from the moment the company gets in the search and content processing business. The complexities of the system are carefully hidden, often in a “black box” or beneath a zippy interface. You can’t fiddle with the innards of the “black box.” The reason, according to the vendor, may be to protect intellectual property. Another reason is that the “black box” is easily destabilized by tinkering.

