Search: The Three Curves of Despair

March 27, 2008

For my 2005 seminar series “Search: How to Deliver Useful Results within Budget”, I created a series of three line charts. One of the well-kept secrets about behind-the-firewall search is that costs are difficult, if not impossible, to control. That presentation is not available on my Web site archive, and I’m not sure I have a copy of the PowerPoint deck at hand. I did locate the Excel sheet for the chart which appears below. I thought it might be useful to discuss the data briefly and admittedly in an incomplete way. (I sell information for a living, so I instinctively hold some back to keep the wolves from my log cabin’s door here in rural Kentucky.)

Let me be direct: Well-dressed MBAs and sallow financial mavens simply don’t believe my search cost data.

At my age, I’m used to this type of uninformed skepticism or derisory denial. The information technology professionals attending my lectures usually smirk the way I once did as a callow nerd. Their reaction is understandable. And I support myself by my wits. When these superstars lose their jobs, my flabby self is unscathed. My children are grown. The domicile is safe from creditors. I’m offering information, not re-jigging inflated egos.

Now scan these three curves.

[Figure: the search curves]

© Stephen E. Arnold, 2002-2008.

You see a gray line. That is the precision / recall curve. This refers to a specific method of determining if a query returns results germane to the user’s query and another method for figuring out how much germane information the search system missed. Search and a categorical affirmative such as “all” do not make happy bedfellows. Most folks don’t know what a search system does not include. Could that be one reason why the “curves of despair” evoke snickers of disbelief? Read more
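For readers who want the arithmetic behind the gray line, here is a minimal sketch in Python. It assumes the textbook definitions: precision is the share of returned results that are germane, and recall is the share of all germane documents that the system actually returned. The function name and the document identifiers are made up for illustration; they are not data from my chart.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document ids the search system returned (hypothetical data)
    relevant:  document ids a human judged germane to the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # germane documents the system found

    # Guard against empty sets so an empty result list does not divide by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative query: four results returned, five germane documents exist.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc4", "doc7", "doc8", "doc9"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Note that recall cannot be computed without the full list of germane documents, which is exactly the thing most folks do not have.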

Autonomy: Leading the Search Herd with Its Positioning

March 26, 2008

Autonomy Corporation rolled out its Pan-Enterprise Search platform at a trade show in Baltimore, Maryland, on March 26, 2008.

The company has been able to stay one or two steps ahead of other behind-the-firewall search vendors since the company rolled out its “portal in a box” campaign in 1999. Autonomy was first out of the gate with its smart desktop search system Kenjin in 2000. Then Autonomy was one of the first search-and-retrieval vendors to redefine its system as a platform.

Today’s announcement gives IDOL a positioning that may force super platform vendors such as IBM, Microsoft, and Oracle to do a better job of explaining what their behind-the-firewall systems deliver to a customer.

The Sun Herald quoted Mike Lynch, founder and chief executive officer of Autonomy, as saying:

“Despite standardization efforts, information is scattered across the enterprise among different vendors’ software, in different formats, and among numerous servers and laptops. Autonomy’s Pan-Enterprise Search platform is the only FRCP [Federal Rules of Civil Procedure]-compliant enterprise search platform available in the market, delivering a single unified and vendor-neutral platform for searching all [sic] file formats and media-types for legal and business purposes.”

New IDOL features in the Pan-Enterprise Search platform include video indexing, enhanced geographic clustering, and an improved relevancy method. The new approach–intent-based ranking–uses algorithms to determine a user’s intent. Autonomy asserts that its new approach matches results to the user’s context. Autonomy said it made changes to enhance system performance. A new multi-dimensional index rounds out the information platform.

Additional information about the Pan-Enterprise Search platform is available from Autonomy.

Stephen Arnold, March 26, 2008

Search Waves: Are We Living through Periodicity?

March 26, 2008

I’m fascinated with cyclical phenomena. When working on my graduate degree, I accepted a grant from Duquesne University in 1967. Located in Pittsburgh, Pennsylvania, this Jesuit university was little-c catholic. All faiths were acceptable. One of my professors and friends, Dr. Richard Oehling, told me, “Where else could an orthodox Jew teach the Protestant reformation to a group of Jesuits?” Such was Duquesne.

In one course, I confronted phenomenological existentialism, then a hot concept in philosophy. Although I was busy indexing sermons in Latin using ancient mainframes, the fuzzy-wuzzy world of existential philosophy caught my attention. I had zero clue about epistemology, heuristics, and other concepts that whipped serious students of different beliefs and backgrounds into a frenzy. This philosophical banter was better at stirring up emotions than a breakdown on the Squirrel Hill bus.

So, what’s this got to do with search?

I’m not sure, but there’s a thesis-antithesis-synthesis dialectic rippling the fabric of acquisitions, start ups, and “old wine in new bottles” innovations I read about in news releases. Just this morning, my Google Alert service informed me of “A New Wave of Enterprise Search”. The essay appeared in the CMSwatch Trendwatch blog. The key sentence to me was, “There’s a growing movement afoot to de-throne the old guard; talk of replacing FAST and Autonomy seemed to be uttered by every vendor that wasn’t a household name.” Read more

Search: A Kitchen Sink and the Carcassonne Problem

March 25, 2008

As I worked on my keynote for the upcoming Buying and Selling eContent Conference in April 2008, I flipped through PowerPoint decks in search of examples. I came across a presentation I delivered in the summer of 2006. In that talk, I described behind-the-firewall search as following an interesting trajectory. Humans have a tendency to elaborate, embroider, and complicate.

Let me give you an example. My mother and father recently moved from their home to a condominium-style dwelling. The “space” was a blank canvas. After a year, I noticed that the white space was filled in. Some of the objects were family mementos like the hand-carved ebony elephant that has been in the Arnold family for a century. But other acquisitions were plaques identifying my mother as a “red hat lady”. My father had taped instructions for replacing the cartridge in his printer next to his flat panel monitor. In short, the white space was being filled in.

I noticed a similar “stuffing” when I was in Carcassonne, the walled city in Aude. Every square inch inside the city walls had been put to use. Read more

Vivisimo’s Founders Interview Reveals Enterprise Plans

March 24, 2008

An interview with Vivisimo’s founders, Raul Valdes-Perez and Jerome Pesenti, is now available from ArnoldIT.com. You can access the full text of the interview from ArnoldIT.com’s Search Engine Wizards speak page.

The origins of the company and its clustering technology provide new insights into the company’s success. With an infusion of $4.0 million from North Atlantic Capital in early March 2008, Messrs. Valdes-Perez and Pesenti reveal their plans to expand their presence in behind-the-firewall search.

Based in Pittsburgh, Pennsylvania, this Carnegie Mellon University spin-out plans to challenge the likes of Autonomy, Endeca, and Fast Search & Transfer (Microsoft) in the enterprise market.

The full list of ArnoldIT.com’s search wizard interviews is located at ArnoldIT.com. This unique series of interviews is designed to let those interested in behind-the-firewall search “hear first hand” how some of today’s most innovative systems entered the market.

Stephen Arnold, March 24, 2008

New Search System for “Beyond Search”

March 23, 2008

ArnoldIT.com shifted to the Blossom hosted search solution for the Web log “Beyond Search”. Blossom Software offers various packages of search services for Web sites and Web logs as well as the company’s behind-the-firewall search system. Stuart Schram, ArnoldIT’s chief technology officer, said, “Blossom’s Web log search is a quantum leap beyond the default WordPress search function.”

The Blossom Web log search, as implemented on Beyond Search, allows fast, intuitive searching. Users can enter words or phrases and see relevance-ranked hits for postings matching the query.

More information about Blossom is available at the company’s Web site at www.blossom.com. If you want to see how Blossom handles your Web log, you can sign up for a demonstration at http://www1.blossom.com/search_signup.php.

Stephen Arnold, March 23, 2008

Search: Appearances Are Deceiving

March 22, 2008

In Toronto, Ontario, several years ago, I attended a lecture in which the speaker (whose name I have forgotten) asked the audience, “What do you see?” When I saw this illustration, I saved it. My source was the University of Toronto. What do you see?

[Figure: wheels illusion]

My myopic eyes see wheels that rotate. When I focus my attention on a single “wheel”, nothing moves. When I shift my vision, some wheels turn.

Search and retrieval is to some people similar to this illusion. I wish I could assure you that “search” will settle down, allow us to examine it carefully, and remain fixed if we shift our attention to another problem. I can’t. Search is a blob of digital mercury, and we are — at least for the foreseeable future — going to find that it’s elusive. Perception of the viewer “defines” search.

Why is this important?

On March 21, 2008, I spoke to a journalist who asked me, “What’s the difference between Intranet search and a company’s Web site search system?” The distinction is important because information behind the firewall is usually viewed as “for employees only”. There are exceptions such as a consultant or attorney who needs to examine information residing on an organization’s servers. The idea is that a user name, password, and even other types of authentication may be required to tap into invoices, customer information, marketing and sales materials, and other organization information and data. Read more

Vivisimo’s Founders Interviewed: Raul Valdes-Perez and Jerome Pesenti

March 21, 2008

In mid-March, Vivisimo received an infusion of $4 million from North Atlantic Capital. Vivisimo has emerged as a full-scale “behind the firewall” search provider. The company landed the high-profile search-and-retrieval deal with the US Federal government for USA.gov, the public-facing portal for government information. Then, the company inked a deal with Interwoven, the content management company, to provide a search and content processing system for the Interwoven CMS.

Some pundits see Vivisimo as a specialist vendor. That view of the company is incorrect. My sources tell me that Vivisimo is finding itself invited to bid on a range of commercial, government, and association projects. Executives at some well-known, high-profile search firms have asked me about Vivisimo. In my experience, this means Vivisimo is doing something right.

Read more

Google Wins FCC Auction: It Comes Away Empty-Handed

March 21, 2008

Left out in the cold with no licenses in the 700 MHz spectrum auction this week, Google promises to be the company that will eventually come in from the cold, using its open-source Android operating system to enter the mobile phone market in a big way.

That is, if Verizon Wireless, which won the lion’s share of the coveted C-Block spectrum in the FCC auction, doesn’t get in Google’s way.

Here is the back story: Months before the auction got underway, Google announced it wanted to ensure the FCC would set aside some spectrum for open access whereby consumers would be able to use interchangeable devices and services on the spectrum. Verizon Wireless and AT&T initially opposed the idea, but then relented as FCC Chairman Kevin Martin went along with Google’s proposal.

So now with Verizon Wireless in the proverbial catbird seat, it is preparing to open up its wireless network, but on its own terms. Verizon Communications’ chairman and chief executive Ivan Seidenberg this week told a developers group that the company is drawing up certification measures so non-Verizon Wireless devices and services can operate on its network by the end of the year.

“The next wave of growth will come from a whole new generation of devices,” Seidenberg told the developers. “Our goal is to make our network the on-ramp for the next phase of wireless innovation.”

The big question for Google now is how difficult it will be for its Android project to develop devices and services for Verizon Wireless networks.

Google hailed the results of the auction even though it had bid for the C-Block spectrum and came away empty-handed. In their blog, Google’s top lawyers, Richard Whitt and Joseph Faber, said:

“Consumers whose devices use the C-Block of spectrum soon will be able to use any wireless device they wish, and download to their devices any applications and content they wish.”

The attorneys didn’t say so, but consumers will also be able to watch lots of Google-sponsored ads on their handsets, even if they don’t necessarily wish to.

Teragram: SAS’s Search Launchpad

March 20, 2008

This week SAS announced that it purchased Teragram, a content processing company with deep roots in computer science, mathematics, and blue-chip clients. If you poke around Teragram’s Web site, you learn that the company supports double byte languages. If I read the Teragram information correctly, this little-known outfit not far from Harvard Yard has proprietary technology strongly suggestive of the super-sophisticated techniques in use at IBM, Google, Microsoft, and Yahoo.

The Teragram system can match other systems’ advanced functions. NLP (natural language processing)? Automatic summarization? No problem. Hosted services option? Check. Autonomy- and Recommind-type pattern matching? Done. Attensity- and Bitext-style linguistic analysis? Covered. Teragram has a warehouse chock full of search and content processing goodies.

Now SAS owns this “search tech” tool box.

Teragram, founded in 1997, was a privately-held content processing company in Cambridge, Massachusetts. Two wizards — both from Luxembourg — have applied their computer science and mathematical expertise to unstructured information for more than a decade. That’s a long time in the fast-moving search and text processing sector.

I learned about Teragram when someone told me that the company was a technology provider to Fast Search & Transfer SA. Fast Search’s Dr. John Lervik is a canny technologist, and he has a good nose for solid technology.

Read more
