Vertical Search: A Chill Blast from the Past

January 15, 2008

Two years ago, a prestigious New York investment banker asked me to attend a meeting without compensation. I knew my father was correct when he said, “Be a banker. That’s where the money is.” My father didn’t know Willie Sutton, but he has money insight. The day I arrived the bankers’ topic was “vertical search,” the next big money maker in search, according to the vice president who escorted me into a conference room overlooking the East River.

As I understood the notion from these financial engineers, certain parties (translation: publishers) had a goldmine of content (translation: high-value information created by staff writers and freelancers). The question asked was: “Isn’t a revenue play possible using search-and-retrieval technology and a subscription model?”

There’s only one answer that New York bankers want to hear, and that is, “I think there is an opportunity for an upside.” I repeated the catch phrase, and the five money mavens smiled. I was a good Kentucky consultant, and I had on shoes too.

My recollection is that everyone in the Park Avenue meeting room was well-groomed, scrupulously polite, and gracefully clueless about online. The folks asking me to stop by for a chat listened to me for about 60 seconds and then fired questions at me about Web 2.0 technology (which I don’t fully grasp), online stickiness (which means repeat visitors and time spent on a Web site), and online revenue growth (which I definitely understand after getting whipsawed with costs in 1993 when I was involved with The Point, the Top 5% of the Internet site). Note: we sold this site to Lycos in 1995, and I vowed not to catch spreadsheet fever again. Spreadsheet fever is particularly contagious in the offices of New York banks.

This morning (Tuesday, January 15, 2008) I read a news story about Convera’s vertical search solution. The article explained that Lloyd’s List, a portal reporting the doings in the shipping industry, was going online with a “vertical search solution.”

The idea, as I understand it, is that a new online service called Maritime Answers will become available in the future. Convera Corporation, a one-time big dog in the search-and-retrieval sled races, would use its “technical expertise to provide a powerful search tool for the shipping community.” (Note: in this essay I am not discussing the sale of Convera’s search-and-retrieval business to Fast Search & Transfer or the capturing by Autonomy of some of Convera’s key sales professionals in 2007.)

Vertical Search Defined

In my first edition of The Enterprise Search Report, I included a section about vertical search. I cut out that material in 2003 because the idea seemed outside the scope of “behind the firewall” search. In the last five years, the notion of vertical search has continued to pop up as a way to serve the needs of a specific segment or constituency in a broader market.

Vertical search means limiting the content to a specific domain. Examples include information for attorneys. Companies in the vertical search business for lawyers include Lexis Nexis (a unit of Reed Elsevier) and Westlaw (a service absorbed into the Thomson Corporation). A person with an interest in a specific topic, therefore, would turn to an online system with substantial information about a particular field. Examples range from the U.S. government’s health information available as Medline Plus to Game Trade Magazine, with tens of thousands of other examples in between. One could make a good case that Web logs on a specific topic and a search box are vertical search systems.

The idea is appealing because if one looks for information on a narrow topic, a search system with information only on that topic, in theory, makes it easier to find the nugget or answer the user seeks — at least to someone who doesn’t know much about the vagaries of online information. I will return to this idea in a moment.

Commercial Databases: The Origin of Vertical Search

Most readers of this Web log will have little experience with using commercial databases. The big online vendors have found themselves under siege by the Web and their own actions.

In the late 1960s when the commercial online business began with an injection of U.S. government funding, the only kind of database possible was one that was very narrow. The commercial online services offered specific collections of information on very narrow topics or information confined to a specific technical discipline. By 1980, there were some general business databases available, but these were narrowly constrained by editorial policies.

In order to make the early search-and-retrieval systems useful, database publishers (the name given to the people and companies who built databases) had to create fields, or what today would be called “XML document type definitions.” The database builders would pay indexers to put the name of the author, the title of the source, the key words from a controlled term list, and other data (now called metadata) into these fields.

In 1980 the user would pay a fee to get an account with an online vendor. The leaders of a quarter century ago mean very little to most online users today. The Googles and Microsofts of 1980 were Dialog Corporation, BRS, SDC, and a handful of others such as DataStar.

Every database or “file” on these systems was a vertical database. Users of these commercial systems would have to learn the editorial policy of a particular database; for example, ABI / INFORM or PROMT. When Dialog was king, the service offered more than 300 commercial databases, and most users picked a particular file and entered queries using a proprietary syntax. For example, to locate marketing information from the most recent update to the ABI / INFORM database one would enter into the Dialog command line: SS UD=9999 and CC=76?? and marketing. If a user wanted chemical information, the Chemical Abstracts service required the user to know the specific names and structures of chemicals.
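For readers who never sat at a command line of this kind, here is a minimal Python sketch of what a fielded query such as UD=9999 and CC=76?? and marketing asks the system to do. It is an illustration, not Dialog's software. The records, the function names, and my reading of the syntax are all assumptions: I treat UD=9999 as "the most recent update" (as described above), I treat each trailing ? as at most one extra character, and I leave out the SS command prefix.

```python
import re

# Hypothetical records. The field names UD (update code) and CC
# (classification code) come from the query in the text; every value
# here is invented for illustration.
RECORDS = [
    {"id": 1, "UD": "8001", "CC": "7600", "text": "Marketing plans for regional banks"},
    {"id": 2, "UD": "8002", "CC": "7610", "text": "Direct marketing by mail order firms"},
    {"id": 3, "UD": "8002", "CC": "2300", "text": "Marketing a new accounting standard"},
    {"id": 4, "UD": "8002", "CC": "7620", "text": "Sales force compensation"},
]


def truncation_to_regex(pattern):
    """Turn '76??' into a regex: '76' plus up to two more characters.

    Assumption: each trailing '?' allows at most one extra character.
    """
    stem = pattern.rstrip("?")
    extra = len(pattern) - len(stem)
    return re.compile(re.escape(stem) + ".{0," + str(extra) + "}")


def matches(record, clause, latest_update):
    """Check one clause: either FIELD=value or a bare word in the text."""
    if "=" in clause:
        field, value = clause.split("=", 1)
        if field == "UD" and value == "9999":
            # Assumed sentinel meaning "most recent update to the file".
            return record["UD"] == latest_update
        return truncation_to_regex(value).fullmatch(record[field]) is not None
    return clause.lower() in record["text"].lower()


def select(query, records):
    """Return ids of records that satisfy every clause joined by 'and'."""
    clauses = [c.strip() for c in re.split(r"\s+and\s+", query, flags=re.IGNORECASE)]
    latest = max(r["UD"] for r in records)
    return [r["id"] for r in records if all(matches(r, c, latest) for c in clauses)]


print(select("UD=9999 and CC=76?? and marketing", RECORDS))
```

The point of the sketch is that the user had to know three things before typing: the update code convention, the classification scheme, and the free-text term. None of that knowledge transferred to the next file.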

Characteristics of These Original Vertical Databases

A peculiar characteristic of a collection of information on a topic or in a field is not understood by most users or investment bankers. The more narrow the content collection, the greater the need for a specialized vocabulary. Let me give an example. In the ABI / INFORM file it was pointless to search for the concept via the word “management.” The entire database was “about” management. Therefore, a careless query would, in theory, return a large number of hits. We, therefore, made “management” a stop word; that is, one that would not return results. We forced users to access the content via a controlled vocabulary, complete with Use For and See Also cross references. We created a business-centric classification coding scheme so a user could retrieve the marketing information using the command CC=76??.
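A rough Python sketch of the stop word and controlled vocabulary idea follows. The thesaurus entries are invented for illustration and are not the actual ABI / INFORM term list; the function name resolve and the dictionary layout are likewise my own assumptions.

```python
# Words too common in this file to be useful as search terms.
STOP_WORDS = {"management"}

# Hypothetical thesaurus. "use_for" lists non-preferred terms that point to
# the preferred term; "see_also" lists related preferred terms.
THESAURUS = {
    "employee turnover": {
        "use_for": ["staff attrition", "labor turnover"],
        "see_also": ["job satisfaction"],
    },
    "market research": {
        "use_for": ["marketing research"],
        "see_also": ["consumer behavior"],
    },
}


def resolve(term):
    """Map what the user typed to a preferred term, or explain why not."""
    t = term.strip().lower()
    if t in STOP_WORDS:
        return {"status": "stop word", "term": None, "see_also": []}
    for preferred, entry in THESAURUS.items():
        if t == preferred or t in entry["use_for"]:
            return {"status": "ok", "term": preferred, "see_also": entry["see_also"]}
    return {"status": "not in vocabulary", "term": None, "see_also": []}


for typed in ["management", "staff attrition", "market research", "synergy"]:
    print(typed, "->", resolve(typed))
```

The design choice is the same one we made by hand: the system refuses the useless term, redirects the near miss, and suggests neighbors, so the user lands on the indexer's vocabulary rather than his own.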

Another attribute of vertical content or deep information on a narrow subject is that the terminology shifts. When a new development occurs in oil and gas, the American Petroleum Institute had to identify this term and take steps to map the new idea to content “about” that new subject. Let me give an example from a less specialized field than oil exploration. You know about an acquisition. The term means one company buys another. In business, however, the word takeover may be used to describe this action. In financial circles, there will be leveraged buyouts, a venture capital buyout, or a management buyout. In short, the words used to describe an acquisition evidence the power of English and the difficulty of creating a controlled vocabulary for certain fields. The paradox is that the deeper the content in detail and through time, the more complicated the jargon becomes. A failure to search for the appropriate terms means that information on the topic is not retrieved. In the search systems of yore, the string required to get the information from ABI / INFORM on acquisitions would require an explicit query with all of the terms present.
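To make the vocabulary problem concrete, here is a small Python sketch that uses the acquisition terms listed above. The sample documents are invented, and the OR string is generic Boolean notation for illustration, not the syntax of any particular online system.

```python
# Terms taken from the paragraph above.
ACQUISITION_TERMS = [
    "acquisition",
    "takeover",
    "leveraged buyout",
    "venture capital buyout",
    "management buyout",
]

# Invented sample documents.
DOCS = [
    "The board approved the acquisition of a rival publisher.",
    "Analysts expect a hostile takeover bid next quarter.",
    "Executives financed a management buyout of the division.",
    "The firm opened a new office in Louisville.",
]


def build_or_query(terms):
    """Spell out every variant explicitly, quoting multi-word phrases."""
    quoted = ['"' + t + '"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def hits(docs, terms):
    """Return the documents that mention at least one of the terms."""
    return [d for d in docs if any(t in d.lower() for t in terms)]


print(build_or_query(ACQUISITION_TERMS))
print(len(hits(DOCS, ["acquisition"])), "hit(s) for the single word")
print(len(hits(DOCS, ACQUISITION_TERMS)), "hit(s) for the full term list")
```

A searcher who types only the one familiar word misses most of what the file holds on the subject, which is the paradox in miniature.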

Vertical Search 2008

Convera is a company that has faced some interesting and challenging experiences. The company’s core technology was rooted in scanning paper documents, converting these documents to ASCII via optical character recognition, and then making the documents searchable via an interface. In 1995 the company paid $33 million for ConQuest Software, developed by a former colleague of mine at Booz, Allen & Hamilton. In 2002 Convera also acquired Semio’s Claude Vogel, a rocket scientist who has since left Convera. With Allen & Co., a New York firm, in the picture, Convera embarked on a journey to reinvent itself. This is an intriguing case example, and I may write about it in the future.

The name “Convera” was adopted in 2000 when Excalibur Technologies landed a deal with Intel. After the Intel deal went south about the same time a Convera deal with the NBA ran aground, the Convera name stuck. Convera in the last eight years has worked to reduce its debt, find new sources of revenue, and finally divested itself of its search-and-retrieval business, emerging as a provider of vertical search. I have not done justice to a particularly interesting case study in the hurdles companies face when those firms try to make money without a Google-type business model.

Now Convera is in the vertical search business. It uses its content acquisition technology, that is, crawlers and parsers, to build indexes. Convera has word lists for specific markets such as law enforcement and health as well as technology that automatically indexes, classifies, and tags processed content. The company also has server farms that can provide hosted or managed search services to its customers.

Instead of competing with Google in the public Web indexing space, Convera’s business model, as I understand it, approaches a client who wants to build a vertical content collection. Convera then indexes the content of certain Web sites and any content the customer, such as a publisher, has. The customer pays Convera for its services. The customer then either gives away access to the content collection or charges its own users a fee to access the content.

In short, Convera is in the vertical search business. The idea is that Convera’s stakeholders get money by selling services, not licensing a search-and-retrieval engine to an enterprise. Convera’s interesting history makes clear that enterprise software and joint ventures such as those with Intel can lose big money, more than $600 million give or take a couple hundred million. Obviously Convera’s original business model lacked the lift its management teams projected.

The Value of Vertical Search

The value of vertical search depends upon several factors that have nothing to do with technology. The first factor is the desire of a customer such as a publisher like Lloyd’s List to find a new way to generate growth and zip from a long-in-the-tooth information service. Publishers are in a tough spot. Most are not very good at technical foresight. More problematic, the online options can cannibalize their existing revenues. As a business segment, traditional publishing is a hostile place for 17th-century business models.

Another factor is the skill of the marketers and sales professionals. Never underestimate the value of a smooth-talking peddler. Big deals can be done on the basis of charm and a dollop of FUD: fear, uncertainty, and doubt.

A third element is the environmental pressures that come from companies and innovators largely indifferent to established businesses. One example is the Google-Microsoft-Yahoo activity. Each of these companies is offering online access to information mostly without direct fees to the user. The advertisers foot the bill. All three are digitizing books, indexing Web logs or social media, and working with certain third parties to offer certain information. Even Amazon is in the game with its Kindle device, online service, and courtesy fee for certain online Web log content. Executives at these companies know about the problems publishers face, but there’s not much executives at these companies can do to alter the tectonic shift underway in information access. I know I wouldn’t start a traditional magazine or newspaper even though for decades I was an executive in newspaper and publishing companies like the Courier Journal & Louisville Times and Ziff Communications.

Vertical Search: Google Style

You can create your own vertical search system now. You don’t have to pay Convera’s wizards for this service. In fact, you don’t have to know how to program or do much more than activate your browser. Google will allow anyone to create a custom search engine, which is that company’s buzzword for vertical search system. Navigate to Google’s CSE page and explore. If you want to see the service in action, navigate to Macworld’s beta.

We’ve come full circle in a sense. The original online market was only vertical search; that is, very specific collections of content on a particular topic or discipline. Then we shifted to indexing the world of information. Now, the Google system allows anyone to create a very narrow domain of content.

What’s this mean? First, I am not sure the Convera for-fee approach will be as financially rewarding as the company’s stakeholders expect. Free is tough to beat. For a publisher wanting to index proprietary content, Google will license a Google Search Appliance. With the OneBox API, it is possible to integrate the Google CSE with the content processed by the GSA. Few people recognize that Google’s approach allows a technically savvy person or one who is Googley to replicate most of the functionality on offer from the hundreds of companies competing in the “beyond search” markets.

Second, a narrow collection built on spidering a subset of Web sites, by definition, will face some cost hurdles. Companies providing custom subsets by direct spidering and content processing will see their costs rise over time. These costs can be controlled by cutting back on the volume of content spidered and processed. Alternatively, the quality of service or technical innovations will have to be scaled to match available resources. Either way, Google, Microsoft, and Yahoo may control the fate of the vertical search vendors.

Finally, the enthusiasm for vertical search may be predicated on misunderstanding available information. There is a big market for vertical search in law enforcement, intelligence, and pharmaceutical competitive intelligence. There may be a market in other sectors, but with a free service like Google’s getting better with each upgrade to the Google service array, I think secondary and tertiary markets may go with the lower-cost alternative.

Stakeholders in Convera don’t know the outcome of Convera’s vertical search play. One thing is certain. New York bankers are mercurial, and their good humor can disappear with a single disappointing earnings report. I will stick with the motto, “Surf on Google” and leave certain types of search investments to those far smarter than I.

Stephen E. Arnold
January 15, 2008, 10 am

Written by Stephen E. Arnold · Filed Under Vertical search

Comments

2 Responses to “Vertical Search: A Chill Blast from the Past”

Andy Black on January 31st, 2008 7:55 am

Stephen

The E-consultancy/Convera “Vertical Search Survey 2008” has just been released and reveals some very interesting information. To download a free copy of the full report, click here http://www.convera.com/survey/

The survey was circulated to members of the Association of Online Publishers (AOP), American Business Media (ABM), Internet Advertising Bureau (IAB UK) and E-consultancy’s early-adopter community of internet marketers.

CPM will be fastest-growing revenue stream for publishers in 2008.

Online revenue set to increase while print income flattens or decreases.

Content owners must ensure visibility within fragmenting digital landscape by embracing RSS, widgets and toolbars.

Publishers see vertical search as opportunity to ‘reclaim the online community from Google’.

The fastest-growing revenue streams for publishers in 2008 will be internet display advertising and online sponsorship.

Some 72% of publishers are expecting an increase in income from CPM advertising next year and 67% are predicting a rise in digital sponsorship, while print revenues are more likely to flatten or decrease. Just under two thirds (64%) are expecting a rise in paid search (PPC) revenue.

The research also highlights the need for specialist publishers to react quickly to major changes in the digital environment in order to maintain and increase their market share and visibility.

Publishers need to adapt to maximize their digital revenues at a time of shifting advertising budgets. Trends in digital marketing are leading towards a fragmentation of the online landscape and ‘atomization’ of content. Content owners have a great opportunity to increase visibility for their content through the effective use of vertical search, feeds, widgets and toolbars.

The level of uptake for feeds and customized homepages is very high among the early-adopter audience surveyed, but this kind of online behavior will soon become more widespread among knowledge workers across a wider range of industries.

Some 93% of more than 500 media and internet professionals said that they would be ‘very likely’ or ‘quite likely’ to use a search engine that focused on serving their specific business or work needs.

More than 70% of publishers perceived ‘reclaiming the online community from Google’ to be either a major benefit or a minor benefit from vertical search.


Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.


Beyond Search