Software and Smart Content

October 30, 2011

I was moving data from Point A to Point B yesterday, filtering junk that has marginal value. I scanned a news story from a Web site which covers information technology with a Canadian perspective. The story was “IBM, Yahoo turn to Montreal’s NStein to Test Search Tool.” In 2006, IBM was a pace-setter in search development cost control. The company was relying on the open source community’s Lucene technology, not the wild and crazy innovations from Almaden and other IBM research facilities. Web Fountain and jazzy XML methods were promising ways to make dumb content smart, but IBM needed a way to deliver the bread-and-butter findability at a sustainable, acceptable cost. The result was OmniFind. I had made a note to myself that we tested the Yahoo OmniFind edition when it became available and noted:

Installation was fine on the IBM server. Indexing seemed sluggish. Basic search functions generated a laundry list of documents. Ho hum.

Maybe this comment was unfair, but five years ago, there were arguably better search and retrieval systems. I was in the midst of the third edition of the Enterprise Search Report, long since bastardized by the azure chip crowd and the “real” experts. But we had a test corpus, lots of hardware, and an interest in seeing for ourselves how tough it was to get an enterprise search system up and running. Our impression was that most people would slam in the system, skip the fancy stuff, and move on to more interesting things such as playing Foosball.

Thanks to Adobe for making software that creates a need for Photoshop training. Source: http://www.practical-photoshop.com/PS2/pages/assign.html

Smart, Intelligent… Information?

In this blast from the past article, NStein’s product in 2006 was “an intelligent content management product used by media companies such as Time Magazine and the BBC, and a text mining tool called NServer.” The idea was to use search plus a value adding system to improve the enterprise user’s search experience.

Now the use of the word “intelligent” to describe a content processing system has a long history, reaching back through the decades to computer aided logistics and forward to the Extensible Markup Language methods.

The idea of “intelligent” is a pregnant one, with a gestation period measured in decades.

Flash forward to the present. IBM markets OmniFind and a range of products which provide basic search as a utility function. NStein is a unit of OpenText, and it has been absorbed into a conglomerate with a number of search systems. The investment needed to update, enhance, and extend BASIS, BRS Search, NStein, and the other systems OpenText “sells” is a big number. “Intelligent content” has not been an OpenText buzzword for a couple of years.

The torch has been passed to conference organizers and a company called Thoora, which “combines aggregation, curation, and search for personalized news streams.” You can get some basic information in the TechCrunch article “Thoora Releases Intelligent Content Discovery Engine to the Public.”

In two separate teleconference calls last week (October 24 to 28, 2011), “intelligent content” came up. In one call, the firm was explaining that traditional indexing systems missed important nuances. By processing a wide range of content and querying a proprietary index of the content, the information derived from the content would be more findable. When a document was accessed, the content was “intelligent”; that is, the document contained value added indexing.

The second call focused on the importance of analytics. The content processing system would ingest a wide range of unstructured data, identify items of interest such as the name of a company, and use advanced analytics to make relationships and other important facets of the content visible. The documents were decomposed into components, and each of the components was “smart”. Again the idea is that the fact or component of information was related to the original document and to the processed corpus of information.

No problem.
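
The pitch from the second call can be sketched in a few lines. What follows is a minimal illustration, not any vendor's method: it assumes a hypothetical hand-built list of company names (`KNOWN_COMPANIES`) in place of a real entity recognizer, splits each document into sentence-level components, tags the companies found in each one, keeps a pointer back to the source document, and counts which companies turn up together. The sample corpus and every name in the code are invented for illustration.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field
from itertools import combinations

# Hypothetical gazetteer; a real system would use a trained entity recognizer.
KNOWN_COMPANIES = ("IBM", "Yahoo", "OpenText", "NStein")


@dataclass
class Component:
    doc_id: str      # pointer back to the original document
    position: int    # sentence index within that document
    text: str
    companies: list = field(default_factory=list)


def decompose(doc_id, text):
    """Split a document into sentence components and tag company names."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    components = []
    for i, sentence in enumerate(sentences):
        found = [
            c for c in KNOWN_COMPANIES
            if re.search(rf"\b{re.escape(c)}\b", sentence)
        ]
        components.append(Component(doc_id, i, sentence, found))
    return components


def relate(components):
    """Count how often two companies appear in the same component."""
    pairs = defaultdict(int)
    for comp in components:
        for a, b in combinations(sorted(comp.companies), 2):
            pairs[(a, b)] += 1
    return dict(pairs)


# Invented sample corpus, loosely based on names mentioned in this post.
corpus = {
    "doc-1": "IBM and Yahoo turned to NStein to test a search tool. The result was OmniFind.",
    "doc-2": "NStein is a unit of OpenText. IBM markets OmniFind.",
}

components = [c for doc_id, text in corpus.items() for c in decompose(doc_id, text)]
relations = relate(components)

for comp in components:
    print(comp.doc_id, comp.position, comp.companies)
print(relations)
```

Note what the sketch leaves out: someone has to build and maintain the company list, decide what counts as a component, and check the output. That is the human knowledge work the “smart” label tends to hide.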

Shift in Search

We are witnessing another one of those abrupt shifts in enterprise search. Here’s my working hypothesis. (If you harbor a lifelong love of marketing baloney, quit reading because I am gunning for this pressure point.)

Let’s face it. Enterprise search is just not revving the engines of the people in information technology or the chief financial officer’s office. Money pumped into search typically generates a large number of user complaints, security issues, and cost spikes. As content volume goes up, so do costs. The enterprise is not Google-land, and money is limited. The content is quite complex, and who wants to try and crack 1990s technology against the nut of 21st century data flows? Not I. So something hotter is needed.

Second, the hottest trends in “search” have nothing to do with search whatsoever. One example is conflating the interface with precision and recall. Sorry. Does not compute for me. The other angle is “mobile.” Sure, search will work when everything is monitored and “smart” software provides what a statistically appropriate method suggests will work “most” of the time. There is also the baloney about apps, which is little more than the gamification of what in many cases might better be served with a system that makes the user confront actual data, not an abstraction of data. What this means is that people are looking for a way to provide information access without having to grunt around in the messy innards of editorial policies, precision, recall, and other tasks that are intellectually rigorous in a way that Angry Birds interfaces for business intelligence are not.

Third, companies engaged in content access are struggling for revenue. Sure, the best of the search vendors have been purchased by larger technology companies. These acquisitions guarantee three things.

The Wild West spirit of the innovative content processing vendors is essentially going to be stamped out. Creativity will be herded into the corporate killing pens, and the “team” will be rendered as meat products for a technology McDonald’s.
The cash sink holes that search vendors’ research programs were will be filled with procedure manuals and forms. There is no money for blue sky problem solving to crack the tough problems in information retrieval at a Fortune 1000 company. Cash can be better spent on things that may actually generate a return. After all, if the search vendors were so smart, why did most companies hit revenue ceilings and have to turn to acquisitions to generate growth? For firms unable to grow revenues, some just fiddled the books. Others had to get injections of cash like a senior citizen in the last six months of life in a care facility. So acquired companies are not likely to be hot beds of innovation.
The pricing mechanisms which search vendors have so cleverly hidden, obfuscated, and complexified will be tossed out the window. When a technology is a utility, then giant corporations will incorporate some of the technology in other products to make a sale.

What we have, therefore, is a search marketplace where the most visible and arguably successful companies have been acquired. The companies still in the marketplace now have to market like the Dickens and figure out how to cope with free open source solutions and giant acquirers who will just give away search technology.

Enter “smart content,” “intelligent content,” “predictive content,” and similar phrases. IBM is an excellent source of information wordsmithing. Here’s what IBM asserts about “business analytics software”:

IBM Cognos® Business Intelligence delivers a revolutionary new user experience and expands traditional business intelligence (BI) with planning, scenario modeling, real-time monitoring and predictive analytics. Using a limitless BI workspace that supports how people think and work—in the office, on the go and even offline—you can interact with, search and assemble all perspectives of your business. Built on a proven technology platform, Cognos Business Intelligence is designed to upgrade seamlessly and to scale cost-effectively for the broadest of deployments.

With IBM Cognos Business Intelligence software you can:

See results, understand what drives the numbers and set targets.

Identify and analyze opportunities and trends.

Author and share information the way you want to—in reports, dashboards and scorecards.

Experience insight with quick and easy access to analytics anywhere you go.

Collaborate to create a common context that every department can use for decision-making.

Deliver trusted information for a single version of the truth.

IBM Cognos Business Intelligence provides a complete range of BI capabilities to your desktop, notebook, tablet and smart phone. It helps you gain more value from your existing investments in data and infrastructure. Source: http://www-01.ibm.com/software/analytics/business-intelligence.html

What ArnoldIT Is Watching in Enterprise Search

I want to identify four “items” or “points” that my colleagues and I are monitoring. We will, of course, modify this list, but I wanted to capture the thoughts that emerged from our lunch discussion on Friday, October 28, 2011:

Software is going to be positioned as taking content from “dumb” systems and normal human writers and imparting “intelligence” to the writing. Now I think this is XML mark up and indexing, but I think the words that will attract Gen X and Gen Y professionals looking to get a raise and make a mark in the world are important.
Underneath the new phraseology and the noise of the “smart” parade will be knowledge work that requires humans, is expensive to perform, and is only partially completed by “ant algorithms,” “predictive analytics,” and the rest of the buzzwords flying around at this sales picnic.
Governance will become the “fix” when the “smart” and “intelligent” content methods go off the rails due to: [a] cost, [b] no significant improvement in information access for users, hence no change in user satisfaction with enterprise findability systems, and [c] time required to get the dunce cap off the “smart” and “intelligent” systems which are unfortunately disruptive to work flow.

Net net: There are some new players coming along in findability. I don’t think recycling old, manual concepts in plastic baloney wrappers will deliver the nutritional value needed to survive in today’s rigorous financial climate. To make clear how little technical progress has been made, I may dig out some old reports or early drafts of my more formal writings and post them on ArnoldIT.com on a page called “The Museum of Enterprise Search.” I will start Monday, October 31, 2011. Seems fitting.

Stephen E Arnold, October 30, 2011

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.